Introduction to Scrapy
Lucas Hiago de Moura Vilela
November 30, 2019
Programming
My talk about Scrapy, the Python-based web scraping framework.
Transcript
Introduction to Scrapy: a tool for web scraping
$ whoami
> Intern at CodeMiner42
> Back-end developer on the Colaboradados project
> Computer Science undergraduate at UFPI
> Python enthusiast
> Venturing down the Ruby on Rails trails
/luchiago
After time, data is the most valuable commodity in the world.
How do you get this data?
> Application Programming Interface (API)
> HTTP requests
GET THEM ALL
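When a site does offer an API, getting the data usually comes down to one HTTP GET request and a JSON decode. The sketch below uses only the Python standard library. The endpoint `https://api.example.com/v1/records`, the `page` parameter, and the `"results"` key are hypothetical placeholders, not something shown in the talk.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Hypothetical endpoint, used only for illustration.
API_URL = "https://api.example.com/v1/records"


def build_url(base_url, params):
    """Append URL-encoded query parameters to the base URL."""
    return f"{base_url}?{urlencode(params)}"


def parse_records(payload):
    """Decode a JSON body and return the list under "results" (an assumed key)."""
    return json.loads(payload).get("results", [])


def fetch_records(base_url, params, timeout=10):
    """Send an HTTP GET request and return the decoded records."""
    request = Request(
        build_url(base_url, params),
        headers={"Accept": "application/json"},
    )
    with urlopen(request, timeout=timeout) as response:
        return parse_records(response.read().decode("utf-8"))
```

Calling `fetch_records(API_URL, {"page": 1})` performs a real network request, so it only works once `API_URL` points at an actual service. The two helper functions can be exercised offline.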
And what if the site does not provide an API?
Crawlers vs Scraping
Colaborabot http://colaboradados.com.br/bot_colaboradados.html https://twitter.com/colabora_bot
Web scraping: problems
> IP address blocking
> robots.txt
> Poorly structured HTML
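Of the problems listed on this slide, robots.txt is the easiest to handle in code: a scraper can check the file's rules before requesting a page. A minimal sketch with the standard library's `urllib.robotparser` follows. The robots.txt content, the `example-bot` user agent, and the `example.com` URLs are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Example robots.txt content, invented for this sketch.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 5
"""

# Hypothetical user agent name for our scraper.
USER_AGENT = "example-bot"


def build_parser(robots_txt):
    """Parse robots.txt text that is already in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# True when the rules allow this user agent to fetch the URL.
allowed_public = parser.can_fetch(USER_AGENT, "https://example.com/public/page.html")
allowed_private = parser.can_fetch(USER_AGENT, "https://example.com/private/data.html")

# Seconds to wait between requests, or None if the file sets no delay.
delay = parser.crawl_delay(USER_AGENT)
```

Respecting the crawl delay also lowers the chance of an IP address block, since sites often block clients that send requests too quickly. Scrapy itself has a `ROBOTSTXT_OBEY` setting that applies these rules automatically.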
Scrapy
“An open source and collaborative framework for extracting the data you need from websites, in a fast, simple, and scalable way” https://scrapy.org/
Similar technologies in Python
Beautiful Soup https://www.crummy.com/software/BeautifulSoup/bs4/doc/
Selenium https://selenium-python.readthedocs.io/
Requests https://2.python-requests.org//en/master/
City Scrapers
Thank you!