Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Introduction to Scrapy
Search
Sponsored
·
SiteGround - Reliable hosting with speed, security, and support you can count on.
→
Lucas Hiago de Moura Vilela
November 30, 2019
Programming
0
36
Introduction to Scrapy
My talk about the framework python-based Scrapy.
Lucas Hiago de Moura Vilela
November 30, 2019
Tweet
Share
More Decks by Lucas Hiago de Moura Vilela
See All by Lucas Hiago de Moura Vilela
SQL com Arel no Rails
luchiago
0
27
Brown Bag - Aplicação mobile de vídeo-chamadas
luchiago
0
48
Gitpod
luchiago
1
70
Design pattern Adapter
luchiago
0
35
Other Decks in Programming
See All in Programming
Angular-Apps smarter machen mit Gen AI: Lokal und offlinefähig - Hands-on Workshop!
christianliebel
PRO
0
130
AI時代の脳疲弊と向き合う ~言語学としてのPHP~
sakuraikotone
1
1.4k
米国のサイバーセキュリティタイムラインと見る Goの暗号パッケージの進化
tomtwinkle
2
630
仕様漏れ実装漏れをなくすトレーサビリティAI基盤のご紹介
orgachem
PRO
7
2.8k
20260315 AWSなんもわからん🥲
chiilog
2
170
条件判定に名前、つけてますか? #phperkaigi #c
77web
2
620
Fundamentals of Software Engineering In the Age of AI
therealdanvega
2
280
車輪の再発明をしよう!PHP で実装して学ぶ、Web サーバーの仕組みと HTTP の正体
h1r0
2
240
社内規程RAGの精度を73.3% → 100%に改善した話
oharu121
13
8.2k
Takumiから考えるSecurity_Maturity_Model.pdf
gessy0129
1
150
20260313 - Grafana & Friends Taipei #1 - Kubernetes v1.36 的開發雜記:那些困在 Alpha 加護病房太久的 Metrics
tico88612
0
220
コーディングルールの鮮度を保ちたい / keep-fresh-go-internal-conventions
handlename
0
220
Featured
See All Featured
Digital Projects Gone Horribly Wrong (And the UX Pros Who Still Save the Day) - Dean Schuster
uxyall
0
790
Max Prin - Stacking Signals: How International SEO Comes Together (And Falls Apart)
techseoconnect
PRO
0
120
Design in an AI World
tapps
0
180
State of Search Keynote: SEO is Dead Long Live SEO
ryanjones
0
160
The AI Search Optimization Roadmap by Aleyda Solis
aleyda
1
5.5k
What Being in a Rock Band Can Teach Us About Real World SEO
427marketing
0
200
The Hidden Cost of Media on the Web [PixelPalooza 2025]
tammyeverts
2
250
Leveraging LLMs for student feedback in introductory data science courses - posit::conf(2025)
minecr
1
200
XXLCSS - How to scale CSS and keep your sanity
sugarenia
249
1.3M
Why You Should Never Use an ORM
jnunemaker
PRO
61
9.8k
10 Git Anti Patterns You Should be Aware of
lemiorhan
PRO
659
61k
Facilitating Awesome Meetings
lara
57
6.8k
Transcript
Introdução ao Scrapy Uma ferramenta para web scraping
$ whoami > Estagiário na empresa CodeMiner42 > Back-end developer
no projeto Colaboradados > Graduando em Ciência da Computação pela UFPI > Entusiasta da linguagem Python > Aventurando nas trilhas do Ruby on Rails /luchiago /luchiago
A mercadoria mais valiosa do mundo após o tempo são
os dados.
Como obter esses dados? > Interface de Programação de Aplicativos
> Requisições HTTP GET THEM ALL
E quando o site não fornece uma API?
Crawlers vs Scraping
Colaborabot http://colaboradados.com.br/bot_colaboradados.html https://twitter.com/colabora_bot
Web Scraping: problemas > Bloqueio de endereço IP > robots.txt
> HTML mal estruturado
Scrapy “Uma framework open source e colaborativa para extração dos
dados que você precisa dos websites, em uma maneira rápida, simples e escalável” https://scrapy.org/
Tecnologias semelhantes em Python Beautiful Soup https://www.crummy.com/software/BeautifulS oup/bs4/doc/ Selenium https://selenium-python.readthedocs.io/
Requests https://2.python-requests.org//en/master/
City Scrapers
Obrigado!