Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Introduction to Scrapy
Search
Lucas Hiago de Moura Vilela
November 30, 2019
Programming
0
36
Introduction to Scrapy
My talk about the framework python-based Scrapy.
Lucas Hiago de Moura Vilela
November 30, 2019
Tweet
Share
More Decks by Lucas Hiago de Moura Vilela
See All by Lucas Hiago de Moura Vilela
SQL com Arel no Rails
luchiago
0
27
Brown Bag - Aplicação mobile de vídeo-chamadas
luchiago
0
48
Gitpod
luchiago
1
70
Design pattern Adapter
luchiago
0
35
Other Decks in Programming
See All in Programming
脳の「省エネモード」をデバッグする ~System 1(直感)と System 2(論理)の切り替え~
panda728
PRO
0
130
大規模Cloud Native環境におけるFalcoの運用
owlinux1000
0
230
AtCoder Conference 2025「LLM時代のAHC」
imjk
2
610
Java 25, Nuevas características
czelabueno
0
120
PC-6001でPSG曲を鳴らすまでを全部NetBSD上の Makefile に押し込んでみた / osc2025hiroshima
tsutsui
0
200
開発に寄りそう自動テストの実現
goyoki
2
1.6k
Developing static sites with Ruby
okuramasafumi
0
330
Patterns of Patterns
denyspoltorak
0
390
LLMで複雑な検索条件アセットから脱却する!! 生成的検索インタフェースの設計論
po3rin
4
1k
Graviton と Nitro と私
maroon1st
0
150
実はマルチモーダルだった。ブラウザの組み込みAI🧠でWebの未来を感じてみよう #jsfes #gemini
n0bisuke2
3
1.3k
Denoのセキュリティに関する仕組みの紹介 (toranoana.deno #23)
uki00a
0
190
Featured
See All Featured
Why Our Code Smells
bkeepers
PRO
340
58k
The Pragmatic Product Professional
lauravandoore
37
7.1k
Helping Users Find Their Own Way: Creating Modern Search Experiences
danielanewman
31
3k
Imperfection Machines: The Place of Print at Facebook
scottboms
269
13k
Cheating the UX When There Is Nothing More to Optimize - PixelPioneers
stephaniewalter
286
14k
Chasing Engaging Ingredients in Design
codingconduct
0
92
Breaking role norms: Why Content Design is so much more than writing copy - Taylor Woolridge
uxyall
0
120
The Director’s Chair: Orchestrating AI for Truly Effective Learning
tmiket
1
69
XXLCSS - How to scale CSS and keep your sanity
sugarenia
249
1.3M
The Invisible Side of Design
smashingmag
302
51k
The Psychology of Web Performance [Beyond Tellerrand 2023]
tammyeverts
49
3.2k
Building a Modern Day E-commerce SEO Strategy
aleyda
45
8.4k
Transcript
Introdução ao Scrapy Uma ferramenta para web scraping
$ whoami > Estagiário na empresa CodeMiner42 > Back-end developer
no projeto Colaboradados > Graduando em Ciência da Computação pela UFPI > Entusiasta da linguagem Python > Aventurando nas trilhas do Ruby on Rails /luchiago /luchiago
A mercadoria mais valiosa do mundo após o tempo são
os dados.
Como obter esses dados? > Interface de Programação de Aplicativos
> Requisições HTTP GET THEM ALL
E quando o site não fornece uma API?
Crawlers vs Scraping
Colaborabot http://colaboradados.com.br/bot_colaboradados.html https://twitter.com/colabora_bot
Web Scraping: problemas > Bloqueio de endereço IP > robots.txt
> HTML mal estruturado
Scrapy “Uma framework open source e colaborativa para extração dos
dados que você precisa dos websites, em uma maneira rápida, simples e escalável” https://scrapy.org/
Tecnologias semelhantes em Python Beautiful Soup https://www.crummy.com/software/BeautifulS oup/bs4/doc/ Selenium https://selenium-python.readthedocs.io/
Requests https://2.python-requests.org//en/master/
City Scrapers
Obrigado!