Introduction to Scrapy
Lucas Hiago de Moura Vilela
November 30, 2019
Programming
My talk about Scrapy, the Python-based web scraping framework.
Transcript
Introduction to Scrapy: a tool for web scraping
$ whoami
> Intern at CodeMiner42
> Back-end developer on the Colaboradados project
> Computer Science undergraduate at UFPI
> Python enthusiast
> Venturing along the Ruby on Rails trails
/luchiago /luchiago
The most valuable commodity in the world, after time, is data.
How do we get this data?
> Application Programming Interface (API)
> HTTP requests
GET THEM ALL
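To make the "API over HTTP" route concrete, here is a minimal sketch using only the Python standard library. The endpoint `https://api.example.com/v1/datasets`, the query parameters, and the helper names `build_request`, `parse_body`, and `fetch_json` are all hypothetical and chosen for illustration; they do not come from the talk.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Hypothetical endpoint, used only for illustration.
API_URL = "https://api.example.com/v1/datasets"


def build_request(base_url, params):
    """Build a GET request whose query string encodes `params`."""
    url = f"{base_url}?{urlencode(params)}"
    return Request(url, headers={"Accept": "application/json"}, method="GET")


def parse_body(raw_bytes):
    """Decode a UTF-8 JSON response body into Python objects."""
    return json.loads(raw_bytes.decode("utf-8"))


def fetch_json(request, timeout=10):
    """Send the request and return the decoded JSON (performs network I/O)."""
    with urlopen(request, timeout=timeout) as response:
        return parse_body(response.read())


# Building the request does not touch the network; only fetch_json() does.
request = build_request(API_URL, {"q": "scrapy", "page": 1})
print(request.get_method(), request.full_url)
```

When a site offers an API like this, the response is already structured data, which is why it is usually the first option to try before scraping HTML.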
And what about when the site does not provide an API?
Crawlers vs Scraping
Colaborabot http://colaboradados.com.br/bot_colaboradados.html https://twitter.com/colabora_bot
Web scraping: problems
> IP address blocking
> robots.txt
> Poorly structured HTML
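As an illustration of the robots.txt point, the sketch below checks whether a URL may be fetched before requesting it, using the standard library's `urllib.robotparser`. The robots.txt contents, the `example-bot` user agent, and the `example.com` URLs are assumptions made up for this example; a real scraper would download the file from the target site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents; a real scraper would download this file
# from the root of the target site instead of hard-coding it.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 2
"""

# Hypothetical user agent name for the scraper.
USER_AGENT = "example-bot"


def make_parser(robots_txt):
    """Parse robots.txt text that is already in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = make_parser(ROBOTS_TXT)

# Ask whether each URL is allowed for our user agent.
allowed = parser.can_fetch(USER_AGENT, "https://example.com/public/page.html")
blocked = parser.can_fetch(USER_AGENT, "https://example.com/private/data.html")

# Crawl-delay is the number of seconds the site asks bots to wait between requests.
delay = parser.crawl_delay(USER_AGENT)

print(allowed, blocked, delay)
```

Respecting the rules and the requested delay also lowers the chance of running into the IP blocking problem listed above. Scrapy has a `ROBOTSTXT_OBEY` setting for the same purpose.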
Scrapy
“An open source and collaborative framework for extracting the data you need from websites, in a fast, simple, and scalable way”
https://scrapy.org/
Similar technologies in Python
Beautiful Soup https://www.crummy.com/software/BeautifulSoup/bs4/doc/
Selenium https://selenium-python.readthedocs.io/
Requests https://2.python-requests.org//en/master/
City Scrapers
Thank you!