Introduction to Scrapy
Lucas Hiago de Moura Vilela
November 30, 2019
Programming
My talk about Scrapy, a Python-based framework.
Transcript
Introduction to Scrapy: a tool for web scraping
$ whoami
> Intern at CodeMiner42
> Back-end developer on the Colaboradados project
> Computer Science undergraduate at UFPI
> Python enthusiast
> Exploring the trails of Ruby on Rails
/luchiago /luchiago
The most valuable commodity in the world, after time, is data.
How do you obtain this data?
> Application Programming Interface (API)
> HTTP requests
GET THEM ALL
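As a minimal sketch of the HTTP route, the snippet below builds a GET request against an API using only the Python standard library. The endpoint `https://api.example.com/v1/datasets` and its `page` and `q` parameters are hypothetical and do not come from the talk; `fetch_json` is defined but not called, so the example runs without network access.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_get_request(base_url: str, params: dict) -> Request:
    """Build (but do not send) a GET request with URL-encoded query parameters."""
    url = f"{base_url}?{urlencode(params)}"
    return Request(url, headers={"Accept": "application/json"}, method="GET")


def fetch_json(request: Request, timeout: float = 10.0):
    """Send the request and decode the JSON body of the response."""
    with urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Hypothetical endpoint and parameters, used only for illustration.
request = build_get_request(
    "https://api.example.com/v1/datasets", {"page": 1, "q": "budget"}
)
print(request.get_method(), request.full_url)
```

Against a real API, `fetch_json(request)` would return the decoded payload. The Requests library listed later in the deck wraps the same steps in a shorter interface.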
And what about when the site does not provide an API?
Crawlers vs Scraping
Colaborabot http://colaboradados.com.br/bot_colaboradados.html https://twitter.com/colabora_bot
Web scraping: problems
> IP address blocking
> robots.txt
> Poorly structured HTML
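One way to handle the robots.txt item is to check a URL against the site's rules before requesting it. The sketch below uses the standard library's `urllib.robotparser`. The robots.txt content, the `example-bot` user agent, and the `example.com` URLs are assumptions made for illustration, inlined so the code runs offline.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, inlined so the sketch runs offline.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 2
"""


def make_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a rule checker."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = make_parser(ROBOTS_TXT)

# Paths outside /private/ are allowed; paths inside it are not.
allowed = parser.can_fetch("example-bot", "https://example.com/reports/2019.html")
blocked = parser.can_fetch("example-bot", "https://example.com/private/data.html")

# Seconds the site asks crawlers to wait between requests, if declared.
delay = parser.crawl_delay("example-bot")
print(allowed, blocked, delay)
```

Honoring the declared delay between requests may also reduce the risk of IP blocking. Scrapy exposes comparable behavior through its `ROBOTSTXT_OBEY` and `DOWNLOAD_DELAY` settings.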
Scrapy: “An open source and collaborative framework for extracting the data you need from websites, in a fast, simple, and scalable way” https://scrapy.org/
Similar technologies in Python
Beautiful Soup https://www.crummy.com/software/BeautifulSoup/bs4/doc/
Selenium https://selenium-python.readthedocs.io/
Requests https://2.python-requests.org//en/master/
City Scrapers
Thank you!