Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Introduction to Scrapy
Search
Lucas Hiago de Moura Vilela
November 30, 2019
Programming
0
33
Introduction to Scrapy
My talk about the framework python-based Scrapy.
Lucas Hiago de Moura Vilela
November 30, 2019
Tweet
Share
More Decks by Lucas Hiago de Moura Vilela
See All by Lucas Hiago de Moura Vilela
SQL com Arel no Rails
luchiago
0
26
Brown Bag - Aplicação mobile de vídeo-chamadas
luchiago
0
46
Gitpod
luchiago
1
67
Design pattern Adapter
luchiago
0
34
Other Decks in Programming
See All in Programming
AIを活用し、今後に備えるための技術知識 / Basic Knowledge to Utilize AI
kishida
21
5.6k
複雑なドメインに挑む.pdf
yukisakai1225
5
1.1k
そのAPI、誰のため? Androidライブラリ設計における利用者目線の実践テクニック
mkeeda
2
260
testingを眺める
matumoto
1
140
@Environment(\.keyPath)那么好我不允许你们不知道! / atEnvironment keyPath is so good and you should know it!
lovee
0
110
プロパティベーステストによるUIテスト: LLMによるプロパティ定義生成でエッジケースを捉える
tetta_pdnt
0
300
JSONataを使ってみよう Step Functionsが楽しくなる実践テクニック #devio2025
dafujii
1
510
Updates on MLS on Ruby (and maybe more)
sylph01
1
180
Namespace and Its Future
tagomoris
6
700
もうちょっといいRubyプロファイラを作りたい (2025)
osyoyu
0
400
HTMLの品質ってなんだっけ? “HTMLクライテリア”の設計と実践
unachang113
4
2.7k
RDoc meets YARD
okuramasafumi
4
170
Featured
See All Featured
Chrome DevTools: State of the Union 2024 - Debugging React & Beyond
addyosmani
7
840
GraphQLとの向き合い方2022年版
quramy
49
14k
Helping Users Find Their Own Way: Creating Modern Search Experiences
danielanewman
29
2.9k
The Straight Up "How To Draw Better" Workshop
denniskardys
236
140k
Java REST API Framework Comparison - PWX 2021
mraible
33
8.8k
Design and Strategy: How to Deal with People Who Don’t "Get" Design
morganepeng
131
19k
Cheating the UX When There Is Nothing More to Optimize - PixelPioneers
stephaniewalter
285
13k
Building Applications with DynamoDB
mza
96
6.6k
Learning to Love Humans: Emotional Interface Design
aarron
273
40k
Stop Working from a Prison Cell
hatefulcrawdad
271
21k
Building an army of robots
kneath
306
46k
Typedesign – Prime Four
hannesfritz
42
2.8k
Transcript
Introdução ao Scrapy Uma ferramenta para web scraping
$ whoami > Estagiário na empresa CodeMiner42 > Back-end developer
no projeto Colaboradados > Graduando em Ciência da Computação pela UFPI > Entusiasta da linguagem Python > Aventurando nas trilhas do Ruby on Rails /luchiago /luchiago
A mercadoria mais valiosa do mundo após o tempo são
os dados.
Como obter esses dados? > Interface de Programação de Aplicativos
> Requisições HTTP GET THEM ALL
E quando o site não fornece uma API?
Crawlers vs Scraping
Colaborabot http://colaboradados.com.br/bot_colaboradados.html https://twitter.com/colabora_bot
Web Scraping: problemas > Bloqueio de endereço IP > robots.txt
> HTML mal estruturado
Scrapy “Uma framework open source e colaborativa para extração dos
dados que você precisa dos websites, em uma maneira rápida, simples e escalável” https://scrapy.org/
Tecnologias semelhantes em Python Beautiful Soup https://www.crummy.com/software/BeautifulS oup/bs4/doc/ Selenium https://selenium-python.readthedocs.io/
Requests https://2.python-requests.org//en/master/
City Scrapers
Obrigado!