Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Introduction to Scrapy
Search
Lucas Hiago de Moura Vilela
November 30, 2019
Programming
0
36
Introduction to Scrapy
My talk about the framework python-based Scrapy.
Lucas Hiago de Moura Vilela
November 30, 2019
Tweet
Share
More Decks by Lucas Hiago de Moura Vilela
See All by Lucas Hiago de Moura Vilela
SQL com Arel no Rails
luchiago
0
27
Brown Bag - Aplicação mobile de vídeo-chamadas
luchiago
0
48
Gitpod
luchiago
1
70
Design pattern Adapter
luchiago
0
35
Other Decks in Programming
See All in Programming
浮動小数の比較について
kishikawakatsumi
0
370
CSC307 Lecture 12
javiergs
PRO
0
450
The Past, Present, and Future of Enterprise Java
ivargrimstad
0
370
AIに仕事を丸投げしたら、本当に楽になれるのか
dip_tech
PRO
0
180
API Platformを活用したPHPによる本格的なWeb API開発 / api-platform-book-intro
ttskch
1
110
クライアントワークでSREをするということ。あるいは事業会社におけるSREと同じこと・違うこと
nnaka2992
1
290
What Spring Developers Should Know About Jakarta EE
ivargrimstad
0
160
2026/02/04 AIキャラクター人格の実装論 口 調の模倣から、コンテキスト制御による 『思想』と『行動』の創発へ
sr2mg4
0
670
オブザーバビリティ駆動開発って実際どうなの?
yohfee
3
660
「ブロックテーマでは再現できない」は本当か?
inc2734
0
1.1k
DevinとClaude Code、SREの現場で使い倒してみた件
karia
1
860
日本だけで解禁されているアプリ起動の方法
ryunakayama
0
360
Featured
See All Featured
I Don’t Have Time: Getting Over the Fear to Launch Your Podcast
jcasabona
34
2.6k
Lightning talk: Run Django tests with GitHub Actions
sabderemane
0
140
The Limits of Empathy - UXLibs8
cassininazir
1
240
Building an army of robots
kneath
306
46k
Taking LLMs out of the black box: A practical guide to human-in-the-loop distillation
inesmontani
PRO
3
2.1k
The browser strikes back
jonoalderson
0
750
How To Stay Up To Date on Web Technology
chriscoyier
790
250k
Leveraging Curiosity to Care for An Aging Population
cassininazir
1
190
Fantastic passwords and where to find them - at NoRuKo
philnash
52
3.6k
Exploring anti-patterns in Rails
aemeredith
2
280
Marketing Yourself as an Engineer | Alaka | Gurzu
gurzu
0
140
Exploring the relationship between traditional SERPs and Gen AI search
raygrieselhuber
PRO
2
3.7k
Transcript
Introdução ao Scrapy Uma ferramenta para web scraping
$ whoami > Estagiário na empresa CodeMiner42 > Back-end developer
no projeto Colaboradados > Graduando em Ciência da Computação pela UFPI > Entusiasta da linguagem Python > Aventurando nas trilhas do Ruby on Rails /luchiago /luchiago
A mercadoria mais valiosa do mundo após o tempo são
os dados.
Como obter esses dados? > Interface de Programação de Aplicativos
> Requisições HTTP GET THEM ALL
E quando o site não fornece uma API?
Crawlers vs Scraping
Colaborabot http://colaboradados.com.br/bot_colaboradados.html https://twitter.com/colabora_bot
Web Scraping: problemas > Bloqueio de endereço IP > robots.txt
> HTML mal estruturado
Scrapy “Uma framework open source e colaborativa para extração dos
dados que você precisa dos websites, em uma maneira rápida, simples e escalável” https://scrapy.org/
Tecnologias semelhantes em Python Beautiful Soup https://www.crummy.com/software/BeautifulS oup/bs4/doc/ Selenium https://selenium-python.readthedocs.io/
Requests https://2.python-requests.org//en/master/
City Scrapers
Obrigado!