Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Introduction to Scrapy
Search
Lucas Hiago de Moura Vilela
November 30, 2019
Programming
0
36
Introduction to Scrapy
My talk about the framework python-based Scrapy.
Lucas Hiago de Moura Vilela
November 30, 2019
Tweet
Share
More Decks by Lucas Hiago de Moura Vilela
See All by Lucas Hiago de Moura Vilela
SQL com Arel no Rails
luchiago
0
27
Brown Bag - Aplicação mobile de vídeo-chamadas
luchiago
0
48
Gitpod
luchiago
1
70
Design pattern Adapter
luchiago
0
35
Other Decks in Programming
See All in Programming
re:Invent 2025 のイケてるサービスを紹介する
maroon1st
0
180
AgentCoreとHuman in the Loop
har1101
5
200
2年のAppleウォレットパス開発の振り返り
muno92
PRO
0
190
副作用をどこに置くか問題:オブジェクト指向で整理する設計判断ツリー
koxya
1
560
Claude Codeの「Compacting Conversation」を体感50%減! CLAUDE.md + 8 Skills で挑むコンテキスト管理術
kmurahama
1
830
AIフル活用時代だからこそ学んでおきたい働き方の心得
shinoyu
0
120
実は歴史的なアップデートだと思う AWS Interconnect - multicloud
maroon1st
0
370
[KNOTS 2026登壇資料]AIで拡張‧交差する プロダクト開発のプロセス および携わるメンバーの役割
hisatake
0
190
Vibe Coding - AI 驅動的軟體開發
mickyp100
0
160
[AI Engineering Summit Tokyo 2025] LLMは計画業務のゲームチェンジャーか? 最適化業務における活⽤の可能性と限界
terryu16
2
590
今こそ知るべき耐量子計算機暗号(PQC)入門 / PQC: What You Need to Know Now
mackey0225
3
360
なぜSQLはAIぽく見えるのか/why does SQL look AI like
florets1
0
410
Featured
See All Featured
Writing Fast Ruby
sferik
630
62k
Exploring the Power of Turbo Streams & Action Cable | RailsConf2023
kevinliebholz
37
6.2k
Performance Is Good for Brains [We Love Speed 2024]
tammyeverts
12
1.4k
Docker and Python
trallard
47
3.7k
[Rails World 2023 - Day 1 Closing Keynote] - The Magic of Rails
eileencodes
38
2.7k
SEO Brein meetup: CTRL+C is not how to scale international SEO
lindahogenes
0
2.3k
So, you think you're a good person
axbom
PRO
2
1.9k
The World Runs on Bad Software
bkeepers
PRO
72
12k
Balancing Empowerment & Direction
lara
5
860
Redefining SEO in the New Era of Traffic Generation
szymonslowik
1
200
The Limits of Empathy - UXLibs8
cassininazir
1
210
Agile Actions for Facilitating Distributed Teams - ADO2019
mkilby
0
110
Transcript
Introdução ao Scrapy Uma ferramenta para web scraping
$ whoami > Estagiário na empresa CodeMiner42 > Back-end developer
no projeto Colaboradados > Graduando em Ciência da Computação pela UFPI > Entusiasta da linguagem Python > Aventurando nas trilhas do Ruby on Rails /luchiago /luchiago
A mercadoria mais valiosa do mundo após o tempo são
os dados.
Como obter esses dados? > Interface de Programação de Aplicativos
> Requisições HTTP GET THEM ALL
E quando o site não fornece uma API?
Crawlers vs Scraping
Colaborabot http://colaboradados.com.br/bot_colaboradados.html https://twitter.com/colabora_bot
Web Scraping: problemas > Bloqueio de endereço IP > robots.txt
> HTML mal estruturado
Scrapy “Uma framework open source e colaborativa para extração dos
dados que você precisa dos websites, em uma maneira rápida, simples e escalável” https://scrapy.org/
Tecnologias semelhantes em Python Beautiful Soup https://www.crummy.com/software/BeautifulS oup/bs4/doc/ Selenium https://selenium-python.readthedocs.io/
Requests https://2.python-requests.org//en/master/
City Scrapers
Obrigado!