Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
How to collect large scale data using Javascript
Search
Sponsored
·
Ship Features Fearlessly
Turn features on and off without deploys. Used by thousands of Ruby developers.
→
Leonardo Rifeli
June 02, 2022
Programming
0
60
How to collect large scale data using Javascript
Leonardo Rifeli
June 02, 2022
Tweet
Share
More Decks by Leonardo Rifeli
See All by Leonardo Rifeli
Acate: Processamento distribuído - Como processamos milhões de dados diariamente
leonardorifeli
0
32
Reviewr Data Consolidation Case
leonardorifeli
0
120
Distributed processing: How we process millions of data daily with EMR
leonardorifeli
0
51
Building Crawlers with serverless
leonardorifeli
0
79
Other Decks in Programming
See All in Programming
エンジニアの「手元の自動化」を加速するn8n 2026.02.27
symy2co
0
170
存在論的プログラミング: 時間と存在を記述する
koriym
3
340
maplibre-gl-layers - 地図に移動体たくさん表示したい
kekyo
PRO
0
390
DevinとClaude Code、SREの現場で使い倒してみた件
karia
1
1.1k
CS教育のDX AIによる育成の効率化
niftycorp
PRO
0
150
Laravel Nightwatchの裏側 - Laravel公式Observabilityツールを支える設計と実装
avosalmon
1
140
Ruby and LLM Ecosystem 2nd
koic
1
1.2k
ベクトル検索のフィルタを用いた機械学習モデルとの統合 / python-meetup-fukuoka-06-vector-attr
monochromegane
2
500
野球解説AI Agentを開発してみた - 2026/02/27 LayerX社内LT会資料
shinyorke
PRO
0
350
AIコードレビューの導入・運用と AI駆動開発における「AI4QA」の取り組みについて
hagevvashi
0
540
Claude Code Skill入門
mayahoney
0
410
Understanding Apache Lucene - More than just full-text search
spinscale
0
140
Featured
See All Featured
Learning to Love Humans: Emotional Interface Design
aarron
275
41k
技術選定の審美眼(2025年版) / Understanding the Spiral of Technologies 2025 edition
twada
PRO
118
110k
Building the Perfect Custom Keyboard
takai
2
720
Keith and Marios Guide to Fast Websites
keithpitt
413
23k
How to Get Subject Matter Experts Bought In and Actively Contributing to SEO & PR Initiatives.
livdayseo
0
89
Git: the NoSQL Database
bkeepers
PRO
432
67k
How to Build an AI Search Optimization Roadmap - Criteria and Steps to Take #SEOIRL
aleyda
1
2k
More Than Pixels: Becoming A User Experience Designer
marktimemedia
3
360
Build The Right Thing And Hit Your Dates
maggiecrowley
39
3.1k
Understanding Cognitive Biases in Performance Measurement
bluesmoon
32
2.8k
Code Review Best Practice
trishagee
74
20k
Practical Tips for Bootstrapping Information Extraction Pipelines
honnibal
25
1.8k
Transcript
How to collect large scale data using Javascript seo local
| reviews | pesquisas
None
Agora a experiência é o novo marketing
Somos a Harmo, a plataforma de marketing de experiência mais
completa do Brasil.
SEO Local A única plataforma 3 x 1 do Brasil
Faça a gestão da presença digital da sua rede de lojas e seja encontrado no topo do ranking das pesquisas de forma 100% orgânica. 1 2 3
Reviews A única plataforma 3 x 1 do Brasil Colete,
analise e responda todos os reviews dos seus clientes, conquiste a confiança do consumidor e seja a marca escolhida. 1 2 3
A única plataforma 3 x 1 do Brasil Pesquisas multimétricas
para medir a experiência do cliente durante toda a jornada. Identifique promotores e ative o programa de indicação de reviews. Pesquisas 1 2 3
Harmo, uma poderosa máquina de geração de ROI. Escute, interaja,
analise e atue focado nos anseios dos clientes, durante toda a jornada, transformando os seus clientes no principal canal de aquisição de novos clientes.
Grandes marcas atestam a qualidade da nossa plataforma e metodologia
com foco em resultados
NUMBERS Establishments +30k Reviews +15kk Integrations +54k Emails +6,6kk SMS
+250k Answer of Review +1kk
▷ Distributed Process ▷ Scrapping vs Crawlers ▷ Some Concepts
▷ Why Javascript? ▷ Architecture for Scale ▷ Lessons Learning ▷ Example ▷ Conclusion Topics
Distributed Process
None
Scraping vs Crawlers
None
Collector Concepts
Be "Browserless"
Recursion is your friends
Single Responsability
Normalize Data (input & output)
Code reuse with packages
Collector !== Processor
Why Javascript?
Use native streams
Dynamic typing
Do more with less
Most used in the world
Architecture for Scale
None
None
None
Lessons Learning
Use code-base version alert
Code reuse with packages
Create E2E tests from the begin
Be "Browserless"
Use Puppeteer *reduce images
None
None
Use Promise.all
None
Use monorepos
Otherwise it will be chaos
None
None
▷ Web Scraping vs Web Crawling: The Differences ▷ HOW
TO RUN ASYNC JAVASCRIPT FUNCTIONS IN SEQUENCE OR PARALLEL Links
Collector Example
None
None
Leonardo Rifeli | CTO
[email protected]
harmo.me seo local | reviews
| pesquisas