Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
How to collect large scale data using Javascript
Search
Sponsored
·
Your Podcast. Everywhere. Effortlessly.
Share. Educate. Inspire. Entertain. You do you. We'll handle the rest.
→
Leonardo Rifeli
June 02, 2022
Programming
0
59
How to collect large scale data using Javascript
Leonardo Rifeli
June 02, 2022
Tweet
Share
More Decks by Leonardo Rifeli
See All by Leonardo Rifeli
Acate: Processamento distribuído - Como processamos milhões de dados diariamente
leonardorifeli
0
31
Reviewr Data Consolidation Case
leonardorifeli
0
120
Distributed processing: How we process millions of data daily with EMR
leonardorifeli
0
50
Building Crawlers with serverless
leonardorifeli
0
78
Other Decks in Programming
See All in Programming
【卒業研究】会話ログ分析によるユーザーごとの関心に応じた話題提案手法
momok47
0
190
CSC307 Lecture 06
javiergs
PRO
0
680
そのAIレビュー、レビューしてますか? / Are you reviewing those AI reviews?
rkaga
6
4.5k
AIによる開発の民主化を支える コンテキスト管理のこれまでとこれから
mulyu
3
190
AIによるイベントストーミング図からのコード生成 / AI-powered code generation from Event Storming diagrams
nrslib
2
1.9k
今こそ知るべき耐量子計算機暗号(PQC)入門 / PQC: What You Need to Know Now
mackey0225
3
370
QAフローを最適化し、品質水準を満たしながらリリースまでの期間を最短化する #RSGT2026
shibayu36
2
4.3k
AI によるインシデント初動調査の自動化を行う AI インシデントコマンダーを作った話
azukiazusa1
1
710
IFSによる形状設計/デモシーンの魅力 @ 慶應大学SFC
gam0022
1
300
2026年 エンジニアリング自己学習法
yumechi
0
130
フロントエンド開発の勘所 -複数事業を経験して見えた判断軸の違い-
heimusu
7
2.8k
Fragmented Architectures
denyspoltorak
0
150
Featured
See All Featured
Context Engineering - Making Every Token Count
addyosmani
9
650
Designing Powerful Visuals for Engaging Learning
tmiket
0
230
So, you think you're a good person
axbom
PRO
2
1.9k
Darren the Foodie - Storyboard
khoart
PRO
2
2.4k
Put a Button on it: Removing Barriers to Going Fast.
kastner
60
4.2k
The browser strikes back
jonoalderson
0
360
Building Experiences: Design Systems, User Experience, and Full Site Editing
marktimemedia
0
410
Producing Creativity
orderedlist
PRO
348
40k
The innovator’s Mindset - Leading Through an Era of Exponential Change - McGill University 2025
jdejongh
PRO
1
89
Breaking role norms: Why Content Design is so much more than writing copy - Taylor Woolridge
uxyall
0
160
Build The Right Thing And Hit Your Dates
maggiecrowley
38
3k
How People are Using Generative and Agentic AI to Supercharge Their Products, Projects, Services and Value Streams Today
helenjbeal
1
120
Transcript
How to collect large scale data using Javascript seo local
| reviews | pesquisas
None
Agora a experiência é o novo marketing
Somos a Harmo, a plataforma de marketing de experiência mais
completa do Brasil.
SEO Local A única plataforma 3 x 1 do Brasil
Faça a gestão da presença digital da sua rede de lojas e seja encontrado no topo do ranking das pesquisas de forma 100% orgânica. 1 2 3
Reviews A única plataforma 3 x 1 do Brasil Colete,
analise e responda todos os reviews dos seus clientes, conquiste a confiança do consumidor e seja a marca escolhida. 1 2 3
A única plataforma 3 x 1 do Brasil Pesquisas multimétricas
para medir a experiência do cliente durante toda a jornada. Identifique promotores e ative o programa de indicação de reviews. Pesquisas 1 2 3
Harmo, uma poderosa máquina de geração de ROI. Escute, interaja,
analise e atue focado nos anseios dos clientes, durante toda a jornada, transformando os seus clientes no principal canal de aquisição de novos clientes.
Grandes marcas atestam a qualidade da nossa plataforma e metodologia
com foco em resultados
NUMBERS Establishments +30k Reviews +15kk Integrations +54k Emails +6,6kk SMS
+250k Answer of Review +1kk
▷ Distributed Process ▷ Scrapping vs Crawlers ▷ Some Concepts
▷ Why Javascript? ▷ Architecture for Scale ▷ Lessons Learning ▷ Example ▷ Conclusion Topics
Distributed Process
None
Scraping vs Crawlers
None
Collector Concepts
Be "Browserless"
Recursion is your friends
Single Responsability
Normalize Data (input & output)
Code reuse with packages
Collector !== Processor
Why Javascript?
Use native streams
Dynamic typing
Do more with less
Most used in the world
Architecture for Scale
None
None
None
Lessons Learning
Use code-base version alert
Code reuse with packages
Create E2E tests from the begin
Be "Browserless"
Use Puppeteer *reduce images
None
None
Use Promise.all
None
Use monorepos
Otherwise it will be chaos
None
None
▷ Web Scraping vs Web Crawling: The Differences ▷ HOW
TO RUN ASYNC JAVASCRIPT FUNCTIONS IN SEQUENCE OR PARALLEL Links
Collector Example
None
None
Leonardo Rifeli | CTO
[email protected]
harmo.me seo local | reviews
| pesquisas