Upgrade to PRO for Only $50/Year—Limited-Time Offer! 🔥
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
How to collect large scale data using Javascript
Search
Leonardo Rifeli
June 02, 2022
Programming
0
59
How to collect large scale data using Javascript
Leonardo Rifeli
June 02, 2022
Tweet
Share
More Decks by Leonardo Rifeli
See All by Leonardo Rifeli
Acate: Processamento distribuído - Como processamos milhões de dados diariamente
leonardorifeli
0
29
Reviewr Data Consolidation Case
leonardorifeli
0
120
Distributed processing: How we process millions of data daily with EMR
leonardorifeli
0
49
Building Crawlers with serverless
leonardorifeli
0
78
Other Decks in Programming
See All in Programming
AI 駆動開発ライフサイクル(AI-DLC):ソフトウェアエンジニアリングの再構築 / AI-DLC Introduction
kanamasa
11
3.5k
マスタデータ問題、マイクロサービスでどう解くか
kts
0
110
まだ間に合う!Claude Code元年をふりかえる
nogu66
5
880
Cell-Based Architecture
larchanjo
0
140
perlをWebAssembly上で動かすと何が嬉しいの??? / Where does Perl-on-Wasm actually make sense?
mackee
0
110
GISエンジニアから見たLINKSデータ
nokonoko1203
0
180
Navigating Dependency Injection with Metro
l2hyunwoo
1
170
ELYZA_Findy AI Engineering Summit登壇資料_AIコーディング時代に「ちゃんと」やること_toB LLMプロダクト開発舞台裏_20251216
elyza
2
550
Combinatorial Interview Problems with Backtracking Solutions - From Imperative Procedural Programming to Declarative Functional Programming - Part 2
philipschwarz
PRO
0
110
AIの誤りが許されない業務システムにおいて“信頼されるAI” を目指す / building-trusted-ai-systems
yuya4
6
3.9k
Full-Cycle Reactivity in Angular: SignalStore mit Signal Forms und Resources
manfredsteyer
PRO
0
160
Findy AI+の開発、運用におけるMCP活用事例
starfish719
0
1.7k
Featured
See All Featured
RailsConf & Balkan Ruby 2019: The Past, Present, and Future of Rails at GitHub
eileencodes
141
34k
Agile Leadership in an Agile Organization
kimpetersen
PRO
0
49
The innovator’s Mindset - Leading Through an Era of Exponential Change - McGill University 2025
jdejongh
PRO
1
66
The MySQL Ecosystem @ GitHub 2015
samlambert
251
13k
Stop Working from a Prison Cell
hatefulcrawdad
273
21k
The Organizational Zoo: Understanding Human Behavior Agility Through Metaphoric Constructive Conversations (based on the works of Arthur Shelley, Ph.D)
kimpetersen
PRO
0
200
10 Git Anti Patterns You Should be Aware of
lemiorhan
PRO
659
61k
Deep Space Network (abreviated)
tonyrice
0
20
Done Done
chrislema
186
16k
Jess Joyce - The Pitfalls of Following Frameworks
techseoconnect
PRO
1
25
Lightning talk: Run Django tests with GitHub Actions
sabderemane
0
90
Exploring the relationship between traditional SERPs and Gen AI search
raygrieselhuber
PRO
2
3.4k
Transcript
How to collect large scale data using Javascript seo local
| reviews | pesquisas
None
Agora a experiência é o novo marketing
Somos a Harmo, a plataforma de marketing de experiência mais
completa do Brasil.
SEO Local A única plataforma 3 x 1 do Brasil
Faça a gestão da presença digital da sua rede de lojas e seja encontrado no topo do ranking das pesquisas de forma 100% orgânica. 1 2 3
Reviews A única plataforma 3 x 1 do Brasil Colete,
analise e responda todos os reviews dos seus clientes, conquiste a confiança do consumidor e seja a marca escolhida. 1 2 3
A única plataforma 3 x 1 do Brasil Pesquisas multimétricas
para medir a experiência do cliente durante toda a jornada. Identifique promotores e ative o programa de indicação de reviews. Pesquisas 1 2 3
Harmo, uma poderosa máquina de geração de ROI. Escute, interaja,
analise e atue focado nos anseios dos clientes, durante toda a jornada, transformando os seus clientes no principal canal de aquisição de novos clientes.
Grandes marcas atestam a qualidade da nossa plataforma e metodologia
com foco em resultados
NUMBERS Establishments +30k Reviews +15kk Integrations +54k Emails +6,6kk SMS
+250k Answer of Review +1kk
▷ Distributed Process ▷ Scrapping vs Crawlers ▷ Some Concepts
▷ Why Javascript? ▷ Architecture for Scale ▷ Lessons Learning ▷ Example ▷ Conclusion Topics
Distributed Process
None
Scraping vs Crawlers
None
Collector Concepts
Be "Browserless"
Recursion is your friends
Single Responsability
Normalize Data (input & output)
Code reuse with packages
Collector !== Processor
Why Javascript?
Use native streams
Dynamic typing
Do more with less
Most used in the world
Architecture for Scale
None
None
None
Lessons Learning
Use code-base version alert
Code reuse with packages
Create E2E tests from the begin
Be "Browserless"
Use Puppeteer *reduce images
None
None
Use Promise.all
None
Use monorepos
Otherwise it will be chaos
None
None
▷ Web Scraping vs Web Crawling: The Differences ▷ HOW
TO RUN ASYNC JAVASCRIPT FUNCTIONS IN SEQUENCE OR PARALLEL Links
Collector Example
None
None
Leonardo Rifeli | CTO
[email protected]
harmo.me seo local | reviews
| pesquisas