Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Timothy
Search
abhinay
June 28, 2012
Programming
2
1.5k
Timothy
Write Hadoop Jobs in NodeJS by Antonio Garrote and Abhinay Mehta
abhinay
June 28, 2012
Tweet
Share
Other Decks in Programming
See All in Programming
Java on Azure で LangGraph!
kohei3110
0
170
プロダクト志向ってなんなんだろうね
righttouch
PRO
0
120
Railsアプリケーションと パフォーマンスチューニング ー 秒間5万リクエストの モバイルオーダーシステムを支える事例 ー Rubyセミナー 大阪
falcon8823
3
830
Composerが「依存解決」のためにどんな工夫をしているか #phpcon
o0h
PRO
1
130
Benchmark
sysong
0
230
Webの外へ飛び出せ NativePHPが切り拓くPHPの未来
takuyakatsusa
1
200
20250628_非エンジニアがバイブコーディングしてみた
ponponmikankan
0
200
Go1.25からのGOMAXPROCS
kuro_kurorrr
1
790
deno-redisの紹介とJSRパッケージの運用について (toranoana.deno #21)
uki00a
0
130
なぜ「共通化」を考え、失敗を繰り返すのか
rinchoku
1
420
都市をデータで見るってこういうこと PLATEAU属性情報入門
nokonoko1203
1
550
つよそうにふるまい、つよい成果を出すのなら、つよいのかもしれない
irof
1
300
Featured
See All Featured
The Cult of Friendly URLs
andyhume
79
6.4k
How to Create Impact in a Changing Tech Landscape [PerfNow 2023]
tammyeverts
53
2.8k
CoffeeScript is Beautiful & I Never Want to Write Plain JavaScript Again
sstephenson
161
15k
Chrome DevTools: State of the Union 2024 - Debugging React & Beyond
addyosmani
7
700
KATA
mclloyd
29
14k
Making the Leap to Tech Lead
cromwellryan
134
9.3k
The Language of Interfaces
destraynor
158
25k
Refactoring Trust on Your Teams (GOTO; Chicago 2020)
rmw
34
3k
The Success of Rails: Ensuring Growth for the Next 100 Years
eileencodes
45
7.4k
Building Flexible Design Systems
yeseniaperezcruz
328
39k
Faster Mobile Websites
deanohume
307
31k
Helping Users Find Their Own Way: Creating Modern Search Experiences
danielanewman
29
2.7k
Transcript
Timothy https://github.com/forward/timothy Antonio Garrote Abhinay Mehta
Hadoop MapReduce in Node.js
Hadoop • Distributed processing of large data • Derived from
Google MapReduce and GFS • Fast becoming the de facto standard • Large ecosystem • Java • Master/Slave setup
Hadoop Architecture HDFS MapReduce Output Input
• Open Source • Uses Hadoop Streaming API • No
binaries • NPM support Timothy
$ npm install timothy
require('timothy') .configure({ config: './hadoop.xml', input: '/tmp/loremipsum.txt', output: '/tmp/wordcount/', name: 'Timothy
Word Count Example' }) .map(function(line){ line.split(" ").forEach(function(word) { emit(word, 1); }); }) .reduce(function(word,counts){ emit(word, counts.length); }) .run(function(err){ .. }); Word Count
require('timothy') .map(function(line){ line.split(" ").forEach(function(word) { emit(word, 1); }); }) .reduce(function(word,counts){
emit(word, counts.length); }) .runLocal("/local/input/path"); Local Runner
require('timothy') .map(function(line){ line.split(" ").forEach(function(word) { emit(word, 1); }); }) .reduce(function(word,counts){
emit(word, counts.length); }) .runLocal("/local/input/path"); Local Runner
Dependencies require('timothy') .configure({ ... }) .dependencies({'string' : '0.2.1-2'}) .map(function(line){ var
S = require('string'); line.split(" ").forEach(function(word) { if (S(word).isAlphaNumeric()) { emit(word, 1); } }); }) ...
Dependencies require('timothy') .configure({ ... }) .dependencies({'string' : '0.2.1-2'}) .map(function(line){ var
S = require('string'); line.split(" ").forEach(function(word) { if (S(word).isAlphaNumeric()) { emit(word, 1); } }); }) ...
Setup require('timothy') .configure({ ... }) .dependencies({'string' : '0.2.1-2'}) .map(function(line){ line.split("
").forEach(function(word) { if (S(word).isAlphaNumeric()) { emit(word, 1); } }); }) ... .setup(function() { S = require('string'); })
Setup require('timothy') .configure({ ... }) .dependencies({'string' : '0.2.1-2'}) .map(function(line){ line.split("
").forEach(function(word) { if (S(word).isAlphaNumeric()) { emit(word, 1); } }); }) ... .setup(function() { S = require('string'); })
require('timothy') .configure({ ... }) .map(function(line){ emit(line, 1); }) .reduce(function(line,counts){ emit(line,
counts.length); }) .map(function(line, count){ emit(line[0], count); }) .reduce(function(letter, counts){ var sum = counts.reduce(function(a,i) { return a+i; }); emit(letter, sum); }) .run(); Method Chaining
• Update Job Status • Create and update counters •
Pass env vars to jobs • More examples on github page Other features
Motivation • Big data is now a thing • Lower
the barrier to entry • Benefits of NodeJS on Hadoop • Development Speed
Limitations • Setup method cannot block • Lack support for
lexical scoping • NodeJS needs to be pre-installed on slaves • Probably more we haven’t thought of yet!
Improvements • Bundling local JS scripts • JSON for intermediary
data format • JVM support
Thank you! https://github.com/forward/timothy