Slide 23
Life of a Tweet
What open source technology do we use behind the scenes when we tweet?
[Diagram labels: tweet, fanout, write, search, batch]
Hadoop is used for many things at Twitter, like counting words :)
Scribe logs, batch processing, recommendations, trends, user modeling, and more!
10,000+ Hadoop servers, 100,000+ daily Hadoop jobs, 10M+ daily Hadoop tasks
Parquet is a columnar storage format for Hadoop
https://parquet.incubator.apache.org
Scalding is our Scala DSL for writing Hadoop jobs
https://github.com/twitter/scalding
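
Since the slide jokes about counting words, here is a minimal sketch of a word-count job in Scalding's typed API, modeled on the example in the Scalding README. The class name `WordCountJob`, the `tokenize` helper, and the `input`/`output` argument names are illustrative assumptions, not Twitter's production code, and the job needs the Scalding library on the classpath.

```scala
import com.twitter.scalding._

// Hypothetical example job: counts how often each word appears in a text file.
class WordCountJob(args: Args) extends Job(args) {
  TypedPipe.from(TextLine(args("input")))      // read the input one line at a time
    .flatMap { line => tokenize(line) }        // split each line into words
    .groupBy { word => word }                  // use the word itself as the key
    .size                                      // count the entries in each group
    .write(TypedTsv[(String, Long)](args("output"))) // emit (word, count) rows

  // Illustrative helper: lowercase, drop punctuation, split on whitespace.
  def tokenize(text: String): Array[String] =
    text.toLowerCase.replaceAll("[^a-zA-Z0-9\\s]", "").split("\\s+")
}
```

A job like this is typically launched through Scalding's `Tool` runner, in local or HDFS mode, with `--input` and `--output` paths supplied as arguments.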
fin