This is a presentation I gave on Kafka (http://incubator.apache.org/kafka/), a persistent, distributed messaging system developed at LinkedIn.
I gave it at one of the monthly tech presentation evenings in our office.
Kafka: A little introduction
Pub-Sub Messaging System
Distributed
Performance
Disk/Memory Performance
Source: http://queue.acm.org/detail.cfm?id=1563874
[Chart: read values/second on a log scale from 1 to 1000M, comparing random access and sequential access for disk, SSD, and memory]
Sequential disk read faster than random memory read
Persistent
Length    Magic Value   Checksum   Payload
4 bytes   1 byte        4 bytes    n bytes
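To make the on-disk layout above concrete, here is a minimal Python sketch that packs and unpacks one message with those four fields. The slide only gives the field names and sizes, so several details here are assumptions: big-endian byte order, a CRC32 checksum computed over the payload only, and a length field that counts the bytes following it (magic value + checksum + payload). The function names `encode_message` and `decode_message` are made up for this illustration.

```python
import struct
import zlib

# Length (4 bytes), magic value (1 byte), checksum (4 bytes).
# Big-endian with no padding is an assumption, not taken from the slide.
HEADER = struct.Struct(">IBI")


def encode_message(payload: bytes, magic: int = 0) -> bytes:
    """Serialize one message as length | magic | checksum | payload."""
    checksum = zlib.crc32(payload) & 0xFFFFFFFF  # assumed: CRC32 of the payload
    length = 1 + 4 + len(payload)  # assumed: counts everything after the length field
    return HEADER.pack(length, magic, checksum) + payload


def decode_message(buf: bytes, offset: int = 0):
    """Read one message at `offset`; return (magic, payload, next_offset)."""
    length, magic, checksum = HEADER.unpack_from(buf, offset)
    start = offset + HEADER.size
    end = offset + 4 + length
    payload = buf[start:end]
    if len(payload) != length - 5:
        raise ValueError("truncated message")
    if zlib.crc32(payload) & 0xFFFFFFFF != checksum:
        raise ValueError("checksum mismatch")
    return magic, payload, end
```

Because every message carries its own length, a reader can walk a file of concatenated messages sequentially by feeding each returned `next_offset` back into `decode_message`, which fits the sequential-access point made on the performance slides.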
Input: Token (Offset: 0, Broker: kafka.local, Topic: Testing)
MR Job
Output: Token (Offset: 130098, Broker: kafka.local, Topic: Testing)
Output: Sequence File
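One way to read the diagram above: each job run starts from the offset stored in an input token, consumes whatever is available, and writes an output token with the offset where the next run should resume. The Python sketch below illustrates that loop under loud assumptions. `Token`, `make_fetch`, and `run_job` are hypothetical names, the in-memory list stands in for a real broker, and treating the offset as a byte position in the log is an assumption rather than something the slide states.

```python
from dataclasses import dataclass, replace
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Token:
    """Where a job should start reading (hypothetical structure)."""
    broker: str
    topic: str
    offset: int


def make_fetch(log: List[bytes]) -> Callable[[Token], Tuple[List[bytes], int]]:
    """Build a stand-in for a broker fetch, backed by an in-memory list."""
    def fetch(token: Token) -> Tuple[List[bytes], int]:
        records = []
        position = 0  # assumed: offsets are byte positions in the log
        for payload in log:
            if position >= token.offset:
                records.append(payload)
            position += len(payload)
        return records, position
    return fetch


def run_job(token: Token, fetch) -> Tuple[List[bytes], Token]:
    """Consume from token.offset; return the records and the next token."""
    records, next_offset = fetch(token)
    return records, replace(token, offset=next_offset)
```

Feeding the output token of one run in as the input token of the next means each run only sees messages it has not processed before, as long as the token is stored alongside the job output.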
Useful Things
• http://incubator.apache.org/kafka/
• https://github.com/pingles/clj-kafka