Slide 1

Slide 1 text

Introduction to Druid, a fast distributed data store. Nikita Salnikov-Tarnovski, @iNikem

Slide 2

Slide 2 text

Why me? • I introduced Druid to a team • We are happy about it • I like their design choices • but not their code :(

Slide 3

Slide 3 text

Problem • Monitor what end-users are doing • Monitor what servers are doing • Put it together

Slide 4

Slide 4 text

Questions to ask • What was the slowest service during the last flash sale? • Which SQL query has the biggest impact on user satisfaction? • Who are my most unhappy users this week? • Are we getting better?

Slide 5

Slide 5 text

Data to collect

{
  "accountId": "XXXX",
  "transactionId": "9b6bbb93-0f64-389b-beae-ccd294f2286d",
  "jvmId": ["H0QXvFsZ"],
  "originatingJvm": "H0QXvFsZ",
  "applicationKey": "535624dd815fb8762c378ac6b15937dc",
  "rootCause": [454708],
  "problemId": ["454708:600492565"],
  "problemsDuration": 4572,
  "userId": 42,
  "transactionStart": "1493548894024",
  "transactionDuration": "5432",
  "success": "0",
  "slow": "0",
  "failed": "1",
  "status": "failed",
  "serviceId": "6d17705ebf2724d96da48cc349e6c12d",
  "jobId": null,
  "isBrowser": false,
  "browserAgent": "MSIE",
  "country": "US"
}

{
  "accountId": "XXXX",
  "jvmId": "YYY",
  "timestamp": 1493548969876,
  "allocationRate": 14968164,
  "usedMemHeap": 574524040,
  "usedMemNative": 1125826560,
  "usedPermGen": 64078208
}

Slide 6

Slide 6 text

Data point • Timestamp • Dimensions: who, where, what; the means to select a subset of data • Metrics: how many; the measured values you are interested in
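The timestamp / dimensions / metrics split can be sketched as a small data structure. This is only an illustration, not Druid's internal model: the `DataPoint` class and the `to_data_point` helper are hypothetical names, and the choice of which fields count as dimensions or metrics is an assumption made for the JVM event from the previous slide.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class DataPoint:
    """Hypothetical model of one data point: when, who/where/what, how much."""
    timestamp_ms: int
    dimensions: Dict[str, str]   # used to select a subset of data
    metrics: Dict[str, float]    # measured values to aggregate


def to_data_point(event: dict, dimension_keys: Iterable[str],
                  metric_keys: Iterable[str]) -> DataPoint:
    """Split a raw event into timestamp, dimensions and metrics."""
    return DataPoint(
        timestamp_ms=int(event["timestamp"]),
        dimensions={k: str(event[k]) for k in dimension_keys},
        metrics={k: float(event[k]) for k in metric_keys},
    )


jvm_event = {
    "accountId": "XXXX",
    "jvmId": "YYY",
    "timestamp": 1493548969876,
    "allocationRate": 14968164,
    "usedMemHeap": 574524040,
}

# Assumption: account and JVM ids are dimensions, memory figures are metrics.
point = to_data_point(jvm_event,
                      dimension_keys=["accountId", "jvmId"],
                      metric_keys=["allocationRate", "usedMemHeap"])
```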

Slide 7

Slide 7 text

What smart people said • You need a columnar DB!

Slide 8

Slide 8 text

Row-based format • Easy to insert new records • Faster to read a full record • Source: http://cs-www.cs.yale.edu/homes/dna/talks/abadi-sigmod08-slides.pdf

Slide 9

Slide 9 text

Column-based format • More expensive inserts • Can read only relevant data • Source: http://cs-www.cs.yale.edu/homes/dna/talks/abadi-sigmod08-slides.pdf
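The trade-off between the two formats can be shown with a toy in-memory example. This is a minimal sketch using plain Python lists and dicts, not how any real database lays out data on disk; the sample values and function names are made up for illustration.

```python
# Row-based: one complete record per entry.
rows = [
    {"jvmId": "A", "usedMemHeap": 100, "allocationRate": 10},
    {"jvmId": "B", "usedMemHeap": 300, "allocationRate": 20},
    {"jvmId": "A", "usedMemHeap": 200, "allocationRate": 30},
]

# Column-based: one list per field, aligned by position.
columns = {key: [row[key] for row in rows] for key in rows[0]}


def insert_row_based(table: list, record: dict) -> None:
    """A single append stores the whole record."""
    table.append(record)


def insert_column_based(table: dict, record: dict) -> None:
    """Every column has to be touched: more expensive inserts."""
    for key, value in record.items():
        table[key].append(value)


def max_heap_row_based(table: list) -> int:
    """Walks every full record even though only one field is needed."""
    return max(record["usedMemHeap"] for record in table)


def max_heap_column_based(table: dict) -> int:
    """Reads only the relevant column."""
    return max(table["usedMemHeap"])


new_record = {"jvmId": "C", "usedMemHeap": 400, "allocationRate": 5}
insert_row_based(rows, new_record)
insert_column_based(columns, new_record)
```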

Slide 10

Slide 10 text

Columnar databases we tried • InfluxDB • MonetDB • Druid

Slide 11

Slide 11 text

Results • InfluxDB • MonetDB • Druid

Slide 12

Slide 12 text

Why? • Stability issues • We have expertise in neither Go nor C

Slide 13

Slide 13 text

Druid • druid.io • Open source • Active community • Open for extensions

Slide 14

Slide 14 text

Imply • imply.io • Packages and supports Druid • Add-ons: Pivot, Plywood and PlyQL

Slide 15

Slide 15 text

Druid tribe https://imply.io/docs/latest/

Slide 16

Slide 16 text

Practical implications + Failure of a single node does not affect you much - Very high operational overhead

Slide 17

Slide 17 text

Data storage • All data is stored in files called “segments” • Each segment contains all the information for some period of time, including indices and dictionaries • Immutable columnar format • Can be further sharded

Slide 18

Slide 18 text

Practical implications + Easy to distribute - Cannot update individual records; you have to rebuild and replace the whole segment
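What "rebuild and replace the whole segment" means in practice can be sketched with immutable tuples. This is a toy model only: the `rebuild_segment` function, the interval keys and the record fields are hypothetical, and real Druid segments are files in a columnar format, not Python tuples.

```python
from typing import Callable, Dict, Tuple

# A segment is modelled as an immutable tuple of records for one time interval.
Segment = Tuple[dict, ...]


def rebuild_segment(segments: Dict[str, Segment], interval: str,
                    fix: Callable[[dict], dict]) -> Dict[str, Segment]:
    """Return a new segment map where one interval's segment is rebuilt.

    Records are never edited in place: every record of the old segment is
    copied, passed through `fix`, and the whole segment is swapped at once.
    """
    rebuilt = tuple(fix(dict(record)) for record in segments[interval])
    updated = dict(segments)
    updated[interval] = rebuilt
    return updated


segments = {
    "2017-04-30T10/2017-04-30T11": (
        {"userId": 42, "country": "US"},
        {"userId": 7, "country": "UK"},
    ),
}


def fix_country(record: dict) -> dict:
    """Correct a single record; all others pass through unchanged."""
    if record["userId"] == 7:
        record["country"] = "GB"
    return record


new_segments = rebuild_segment(segments, "2017-04-30T10/2017-04-30T11", fix_country)
```

Even a one-record correction costs a full pass over the segment, which is why the slide lists this as a drawback.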

Slide 19

Slide 19 text

Data distribution • Segments are held in deep storage (HDFS, S3, Azure, Google Cloud, Cassandra, etc.) • Coordinator tells each historical node what to load • Historicals can be organised in tiers

Slide 20

Slide 20 text

Distributed query
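The broker's role in a distributed query is scatter-gather: it sends the query to the historicals, each answers for the segments it holds, and the broker merges the partial results. Below is a minimal sketch of that idea. The `Historical` class and `broker_sum` function are hypothetical, and the sketch assumes each segment lives on exactly one node, so it ignores replication and segment pruning.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Set


class Historical:
    """Toy historical node holding some segments, keyed by hour."""

    def __init__(self, segments: Dict[str, List[int]]):
        self.segments = segments

    def partial_sum(self, hours: Set[str]) -> int:
        """Aggregate only the locally held segments that match the query."""
        return sum(sum(values) for hour, values in self.segments.items()
                   if hour in hours)


def broker_sum(historicals: List[Historical], hours: Set[str]) -> int:
    """Scatter the query to all historicals, then merge the partial sums."""
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(lambda node: node.partial_sum(hours), historicals))
    return sum(partials)


nodes = [
    Historical({"10:00": [1, 2, 3], "11:00": [10]}),
    Historical({"12:00": [100, 200]}),
]

total = broker_sum(nodes, {"10:00", "12:00"})
```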

Slide 21

Slide 21 text

Practical implications + Any single historical can die without any impact + Coordinator can die with very little impact + Separate hot and cold data + Trade money for speed - None for historical :) - Broker is a single point of failure!

Slide 22

Slide 22 text

Queries • SQL-like PlyQL • JSON over HTTP • We have built a small DSL over that JSON format
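To give a feel for "JSON over HTTP" and for what a small DSL over it might look like, here is a sketch in Python. It is not the DSL from the talk: the helper names (`long_sum`, `selector`, `timeseries`) and the `transactions` data source are made up, and it assumes `failed` was ingested as a numeric metric. The resulting JSON follows Druid's native timeseries query format and would be POSTed to the broker's `/druid/v2/` endpoint.

```python
import json
from typing import List, Optional


def long_sum(name: str, field: str) -> dict:
    """Aggregator that sums a numeric metric column."""
    return {"type": "longSum", "name": name, "fieldName": field}


def selector(dimension: str, value: str) -> dict:
    """Filter that keeps rows where a dimension equals a value."""
    return {"type": "selector", "dimension": dimension, "value": value}


def timeseries(data_source: str, interval: str, granularity: str,
               aggregations: List[dict], where: Optional[dict] = None) -> dict:
    """Build the body of a native timeseries query."""
    query = {
        "queryType": "timeseries",
        "dataSource": data_source,
        "granularity": granularity,
        "intervals": [interval],
        "aggregations": aggregations,
    }
    if where is not None:
        query["filter"] = where
    return query


# Hypothetical question: failed transactions per hour for US users on one day.
query = timeseries(
    data_source="transactions",
    interval="2017-04-30/2017-05-01",
    granularity="hour",
    aggregations=[long_sum("failed", "failed")],
    where=selector("country", "US"),
)

body = json.dumps(query, indent=2)  # this string is the HTTP request body
```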

Slide 23

Slide 23 text

Timeseries

Slide 24

Slide 24 text

GroupBy

Slide 25

Slide 25 text

TopN
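A TopN query answers questions of the "who are my most unhappy users this week?" kind: rank the values of one dimension by an aggregated metric and keep the first few. The sketch below builds such a query body in Druid's native JSON format and emulates its meaning locally on a handful of rows. The data source name, the interval and the sample rows are invented, `failed` is assumed to be a numeric metric, and the local emulation is exact whereas Druid's TopN is an approximate algorithm.

```python
from collections import Counter
from typing import List, Tuple


def top_n_query(data_source: str, dimension: str, metric: str,
                threshold: int, interval: str) -> dict:
    """Build the body of a native topN query ranking `dimension` by `metric`."""
    return {
        "queryType": "topN",
        "dataSource": data_source,
        "dimension": dimension,
        "threshold": threshold,
        "metric": metric,
        "granularity": "all",
        "intervals": [interval],
        "aggregations": [
            {"type": "longSum", "name": metric, "fieldName": metric},
        ],
    }


def top_n_local(rows: List[dict], dimension: str, metric: str,
                threshold: int) -> List[Tuple[object, int]]:
    """Exact local emulation: sum the metric per dimension value, keep the top."""
    totals = Counter()
    for row in rows:
        totals[row[dimension]] += row[metric]
    return totals.most_common(threshold)


rows = [
    {"userId": 42, "failed": 1},
    {"userId": 7, "failed": 0},
    {"userId": 42, "failed": 1},
    {"userId": 13, "failed": 1},
]

query = top_n_query("transactions", "userId", "failed", 10,
                    "2017-04-24/2017-05-01")
worst = top_n_local(rows, "userId", "failed", 2)
```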

Slide 26

Slide 26 text

Practical implications + None :) - No joins - JavaScript functions or custom extensions

Slide 27

Slide 27 text

Benchmarks are lies! • These are all the requests to our Druid over some period • They say nothing about performance in your case

Slide 28

Slide 28 text

Data ingestion • Files/Hadoop • Stream push via Tranquility • Stream poll via Kafka

Slide 29

Slide 29 text

http://druid.io/docs/0.10.0/design/indexing-service.html

Slide 30

Slide 30 text

Kafka indexing • m4.xlarge, 16GB RAM, 4vCPU • 20-60K/sec per partition, ~ 1.5-5B/day
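The per-day figures follow from the per-second rate. A quick check, assuming the 20-60K figure means events per second per partition and that the rate is sustained for a full day:

```python
SECONDS_PER_DAY = 24 * 60 * 60


def events_per_day(events_per_second: int) -> int:
    """Scale a sustained per-second rate to a full day."""
    return events_per_second * SECONDS_PER_DAY


low = events_per_day(20_000)    # 1_728_000_000, about 1.7 billion
high = events_per_day(60_000)   # 5_184_000_000, about 5.2 billion
```

Both ends land close to the rounded 1.5-5B/day range quoted on the slide.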

Slide 31

Slide 31 text

Querying recent data

Slide 32

Slide 32 text

Practical implications + “Guaranteed” exactly-once delivery + New data is available immediately - Complex machinery which easily breaks

Slide 33

Slide 33 text

Roll-up • We collect data every 5 seconds • But query with 1-minute granularity • We can aggregate data during indexing • usedMemHeap -> max(usedMemHeap) • allocationRate -> avg(allocationRate)

{
  "accountId": "XXXX",
  "jvmId": "YYY",
  "timestamp": 1493548969876,
  "allocationRate": 14968164,
  "usedMemHeap": 574524040,
  "usedMemNative": 1125826560,
  "usedPermGen": 64078208
}
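Roll-up can be sketched as grouping samples by their dimensions plus a truncated timestamp and keeping only aggregates. This is a plain-Python illustration, not Druid's indexing code. The `roll_up` function and the sample values are made up, and the average is kept as a sum plus a count, which is one common way to make an average mergeable.

```python
from typing import Dict, List, Tuple

Key = Tuple[str, str, int]  # (accountId, jvmId, start of minute in ms)


def roll_up(samples: List[dict], bucket_ms: int = 60_000) -> Dict[Key, dict]:
    """Aggregate raw samples into one row per (account, JVM, minute)."""
    buckets: Dict[Key, dict] = {}
    for sample in samples:
        # Truncate the timestamp to the start of its one-minute bucket.
        minute = sample["timestamp"] - sample["timestamp"] % bucket_ms
        key = (sample["accountId"], sample["jvmId"], minute)
        row = buckets.setdefault(
            key, {"count": 0, "usedMemHeapMax": 0, "allocationRateSum": 0})
        row["count"] += 1
        row["usedMemHeapMax"] = max(row["usedMemHeapMax"], sample["usedMemHeap"])
        row["allocationRateSum"] += sample["allocationRate"]
    return buckets


def avg_allocation_rate(row: dict) -> float:
    """Derive the average at query time from the stored sum and count."""
    return row["allocationRateSum"] / row["count"]


# Twelve samples, one every 5 seconds, all within the same minute.
minute_start = 1493548920000
samples = [
    {"accountId": "XXXX", "jvmId": "YYY",
     "timestamp": minute_start + i * 5_000,
     "usedMemHeap": 100 + i, "allocationRate": 10 * i}
    for i in range(12)
]

rolled = roll_up(samples)
row = rolled[("XXXX", "YYY", minute_start)]
```

Twelve raw rows collapse into one, at the price of no longer being able to see the individual 5-second samples.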

Slide 34

Slide 34 text

Roll-up results • 1B records • 484 GB in Kafka, uncompressed JSON • 185 GB with unique ids • 9.18 GB rolled-up data without ids
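The size reductions implied by these numbers are easy to work out. The snippet only divides the figures from the slide; it reads "with unique ids" as the stored size when per-record ids are kept, which is an interpretation rather than something the slide states.

```python
kafka_json = 484.0    # uncompressed JSON in Kafka
with_ids = 185.0      # stored with unique ids
rolled_up = 9.18      # rolled-up data without ids

# All three figures share one unit, so the ratios are unit-free.
json_vs_rolled = kafka_json / rolled_up   # roughly 53x smaller than raw JSON
ids_vs_rolled = with_ids / rolled_up      # roughly 20x smaller than with ids
json_vs_ids = kafka_json / with_ids       # roughly 2.6x from storage format alone
```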

Slide 35

Slide 35 text

Practical implications + Huge savings in size - Lost individuality

Slide 36

Slide 36 text

Take away • We are quite happy with it :) • A good tool for a quite narrow problem

Slide 37

Slide 37 text

Pros • Works OK even without much tuning • Extensible • Easy to change “schema”

Slide 38

Slide 38 text

Cons • Has operational overhead • Effectively non-updatable • Good only for quite specific queries

Slide 39

Slide 39 text

Solving performance problems is hard. We don’t think it needs to be. @JavaPlumbr/@iNikem http://plumbr.eu