
Apache Kudu, the new big data antelope


Apache Hadoop and its distributed file system are probably the most representative tools in the big data area. They have already democratised distributed workloads on large datasets for hundreds of companies, in Paris alone. But these workloads are append-only batches. However, life in companies can't be described by fast-scan systems alone. We need writes but also updates. We need fast random access and mutable data on huge datasets too. And when it comes to supporting continuous ingestion of events, mixing HDFS and a column-oriented database (e.g., HBase or Cassandra) within the same architecture can be complicated. To solve all these issues, I will present Apache Kudu, a fast analytics storage layer.


Loïc DIVAD

April 10, 2017

Transcript

  1. None
  2. > println(sommaire) 1. Why Kudu ? 2. The Overview 3.

    Kudu in practice 2
  3. Why Kudu ? Motivation, history and design goals .... 3

  4. Why Kudu ? 4 • HDFS: ◦ Immutable data ◦

    Batch Ingestion ◦ Scan on BigData & analytics queries • HBase ◦ Mutable data ◦ Fast Random Access ◦ Find, write & update individual rows
  5. Why Kudu ? - The HDFS / HBase struggle 5

    • Hard to flush streaming data • Hard to perform data corrections • Hard to maintain compaction routines
  6. • Kudu: ◦ High throughput for big scans ◦ Low-latency

    for short accesses Why Kudu ? - To fill in the gap 6 Neither a better HBase nor a better HDFS.
  7. Why Kudu ? • One system (to rule them all)

    • New data available immediately • Good for late arrivals and data correction 7
  8. Why Kudu ? - For particular use-case • >20 Billion

    records / day and growing • Identify and resolve issues quickly (1h latency) • Can search for individual records 8 The Xiaomi use case: Service monitoring & troubleshooting tool Before Kudu
  9. Why Kudu ? - For particular use-case • Only one

    storage engine • No dumping routines • Late data arrivals made simple 9 The Xiaomi use case: Service monitoring & troubleshooting tool Now
  10. Kudu - Chronology 10

    Sep. 2015: The Kudu Paper • Dec. 2015: Apache Incubating • Jul. 2016: Apache Top Level Project • Feb. 2017: Joins Cloudera CDH 5.10
  11. The Overview Design, concepts & benchmarks 11

  12. It’s all about tables • Just like SQL ... •

    With a primary key (of one or more columns) • Strongly-typed columns ◦ int ◦ boolean ◦ timestamp ◦ float ◦ string 12 Source: The Kudu documentation
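To make the table model above concrete, here is a toy schema check in Python. This is purely illustrative: the names and structure are my own sketch, not the Kudu client API.

```python
# Toy model of a Kudu-style table: strongly typed columns plus a
# primary key of one or more columns. Illustrative sketch only; this
# is not the Kudu client API.

ALLOWED_TYPES = {"int", "boolean", "timestamp", "float", "string"}

def make_schema(columns, primary_key):
    """columns: {name: type}; primary_key: ordered list of column names."""
    for name, col_type in columns.items():
        if col_type not in ALLOWED_TYPES:
            raise ValueError(f"unsupported type {col_type!r} for column {name!r}")
    if not primary_key:
        raise ValueError("a primary key of one or more columns is required")
    for key_col in primary_key:
        if key_col not in columns:
            raise ValueError(f"primary key column {key_col!r} is not defined")
    return {"columns": dict(columns), "primary_key": list(primary_key)}

# A composite (two-column) primary key, as the slide allows.
schema = make_schema(
    {"id": "int", "ts": "timestamp", "amount": "float"},
    primary_key=["id", "ts"],
)
```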
  13. They are partitioned ... • Partitioned into tablets • Over

    a subset of the primary key • Tablet Servers host tablets on local disk drives • With 3 or 5 replicas each 13 Source: The Kudu documentation
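A rough sketch of the routing idea, assuming hash partitioning over the primary key (Kudu also supports range partitioning, and its real hash function differs; Python's built-in `hash` stands in purely for illustration):

```python
# Illustrative only: map a (possibly composite) primary key to one of
# num_tablets tablets. Kudu's actual partitioning is configured per
# table; this just shows why a point lookup touches a single tablet.

def tablet_for_key(primary_key: tuple, num_tablets: int) -> int:
    return hash(primary_key) % num_tablets

# The same key always routes to the same tablet.
t = tablet_for_key(("host-42", "2017-04-10T12:00:00"), 8)
```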
  14. … and replicated ! • The Master Server keeps track of

    each tablet • Each leader is responsible for log replication • Only inserts and updates send data over the network 14 Source: The Kudu documentation
  15. The Raft consensus The leader election is triggered by the

    election timeout • Amount of time before a node turns into a candidate • Randomized to be between 150ms and 300ms • Once the time is over, the new candidate starts an election term 15 Source: The Secret Life of Data
  16. The Raft consensus • The leader sends out append entries

    messages to its followers. • Followers do not accept direct writes from clients • The append entries are sent at each heartbeat interval 16 Source: The Secret Life of Data
  17. The Raft consensus • If a follower stops receiving heartbeats,

    it starts a new election term • It is now a new candidate • Once a new leader is elected, it acknowledges incoming entries • The consistency of the tablet is preserved 17 Source: The Secret Life of Data
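The follower behaviour described on these Raft slides can be sketched as a toy state machine. This is a simplification for illustration (no voting, no log), not Kudu's implementation; the 150-300ms range comes from the slide.

```python
import random

ELECTION_TIMEOUT_MS = (150, 300)  # range cited on the slide

def new_election_timeout(rng=random):
    """Randomized timeout so followers rarely time out at the same
    moment and split the vote."""
    lo, hi = ELECTION_TIMEOUT_MS
    return rng.uniform(lo, hi)

class Follower:
    def __init__(self):
        self.role = "follower"
        self.term = 1
        self.remaining_ms = new_election_timeout()

    def on_heartbeat(self):
        # An append-entries message from the leader resets the timer.
        self.remaining_ms = new_election_timeout()

    def tick(self, elapsed_ms):
        # Without heartbeats, the timer runs out: the node becomes a
        # candidate and starts a new election term.
        self.remaining_ms -= elapsed_ms
        if self.remaining_ms <= 0:
            self.role = "candidate"
            self.term += 1
```

A follower that misses heartbeats for longer than the maximum timeout (300ms) is guaranteed to become a candidate; one that keeps receiving them stays a follower.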
  18. The insert / read path • Inserts first go into

    a MemRowSet • Reads consult all of the existing DiskRowSets • 4KB Bloom filter pages cached in memory 18
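A minimal sketch of that read path: check the in-memory MemRowSet first, then consult each DiskRowSet, skipping any whose Bloom filter rules the key out. A Python set stands in for the cached Bloom filter pages (so, unlike a real Bloom filter, there are no false positives); this is an illustration, not Kudu's implementation.

```python
class DiskRowSet:
    def __init__(self, rows):
        self.rows = dict(rows)       # simulated on-disk data
        self.bloom = set(self.rows)  # stand-in for the 4KB Bloom filter pages
        self.disk_reads = 0

    def maybe_get(self, key):
        if key not in self.bloom:    # cheap in-memory check ...
            return None              # ... lets us skip the disk entirely
        self.disk_reads += 1         # only now pay for a disk access
        return self.rows.get(key)

def read(key, mem_row_set, disk_row_sets):
    if key in mem_row_set:           # recent inserts live here
        return mem_row_set[key]
    for drs in disk_row_sets:        # consult every DiskRowSet
        value = drs.maybe_get(key)
        if value is not None:
            return value
    return None
```

The point of the Bloom filters is visible in the counters: a DiskRowSet whose filter rejects the key records zero disk reads.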
  19. Benchmark 19

  20. Benchmark - Kudu vs Parquet Kudu outperforms Parquet by 31%

    TPC-H: - 75 server cluster - TPC-H Scale Factor 100GB - Enough RAM to fit dataset 20 Source: The Kudu White Paper --drop_log_memory false --global_log_cache_size_limit_mb
  21. Benchmark - Kudu vs Parquet 21

    SELECT n_name, sum(l_extendedprice * (1 - l_discount)) as revenue
    FROM customer, orders, lineitem, supplier, nation, region
    WHERE c_custkey = o_custkey
      AND l_orderkey = o_orderkey
      AND l_suppkey = s_suppkey
      AND c_nationkey = s_nationkey
      AND s_nationkey = n_nationkey
      AND n_regionkey = r_regionkey
      AND r_name = 'ASIA'
      AND o_orderdate >= date '1994-01-01'
      AND o_orderdate < date '1995-01-01'
    GROUP BY n_name
    ORDER BY revenue desc

    Kudu outperforms Parquet by 31% TPC-H: - 75 server cluster - TPC-H Scale Factor 100GB - Enough RAM to fit dataset
  22. Benchmark - Kudu vs HBase - HBase outperforms on a

    single query type - Kudu gets close when we mix different types of queries YCSB: 22 A 50% random-read, 50% update B 95% random-read, 5% update C 100% random read D 95% random read, 5% insert Source: The Cloudera White Paper
  23. Kudu in practice Installation, example & demo. 23

  24. Kudu - Getting started Installation : • Kudu Quickstart VM

    • Docker Images • CDS & Parcel Cloudera 24 Usage: • No Interactive Shell ... • Good integration with Impala • Simple web UI
  25. Kudu & Impala 25

    CREATE EXTERNAL TABLE default.mut_mut(
      id INT,
      amount STRING,
      -- SOME COLUMNS …
    )
    TBLPROPERTIES (
      'kudu.table_name'='mut_mut',
      'kudu.key_columns'='id,...,',
      'kudu.master_addresses'='127.0.0.1,...,',
      'storage_handler'='com.cloudera.kudu.hive.KuduStorageHandler'
    )
    -- STORED AS KUDU
  26. Kudu & Impala • impala> CREATE TABLE … PRIMARY KEY

    … STORED AS KUDU AS SELECT • impala> CREATE TABLE … AS SELECT * FROM … • impala> SELECT * FROM … TABLE … • impalad monitoring: ◦ http://<host>:<port>/query_plan?query_id=<id> 26
  27. Kudu & Apache Spark 27

    // spark-shell --packages org.apache.kudu:kudu-spark_2.10:1.1.0
    libraryDependencies += "org.apache.kudu" % "kudu-spark_2.10" % "1.2.0"

    import org.apache.kudu.spark.kudu._

    val masters = "<host>:<port>,<host2>:<port2>"
    val kuduContext = new KuduContext(masters)
    val df: DataFrame = sparkSession.sqlContext.read.options(
      Map("kudu.table" -> tableName, "kudu.master" -> masters)
    ).kudu
  28. Kudu & Kafka connect 28

    With the existing connector from Datamountaineer (Confluent 3.1+):

    {
      "name": "kudu-sink",
      "config": {
        "connector.class": "com.datamountaineer.streamreactor.connect.kudu.sink.KuduSinkConnector",
        "tasks.max": "2",
        "connect.kudu.master": "<host>:<port>",
        "connect.kudu.sink.kcql": "INSERT INTO kudu_tag SELECT * FROM kudu-tag",
        "topics": "<kudu-topic>"
      }
    }

    $ curl -X POST -H "Content-Type: application/json" --data <file.json> <connect.url>

    Or write a custom connector thanks to the Java Kudu API:

    <dependency>
      <groupId>org.apache.kudu</groupId>
      <artifactId>kudu-client</artifactId>
      <version>1.2.0</version>
    </dependency>
  29. Kudu in practice - The demo • AWS - EC2

    • Cloudera Express 5.10 • Confluent 3.2 • Kudu 1.2 29
  30. The demo The very very unexpected use-case 30

  31. The demo 31

  32. Conclusion Kudu : • Fast analytics on Fast data •

    Not a better HBase neither a better HDFS • A real citizen of Hadoop clusters 32
  33. 33 MERCI (thank you)

  34. 34

  35. Annex 35

  36. 36

  37. Question Cloudera ? • Spawn Cloudera Director + Manager (45min)

    • Configuration (15 min) • Add Confluent • Set up the app 37 • c4.large • m4.large • Confluent 3.2 • Kudu 1.2 • Java 1.8
  38. Question PMU use case Sales prediction for one hour. 38

    What does a PMU use case look like? (timeline chart of revenue, with marks at -3W, -1W, -1D and -3H, -2H, -1H)
  39. Question Data compression ? 39

  40. Question Impala ? 40

  41. Question Intel ? • 1984 NOR Flash memory • 1989

    NAND Flash memory • 2015 3D XPoint ◦ ~1000x faster ◦ ~10x denser 41
  42. Question Partition ? 42

  43. Question Raft 43

  44. Question Raft 44