Apache Zeppelin, the missing component for your BigData eco-system by DuyHai Doan

Riga Dev Day
March 13, 2016

Transcript

  1. @doanduyhai Who Am I?
    Duy Hai DOAN, Cassandra technical advocate
    •  talks, meetups, confs
    •  open-source devs (Achilles, …)
    •  OSS Cassandra point of contact
    ☞ [email protected]
    ☞ @doanduyhai
  2. Datastax
    •  Founded in April 2010
    •  We contribute a lot to Apache Cassandra™
    •  400+ customers (25 of the Fortune 100), 400+ employees
    •  Headquarters in the San Francisco Bay Area
    •  EU headquarters in London, offices in France and Germany
    •  Datastax Enterprise = OSS Cassandra + extra features
  3. Zeppelin Architecture
    [Diagram labels: Zeppelin Server, Zeppelin Engine, REST, WebSocket, Zeppelin Interpreter Factory, Spark Interpreter Group (Spark, SparkSQL), Tajo Interpreter, Flink Interpreter, Cassandra Interpreter; five boxes are marked JVM]
  4. What does Zeppelin provide?
    •  Front-end & display system for free
    •  Generic back-end with REST APIs & WebSocket
    •  Pluggable interpreter system
    •  Task scheduler (à la CRON)
  5. Zeppelin Display System
    •  Raw, Table, HTML, AngularJS with Scala
    •  Graphs
    •  Dynamic Forms
    •  View modes
    •  Export as Iframe
  6. Interpreter processing lifecycle
    ①  Receive input commands/data
        •  as raw text
        •  from form data
    ②  Process the input commands/data by the external back-end
    ③  Format the response using Zeppelin display system
    ④  Send response back to the Zeppelin engine
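The four lifecycle steps above can be sketched in plain Java. This is only an illustration under stated assumptions: `LifecycleSketch`, `Backend`, `receive`, `format` and `interpret` are hypothetical names, not Zeppelin types, and both the `${name}` placeholder substitution and the `%html` output prefix are assumed conventions for this sketch.

```java
import java.util.Map;

// Sketch only: none of these types come from Zeppelin.
class LifecycleSketch {

    /** Hypothetical stand-in for the external back-end (Spark, Cassandra, ...). */
    interface Backend {
        String execute(String command);
    }

    /** Step 1: receive the raw text and fill ${name} placeholders from form data. */
    static String receive(String rawText, Map<String, String> formData) {
        String command = rawText;
        for (Map.Entry<String, String> field : formData.entrySet()) {
            command = command.replace("${" + field.getKey() + "}", field.getValue());
        }
        return command.trim();
    }

    /** Step 3: wrap the back-end result as escaped HTML ("%html" prefix is assumed). */
    static String format(String result) {
        String escaped = result
                .replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;");
        return "%html <pre>" + escaped + "</pre>";
    }

    /** Steps 1 to 4 chained; the returned string is what goes back to the engine. */
    static String interpret(String rawText, Map<String, String> formData, Backend backend) {
        String command = receive(rawText, formData); // step 1
        String result = backend.execute(command);    // step 2
        return format(result);                       // steps 3 and 4
    }
}
```

A caller would pass the paragraph text, the form values and a back-end, for example `LifecycleSketch.interpret("echo ${name}", form, cmd -> "ran: " + cmd)`. Keeping the back-end behind a one-method interface is a choice of this sketch so that the flow can be exercised without a real cluster.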
  7. Core interpreters
    •  Spark (Spark core, SparkSQL/DataFrame, PySpark)
        •  Spark core = default (or %spark)
        •  SparkSQL = %sql
    •  Shell (%sh)
    •  Markdown (%md)
    •  AngularJS (%angular)
  8. Third-party interpreters
    •  Hive
    •  Phoenix
    •  Tajo
    •  Flink
    •  Ignite
    •  Lens
    •  Cassandra
    •  Geode
    •  PostgreSQL
    •  Kylin
    •  ElasticSearch
  9. Writing an interpreter
    •  How To
    •  Simple interpreter (AsciiDoc)
    •  Complex interpreter (Cassandra)
  10. Steps to write your own interpreter
    •  Create a class that extends the Interpreter base class
    •  Register it in a static block

        static {
            Interpreter.register("MyInterpreterName", MyClassName.class.getName());
        }

    •  Optionally define default config params

        static {
            Interpreter.register("MyInterpreterName", MyClassName.class.getName(),
                new InterpreterPropertyBuilder()
                    .add("property1", "default value", "Description of property1")
                    .build());
        }
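The static-block registration above relies on a general Java mechanism: a static initializer runs only when its class is initialized, for instance when the class is loaded by name. The following is a minimal, self-contained sketch of that mechanism. `RegistrationSketch`, `REGISTRY`, `EchoInterpreter` and `load` are hypothetical names standing in for the real Zeppelin classes, and the sketch assumes that interpreters are loaded by their fully qualified class name.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch only: a stand-in registry, not Zeppelin's Interpreter class.
class RegistrationSketch {

    /** Interpreter name mapped to the class name that registered it. */
    static final Map<String, String> REGISTRY = new ConcurrentHashMap<>();

    static void register(String name, String className) {
        REGISTRY.put(name, className);
    }

    /** Hypothetical interpreter that registers itself when it is initialized. */
    static class EchoInterpreter {
        static {
            register("echo", EchoInterpreter.class.getName());
        }
    }

    /** Loads a class by name; Class.forName initializes it and so runs its static block. */
    static void load(String className) {
        try {
            Class.forName(className);
        } catch (ClassNotFoundException e) {
            throw new IllegalArgumentException("Unknown interpreter class: " + className, e);
        }
    }
}
```

Until `load` is called with the class name, `REGISTRY` stays empty, which may explain why a fully qualified class name has to be listed in a configuration property in addition to the static block.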
  11. To register your interpreter as default
    •  Edit the enum ZeppelinConfiguration.ConfVars
    •  Add your interpreter FQCN in the property ZEPPELIN_INTERPRETERS
  12. To register your interpreter in config files
    •  Create conf/zeppelin-site.xml from conf/zeppelin-site.xml.template
    •  Add your interpreter FQCN in the property zeppelin.interpreters

        <property>
          <name>zeppelin.interpreters</name>
          <value>org.apache.zeppelin.spark.SparkInterpreter,org.apache.zeppelin.spark.PySparkInterpreter,
            org.apache.zeppelin.spark.SparkSqlInterpreter,org.apache.zeppelin.spark.DepInterpreter,
            org.apache.zeppelin.markdown.Markdown,org.apache.zeppelin.shell.ShellInterpreter,
            org.apache.zeppelin.hive.HiveInterpreter,com.me.MyNewInterpreter
          </value>
        </property>
  13. Simple AsciiDoc Interpreter
    [Diagram labels: Zeppelin Server (JVM) containing the Zeppelin Engine; AsciiDoc Interpreter (JVM); flow steps ① to ④ labelled Raw Text Block, Raw Text Block, Converted To HTML, HTML Output]
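The raw-text-to-HTML conversion at the heart of such a simple interpreter can be illustrated with a toy converter. This is a hedged sketch, not the interpreter from the talk: `AsciiDocSketch` and `toHtml` are hypothetical names, and only two AsciiDoc constructs are handled (`= Title` and `== Section`), with every other non-empty line emitted as a paragraph. A real interpreter would presumably delegate to a full AsciiDoc library.

```java
// Sketch only: handles a tiny subset of AsciiDoc, for illustration.
class AsciiDocSketch {

    /** Escapes the characters that are significant in HTML text content. */
    static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Converts a raw text block to HTML, one line at a time. */
    static String toHtml(String rawText) {
        StringBuilder html = new StringBuilder();
        for (String line : rawText.split("\n")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // blank lines only separate blocks
            }
            if (trimmed.startsWith("== ")) {
                // "== Section" is a first-level section heading
                html.append("<h2>").append(escape(trimmed.substring(3))).append("</h2>");
            } else if (trimmed.startsWith("= ")) {
                // "= Title" is the document title
                html.append("<h1>").append(escape(trimmed.substring(2))).append("</h1>");
            } else {
                html.append("<p>").append(escape(trimmed)).append("</p>");
            }
        }
        return html.toString();
    }
}
```

The `==` check comes before the `=` check on purpose; both prefixes are tested with their trailing space, so the order is not strictly required here, but it keeps the more specific rule first if further heading levels are added.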
  14. Cassandra Interpreter Architecture
    [Diagram labels: Zeppelin Server (JVM); Cassandra Interpreter (JVM); Cassandra Java Driver; Cassandra; flow steps ① to ⑥ with labels Raw Text Block, Raw Text Block, Async CQL statements, Render HTML, Display Results as HTML]
  15. Cassandra Interpreter Commands
    •  Native CQL statements: SELECT * FROM …; INSERT INTO …; …
    •  Schema commands: DESCRIBE TABLE …; DESCRIBE KEYSPACE …; …
    •  Prepared statements commands: @prepare …; @bind …; @remove_prepared …;
    •  Help command: HELP;
    •  Options commands: @consistency …; @retryPolicy …; @fetchSize …;
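To make the command families above concrete, here is a small sketch that splits a paragraph into statements and sorts each one into a family. It is an assumption-laden illustration, not the interpreter's actual parser: `CommandSketch`, `Kind`, `split` and `classify` are hypothetical names, and the split on `;` is naive (it would break on semicolons inside string literals).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Sketch only: a naive classifier for the command families listed on the slide.
class CommandSketch {

    enum Kind { NATIVE_CQL, SCHEMA, PREPARED, HELP, OPTION }

    /** Splits a paragraph on ';' and drops empty fragments. */
    static List<String> split(String paragraph) {
        List<String> statements = new ArrayList<>();
        for (String part : paragraph.split(";")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                statements.add(trimmed);
            }
        }
        return statements;
    }

    /** Classifies one statement (without its trailing ';') by its leading keyword. */
    static Kind classify(String statement) {
        String s = statement.trim();
        if (s.startsWith("@prepare") || s.startsWith("@bind") || s.startsWith("@remove_prepared")) {
            return Kind.PREPARED;
        }
        if (s.startsWith("@consistency") || s.startsWith("@retryPolicy") || s.startsWith("@fetchSize")) {
            return Kind.OPTION;
        }
        String upper = s.toUpperCase(Locale.ROOT);
        if (upper.startsWith("DESCRIBE ")) {
            return Kind.SCHEMA;
        }
        if (upper.equals("HELP")) {
            return Kind.HELP;
        }
        // Everything else is passed through as a native CQL statement.
        return Kind.NATIVE_CQL;
    }
}
```

Checking the `@` prefixes with `startsWith` rather than comparing whole words keeps the sketch independent of whatever argument syntax follows each command, which the slide elides.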
  16. Usability
    •  UX improvement
    •  Better table data support
    •  Export data as CSV, etc. (PR #725, PR #714, PR #6, PR #89)
    •  Table pagination …