Apache Zeppelin, the missing component for your...

Riga Dev Day
March 13, 2016

Apache Zeppelin, the missing component for your BigData eco-system by DuyHai Doan

Transcript

  1. @doanduyhai Who am I? Duy Hai DOAN, Cassandra technical advocate

    •  talks, meetups, confs
    •  open-source devs (Achilles, …)
    •  OSS Cassandra point of contact
    ☞ [email protected] ☞ @doanduyhai
  2. Datastax

    •  Founded in April 2010
    •  We contribute a lot to Apache Cassandra™
    •  400+ customers (25 of the Fortune 100), 400+ employees
    •  Headquarters in the San Francisco Bay Area
    •  EU headquarters in London, offices in France and Germany
    •  Datastax Enterprise = OSS Cassandra + extra features
  3. Zeppelin Architecture

    [Diagram: Zeppelin Server (REST, WebSocket), Zeppelin Engine, Zeppelin Interpreter Factory; Spark Interpreter Group (Spark, SparkSQL), Tajo Interpreter, Flink Interpreter, Cassandra Interpreter; five boxes labeled JVM]
  4. What does Zeppelin provide?

    •  Front-end & display system for free
    •  Generic back-end with REST APIs & WebSocket
    •  Pluggable interpreter system
    •  Task scheduler (à la CRON)
  5. Zeppelin Display System

    •  Raw, Table, HTML, AngularJS with Scala
    •  Graphs
    •  Dynamic Forms
    •  View modes
    •  Export as Iframe
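As a hedged illustration of the Table display listed above: Zeppelin renders paragraph output that starts with the `%table` directive as a table, with columns separated by tabs and rows by newlines, the first row being the header. The helper below is only a sketch; `TableDisplaySketch` and `toTable` are hypothetical names, not part of Zeppelin.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Zeppelin: builds text for the table display.
class TableDisplaySketch {

    // First row is the header; columns are joined with tabs, rows with newlines.
    static String toTable(List<List<String>> rows) {
        StringBuilder out = new StringBuilder("%table ");
        for (int i = 0; i < rows.size(); i++) {
            if (i > 0) {
                out.append('\n');
            }
            out.append(String.join("\t", rows.get(i)));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<List<String>> rows = Arrays.asList(
                Arrays.asList("name", "size"),
                Arrays.asList("sun", "100"));
        System.out.println(toTable(rows));
    }
}
```

Printing such a string from a paragraph is what lets the front-end switch between the table and graph views of the same result.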
  6. Interpreter processing lifecycle

    ①  Receive input commands/data
        •  as raw text
        •  from form data
    ②  Process the input commands/data with the external back-end
    ③  Format the response using the Zeppelin display system
    ④  Send the response back to the Zeppelin engine
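The four steps above can be sketched in a few lines of plain Java. This is an illustration under assumptions, not Zeppelin code: `LifecycleSketch`, `process` and `escape` are hypothetical names, and the back-end is reduced to a function from text to text. The only Zeppelin convention used is that output starting with `%html` is rendered as HTML.

```java
import java.util.function.Function;

// Hypothetical stand-in for the four steps; none of these names are Zeppelin API.
class LifecycleSketch {

    static String process(String rawInput, Function<String, String> backEnd) {
        // 1. Receive the input as raw text.
        String command = rawInput.trim();
        // 2. Let the external back-end process it.
        String result = backEnd.apply(command);
        // 3. Format the response for the display system.
        String formatted = "%html <pre>" + escape(result) + "</pre>";
        // 4. Hand the response back to the caller (the engine, in Zeppelin).
        return formatted;
    }

    // Minimal HTML escaping so back-end output cannot break the markup.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    public static void main(String[] args) {
        System.out.println(process("  hello  ", String::toUpperCase));
    }
}
```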
  7. Core interpreters

    •  Spark (Spark core, SparkSQL/DataFrame, PySpark)
        •  Spark core = default (or %spark)
        •  SparkSQL = %sql
    •  Shell (%sh)
    •  Markdown (%md)
    •  AngularJS (%angular)
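To make the prefix convention concrete, here is a minimal sketch of how a paragraph could be split into an interpreter binding and a body. The fallback to `spark` mirrors the slide ("Spark core = default"); the parsing itself is an assumption, and `ParagraphPrefixSketch` and `split` are hypothetical names rather than Zeppelin's actual implementation.

```java
// Hypothetical sketch: split a paragraph into interpreter binding and body.
class ParagraphPrefixSketch {

    // Returns {binding, body}; paragraphs without a leading %name use "spark".
    static String[] split(String paragraph) {
        String text = paragraph.trim();
        if (!text.startsWith("%")) {
            return new String[] {"spark", text};
        }
        // The binding name runs from after '%' to the first whitespace.
        int end = 1;
        while (end < text.length() && !Character.isWhitespace(text.charAt(end))) {
            end++;
        }
        return new String[] {text.substring(1, end), text.substring(end).trim()};
    }

    public static void main(String[] args) {
        String[] parts = split("%sql SELECT * FROM users");
        System.out.println(parts[0] + " -> " + parts[1]);
    }
}
```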
  8. Third-party interpreters

    •  Hive
    •  Phoenix
    •  Tajo
    •  Flink
    •  Ignite
    •  Lens
    •  Cassandra
    •  Geode
    •  PostgreSQL
    •  Kylin
    •  ElasticSearch
  9. Writing an interpreter

    •  How To
    •  Simple interpreter (AsciiDoc)
    •  Complex interpreter (Cassandra)
  10. Steps to write your own interpreter

    •  Create a class that extends the Interpreter base class
    •  Register it in a static block
    •  Optionally define default config params

        static {
          Interpreter.register("MyInterpreterName", MyClassName.class.getName());
        }

        static {
          Interpreter.register("MyInterpreterName", MyClassName.class.getName(),
              new InterpreterPropertyBuilder()
                  .add("property1", "default value", "Description of property1")
                  .build());
        }
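The registration pattern can be tried without a Zeppelin dependency using stand-ins. In the sketch below, `SketchInterpreter` is a hypothetical, much-reduced substitute for Zeppelin's `Interpreter` base class (the real class declares more methods), and its `register` method only imitates the two-argument call shown on the slide. The point it illustrates is that a static block runs once, when the class is first initialized.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for Zeppelin's Interpreter base class; the real class has more methods.
abstract class SketchInterpreter {
    // Stand-in for the registry behind Interpreter.register(name, className).
    static final Map<String, String> REGISTERED = new HashMap<>();

    static void register(String name, String className) {
        REGISTERED.put(name, className);
    }

    abstract String interpret(String input);
}

// Step 1: extend the base class.
class EchoInterpreter extends SketchInterpreter {
    // Step 2: register in a static block, run once at class initialization.
    static {
        SketchInterpreter.register("echo", EchoInterpreter.class.getName());
    }

    @Override
    String interpret(String input) {
        return "echo: " + input;
    }
}

class RegistrationSketch {
    public static void main(String[] args) {
        // Creating an instance initializes the class, so the static block has run.
        SketchInterpreter interpreter = new EchoInterpreter();
        System.out.println(SketchInterpreter.REGISTERED.get("echo"));
        System.out.println(interpreter.interpret("hi"));
    }
}
```

Because registration only happens at class initialization, something still has to load the class, which is why the interpreter's class name also has to be listed in the configuration.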
  11. To register your interpreter as default

    •  Edit the enum ZeppelinConfiguration.ConfVars
    •  Add your interpreter FQCN in the property ZEPPELIN_INTERPRETERS
  12. To register your interpreter in config files

    •  Create conf/zeppelin-site.xml from conf/zeppelin-site.xml.template
    •  Add your interpreter FQCN in the property zeppelin.interpreters

        <property>
          <name>zeppelin.interpreters</name>
          <value>org.apache.zeppelin.spark.SparkInterpreter,org.apache.zeppelin.spark.PySparkInterpreter,org.apache.zeppelin.spark.SparkSqlInterpreter,org.apache.zeppelin.spark.DepInterpreter,org.apache.zeppelin.markdown.Markdown,org.apache.zeppelin.shell.ShellInterpreter,org.apache.zeppelin.hive.HiveInterpreter,com.me.MyNewInterpreter</value>
        </property>
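The property value is a comma-separated list of fully qualified class names, so adding an interpreter amounts to appending one entry. The sketch below shows that manipulation in isolation; `InterpreterListSketch`, `parse` and `append` are hypothetical helpers, and trimming whitespace around entries is an assumption about how such a list is best handled, not a statement about how Zeppelin reads it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: reads and extends a comma-separated list of class names.
class InterpreterListSketch {

    // Splits the property value on commas and drops surrounding whitespace.
    static List<String> parse(String value) {
        List<String> names = new ArrayList<>();
        for (String part : value.split(",")) {
            String name = part.trim();
            if (!name.isEmpty()) {
                names.add(name);
            }
        }
        return names;
    }

    // Appends a class name unless it is already listed.
    static String append(String value, String fqcn) {
        List<String> names = parse(value);
        if (!names.contains(fqcn)) {
            names.add(fqcn);
        }
        return String.join(",", names);
    }

    public static void main(String[] args) {
        String value = "org.apache.zeppelin.markdown.Markdown, "
                + "org.apache.zeppelin.shell.ShellInterpreter";
        System.out.println(append(value, "com.me.MyNewInterpreter"));
    }
}
```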
  13. Simple AsciiDoc Interpreter

    [Diagram: Zeppelin Server, Zeppelin Engine and AsciiDoc Interpreter, with two boxes labeled JVM; steps ① to ④; labels: Raw Text Block, Raw Text Block, Converted To HTML, HTML Output]
  14. Cassandra Interpreter Architecture

    [Diagram: Zeppelin Server (JVM), Cassandra Interpreter (JVM), Cassandra Java Driver, Cassandra; steps ① to ⑥; labels: Raw Text Block, Raw Text Block, Async CQL statements, Render HTML, Display Results as HTML]
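The "Async CQL statements" and "Render HTML" steps of the diagram can be approximated as follows. This is a sketch under stated assumptions: `fakeExecute` is a hypothetical stand-in for an asynchronous driver call and does not use the Cassandra Java Driver, and the HTML layout is invented for illustration. It only shows the general shape: submit every statement first, then wait for the results and render them in order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical sketch; fakeExecute stands in for an asynchronous driver call.
class AsyncRenderSketch {

    static CompletableFuture<String> fakeExecute(String statement) {
        return CompletableFuture.supplyAsync(() -> "ok: " + statement);
    }

    // Submits all statements first, then waits and renders results in order.
    static String run(List<String> statements) {
        List<CompletableFuture<String>> pending = new ArrayList<>();
        for (String statement : statements) {
            pending.add(fakeExecute(statement));
        }
        StringBuilder html = new StringBuilder("%html <ul>");
        for (CompletableFuture<String> future : pending) {
            // join() blocks until this statement's result is available.
            html.append("<li>").append(future.join()).append("</li>");
        }
        return html.append("</ul>").toString();
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList("SELECT 1", "SELECT 2")));
    }
}
```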
  15. Cassandra Interpreter Commands

    •  Native CQL statements: SELECT * FROM …; INSERT INTO …; …
    •  Schema commands: DESCRIBE TABLE …; DESCRIBE KEYSPACE …; …
    •  Prepared statements commands: @prepare …; @bind …; @remove_prepared …;
    •  Help command: HELP;
    •  Options commands: @consistency …; @retryPolicy …; @fetchSize …;
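One way to picture the command families above is a small classifier that looks at how each statement begins. The sketch below is hypothetical (`CommandKindSketch` and its `Kind` enum are not taken from the Cassandra interpreter), and treating every other `@` command as an option is an assumption made for brevity.

```java
import java.util.Locale;

// Hypothetical classifier for the command families listed on the slide.
class CommandKindSketch {

    enum Kind { OPTION, PREPARED, SCHEMA, HELP, CQL }

    static Kind classify(String statement) {
        String s = statement.trim();
        String upper = s.toUpperCase(Locale.ROOT);
        if (s.startsWith("@prepare") || s.startsWith("@bind")
                || s.startsWith("@remove_prepared")) {
            return Kind.PREPARED;
        }
        if (s.startsWith("@")) {
            // Assumption: any remaining @command is an option.
            return Kind.OPTION;
        }
        if (upper.startsWith("DESCRIBE ")) {
            return Kind.SCHEMA;
        }
        if (upper.equals("HELP;") || upper.equals("HELP")) {
            return Kind.HELP;
        }
        // Everything else is passed through as native CQL.
        return Kind.CQL;
    }

    public static void main(String[] args) {
        System.out.println(classify("HELP;"));
        System.out.println(classify("@fetchSize 100;"));
    }
}
```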
  16. Usability

    •  UX improvement
    •  Better table data support
    •  Export data as CSV etc. (PR #725, PR #714, PR #6, PR #89)
    •  Table pagination …