
RethinkDB Java Driver

Josh Kuhn
September 24, 2015


A dive into the details of the Java driver, RethinkDB's first official driver for a statically typed language



Transcript

  1. The Official RethinkDB Java Driver, or how I learned to stop worrying and love auto-generated code
  2. What it is
    • An official driver for Java
    • The 4th official driver from RethinkDB
    • The current official drivers are for Ruby, Python and JavaScript
    • The first official driver in a statically typed language
  3. Official Motivation
    • There are two community drivers for Java
    • One by dkhenry, and one by npiv
    • Both are pretty awesome, but both are built for the protobuf wire format
    • Neither is being actively maintained :(
  4. Official Motivation
    • All traces of protobufs are being removed in RethinkDB 2.2
    • This is great news for performance, and simplifies the server code
    • But it means neither community Java driver will work anymore
  5. Official Motivation
    • In addition, both community drivers were incomplete, implementing only some terms
    • So we decided to build our own
    • The official driver is heavily based on npiv's driver, since it was written with Java 8 lambdas in mind
  6. What it looks like
    The bad:
    Python:
        r.table('foo').filter({'user': {'state': 'AK'}})
    Java:
        r.table("foo").filter(r.hashmap("user", r.hashmap("state", "AK")))
  7. What a driver is
    • A library in your native language that handles connecting to RethinkDB and serializing queries
    • Drivers are tightly integrated with the language itself
    • r.table('users').map(lambda user: user['age'] / 2 + 7)
  8. How a driver works
    • We create a class for each possible ReQL term
    • This is from the Python driver
  9. How a driver works
    • Then we create methods on the term superclass that allow you to create instances of those ReQL terms
    • This is where the chaining comes from
  10. How a driver works
    • This is a lot of boilerplate
    • Java also requires one public class per file, so there's a lot of stuff to write
  11. How a driver works
    • This is the Java implementation of a few of the methods
    • This is similar to the Python driver, but a bit noisier
  12. How a driver works
    • Ultimately, what's built looks like a tree structure
    • So r.expr(1) + r.expr(2) * 3 turns into something like:
        Add(Datum(1), Mul(Datum(2), Datum(3)))
    • This is close to how it gets serialized to JSON
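To make the tree idea concrete, here is a minimal sketch in Java. The class names `Term`, `Datum`, `Add` and `Mul` and the helper `expr` are hypothetical and simplified; they are not the driver's actual classes. Since Java has no operator overloading, the expression is written with chained `add` and `mul` methods on the term superclass.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical, simplified term classes; the real driver's classes differ.
class TermTreeSketch {

    // Every term is a tree node that holds its argument terms.
    static abstract class Term {
        final List<Term> args;

        Term(Term... args) { this.args = Arrays.asList(args); }

        // Chaining methods live on the superclass and wrap `this` in a new node.
        Term add(Object other) { return new Add(this, expr(other)); }
        Term mul(Object other) { return new Mul(this, expr(other)); }

        @Override public String toString() {
            return getClass().getSimpleName() + args.stream()
                .map(Term::toString)
                .collect(Collectors.joining(", ", "(", ")"));
        }
    }

    // Leaf node wrapping a plain Java value.
    static class Datum extends Term {
        final Object value;
        Datum(Object value) { super(); this.value = value; }
        @Override public String toString() { return "Datum(" + value + ")"; }
    }

    static class Add extends Term { Add(Term left, Term right) { super(left, right); } }
    static class Mul extends Term { Mul(Term left, Term right) { super(left, right); } }

    // Wraps plain values in a Datum and leaves existing terms untouched.
    static Term expr(Object o) { return (o instanceof Term) ? (Term) o : new Datum(o); }

    public static void main(String[] args) {
        Term tree = expr(1).add(expr(2).mul(3));
        System.out.println(tree); // Add(Datum(1), Mul(Datum(2), Datum(3)))
    }
}
```

The printed form matches the tree on the slide; a serializer would walk the same structure.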
  13. Challenges
    • ReQL wasn't designed with statically typed languages in mind
    • ReQL uses lambdas/anonymous functions
    • New driver increases maintenance by 33.33%!
    • Some cosmetic challenges because Java
  14. Maintainability
    • Drivers are already lots of boilerplate, but mix in a little Java and the boiler is on maximum heat
    • The solution for us was to generate Java code from a JSON specification of ReQL, using Python
  15. Code generation
    • Not a new thing in Java, people do it all the time
    • We happen to have more Python devs than Java devs at RethinkDB, so it made sense for us to use Python tools
    • The actual code to do the generation uses Mako templates and outputs all of the classes and methods we need in one shot
  16. The JSON specification
    • Here's an example of a term, count, from the master JSON file
    • This says count should show up on ReQL expressions, not the top-level (i.e. there's no r.count())
    • It has 3 signatures:
        stream.count()
        stream.count("foo")
        stream.count(x -> x.gt(4))
    • Each term also has an id
        "COUNT": {
          "include_in": ["T_EXPR"],
          "signatures": [
            ["T_EXPR"],
            ["T_EXPR", "T_EXPR"],
            ["T_EXPR", "T_FUNC1"]
          ],
          "id": 43
        },
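As a rough illustration of how signatures like these could be turned into Java overloads, here is a small sketch. It is not the actual generator, which is written in Python with Mako templates. The type mapping (`T_EXPR` to `Object`, `T_FUNC1` to `ReQLFunction1`), the parameter names, and the reading of the first entry in each signature as the receiver are assumptions drawn from the count example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of signature expansion; not the real generator.
class SignatureGenSketch {

    // Assumed mapping from spec type names to Java parameter types.
    static String javaType(String specType) {
        switch (specType) {
            case "T_EXPR":  return "Object";
            case "T_FUNC1": return "ReQLFunction1";
            default: throw new IllegalArgumentException("unmapped spec type: " + specType);
        }
    }

    // Produces one Java method header per signature in the spec.
    static List<String> overloads(String method, List<List<String>> signatures) {
        List<String> out = new ArrayList<>();
        for (List<String> sig : signatures) {
            List<String> params = new ArrayList<>();
            // Index 0 is read as the receiver (the expression the method is called on), so skip it.
            for (int i = 1; i < sig.size(); i++) {
                params.add(javaType(sig.get(i)) + " arg" + i);
            }
            out.add("public ReQLExpr " + method + "(" + String.join(", ", params) + ")");
        }
        return out;
    }

    public static void main(String[] args) {
        List<List<String>> countSignatures = Arrays.asList(
            Arrays.asList("T_EXPR"),
            Arrays.asList("T_EXPR", "T_EXPR"),
            Arrays.asList("T_EXPR", "T_FUNC1"));
        // Prints three headers: count(), count(Object arg1), count(ReQLFunction1 arg1)
        overloads("count", countSignatures).forEach(System.out::println);
    }
}
```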
  17. Lambdas in Java 8
    • A big reason it's much more feasible to create a Java driver is the addition of lambdas in Java 8
    • Before that, the only way we could have implemented ReQL would have been with anonymous inner classes
  18. Lambdas in Java 8
    • Java 7:
        r.range(0, 100).reduce(new ReQLFunction2() {
            public ReQLExpr apply(Var x, Var y) {
                return x.add(y);
            }
        })
  19. Lambdas in Java 8
    • Roughly, a lambda creates an anonymous class instance for any interface with exactly one abstract method
    • This is backwards compatible with existing interfaces, so it fits into Java really well
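The rough equivalence can be seen side by side in a small sketch. `BinaryFn` and `reduce` are hypothetical stand-ins that work on plain ints, not the driver's `ReQLFunction2` or its `reduce` term.

```java
// Hypothetical example: the same single-method interface implemented both ways.
class LambdaEquivalenceSketch {

    // Exactly one abstract method, so a lambda can implement it.
    interface BinaryFn {
        int apply(int x, int y);
    }

    // Folds the array from left to right using f.
    static int reduce(int[] xs, BinaryFn f) {
        int acc = xs[0];
        for (int i = 1; i < xs.length; i++) {
            acc = f.apply(acc, xs[i]);
        }
        return acc;
    }

    public static void main(String[] args) {
        int[] xs = {1, 2, 3, 4};

        // Java 7 style: anonymous inner class.
        int viaAnonymousClass = reduce(xs, new BinaryFn() {
            @Override public int apply(int x, int y) { return x + y; }
        });

        // Java 8 style: the lambda is checked against the same interface.
        int viaLambda = reduce(xs, (x, y) -> x + y);

        System.out.println(viaAnonymousClass + " " + viaLambda); // 10 10
    }
}
```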
  20. Lambdas in Java 8
    • Lambdas are definitely a big improvement, but they have 2 drawbacks
  21. Lambda problem #1
    • They don't have a nice syntax for application. You can't do something like:
        Function<Integer, Integer> pred = x -> x + 1;
        pred(3); // not allowed!
    • You have to call the method on the object like:
        pred.apply(3);
  22. Lambda problem #1
    • This isn't a big deal for the drivers since the user never applies the lambdas, the driver does.
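One way to picture "the driver applies the lambda" is the sketch below: the driver calls `apply` once with a placeholder variable and keeps the expression the lambda builds. The `Expr` class, the `var_1` placeholder name and the printed format are all hypothetical and only stand in for whatever the driver does internally.

```java
import java.util.function.Function;

// Hypothetical sketch; names and output format are illustrative only.
class ApplyLambdaSketch {

    // Stand-in for a ReQL expression: it only records what was called on it.
    static class Expr {
        final String text;
        Expr(String text) { this.text = text; }
        Expr gt(int n) { return new Expr("gt(" + text + ", " + n + ")"); }
    }

    // The driver, not the user, applies the lambda: it passes in a placeholder
    // variable and keeps whatever expression the lambda builds from it.
    static String describeFunction(Function<Expr, Expr> userLambda) {
        Expr placeholder = new Expr("var_1");
        Expr body = userLambda.apply(placeholder);
        return "func(var_1) -> " + body.text;
    }

    public static void main(String[] args) {
        // The user only writes the lambda and never calls apply.
        System.out.println(describeFunction(x -> x.gt(4))); // func(var_1) -> gt(var_1, 4)
    }
}
```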
  23. Lambda problem #2
    • A bigger issue for us is that lambdas rely on the type system to be so concise
    • Consider a method like:
        public ReQLExpr map(Object... args) {…}
    • This takes any number of arguments of any type
  24. Lambda problem #2
    • The problem comes when we try to give it a lambda as an argument:
        r.table("foo").map(x -> x)
    • How can Java know what interface to implement?
    • It can't. When we implement the methods, we have to know where lambdas can be used, and how many arguments they take.
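The target-typing problem can be reproduced without the driver. In the sketch below, `countArgs` and `applyTo3` are hypothetical helpers; the commented-out call is the one the compiler rejects, because `Object` is not a functional interface and so gives the lambda nothing to implement.

```java
import java.util.function.Function;

// Hypothetical helpers showing where a lambda gets its type from.
class TargetTypingSketch {

    // Accepts anything, so it offers no functional interface for a lambda to target.
    static int countArgs(Object... args) { return args.length; }

    // The parameter type tells the compiler which interface the lambda implements.
    static int applyTo3(Function<Integer, Integer> f) { return f.apply(3); }

    public static void main(String[] args) {
        // countArgs(x -> x);  // does not compile: Object is not a functional interface

        // An explicit cast supplies the missing target type, at the cost of conciseness.
        int n = countArgs((Function<Integer, Integer>) x -> x);

        // A typed parameter supplies it implicitly.
        int r = applyTo3(x -> x + 1);

        System.out.println(n + " " + r); // 1 4
    }
}
```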
  25. Expanding method signatures
    • Where we can, we put in the full signature.
    • For instance, count has exactly three signatures:
        public ReQLExpr count() {…}
        public ReQLExpr count(Object expr) {…}
        public ReQLExpr count(ReQLFunction1 func) {…}
  26. Expanding method signatures
    • But some other terms are a little less amenable.
    • map can take a variable number of arguments, and the number of arguments it takes affects the type of its lambda argument
  27. Expanding method signatures
    • Use it like:
        r.range().map(x -> x)
        r.range().map(r.range(), (x, y) -> x.add(y))
        r.range().map(r.range(), r.range(), (x, y, z) -> x.add(y).add(z))
    • Signatures:
        public ReQLExpr map(ReQLFunction1 f)
        public ReQLExpr map(Object e1, ReQLFunction2 f)
        public ReQLExpr map(Object e1, Object e2, ReQLFunction3 f)
  28. Expanding method signatures
    • For terms like this, we just expand a finite number of signatures out, even though technically ReQL doesn't have a limit
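A stripped-down sketch of the expanded overloads follows. `Fn1`, `Fn2` and `Fn3` are hypothetical stand-ins for `ReQLFunction1` to `ReQLFunction3`, and strings stand in for ReQL expressions. Because each overload takes a different number of arguments, the compiler can pick the right interface for the lambda from the call's shape alone.

```java
// Hypothetical sketch: one overload per supported lambda arity.
class MapOverloadsSketch {

    // Each interface has exactly one method, so a lambda with the matching
    // number of parameters can implement it.
    interface Fn1 { String apply(String x); }
    interface Fn2 { String apply(String x, String y); }
    interface Fn3 { String apply(String x, String y, String z); }

    // The extra sequence arguments come first; the lambda is always last.
    static String map(Fn1 f) { return f.apply("v1"); }
    static String map(Object seq2, Fn2 f) { return f.apply("v1", "v2"); }
    static String map(Object seq2, Object seq3, Fn3 f) { return f.apply("v1", "v2", "v3"); }

    public static void main(String[] args) {
        System.out.println(map(x -> x));                                        // v1
        System.out.println(map("s2", (x, y) -> x + "+" + y));                   // v1+v2
        System.out.println(map("s2", "s3", (x, y, z) -> x + "+" + y + "+" + z)); // v1+v2+v3
    }
}
```

A fourth sequence would need a fourth overload and a four-argument interface, which is why only a finite number can be written out.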
  29. Java nits
    • No Map or List literals.
    • Something shared by Python, Ruby and JavaScript is very concise literals for these data structures.
    • ReQL sort of expects these to be simple to express; things like pathspecs are convenient when this assumption is true
    • r.table('foo').filter({nested: {field: "value"}})
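A helper in the spirit of the `r.hashmap(...)` call from the earlier Java example could look roughly like the sketch below. The `ChainMap` class and its `with` method are hypothetical, and the `city` key is made-up illustrative data.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of a chainable map helper that stands in for map literals.
class MapLiteralSketch {

    // A map whose `with` returns itself, so several keys fit in one expression.
    static class ChainMap extends LinkedHashMap<String, Object> {
        ChainMap with(String key, Object value) {
            put(key, value);
            return this;
        }
    }

    // Starts a map with its first key/value pair.
    static ChainMap hashmap(String key, Object value) {
        return new ChainMap().with(key, value);
    }

    public static void main(String[] args) {
        // Roughly the Python literal {'user': {'state': 'AK', 'city': 'Juneau'}}
        Map<String, Object> filter =
            hashmap("user", hashmap("state", "AK").with("city", "Juneau"));
        System.out.println(filter); // {user={state=AK, city=Juneau}}
    }
}
```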
  30. Java nits
    • No operator overloading.
    • It's true, the most popular driver (JavaScript) doesn't have operator overloading either, but:
        Python: some_object["field"]["subfield"]
        JavaScript: someObject("field")("subfield")
        Java: someObject.getField("field").getField("subfield")
  31. Current status
    • Currently, I'm working on converting the ReQL language tests to exercise the Java driver
    • Right now the driver is in "unofficial alpha", available on the branch josh/java-driver
    • If you want to go through the trouble of checking it out and compiling it, you can try it out now
    • For everyone else, we should have a packaged beta version soon
  32. Future work
    • Need to add backtraces, profiles, pretty-printing queries
    • The JSON library should be swappable to work with more frameworks
    • We want to add asynchronous network IO like the other drivers (probably using Netty)