Custom Query Languages: Why? How?

J on the Beach 2017 (https://jonthebeach.com/)

Even the best, biggest, beachiest data out there is useless if users can't easily search and analyze it. Under the right circumstances, a custom query language can be a powerful interface to that data, but only if that interface is chosen and developed consciously, with top priority given to creating a fitting domain abstraction, a first-class user experience, and a simple yet flexible implementation that doesn't reinvent the wheel.

These are takeaways from the real-world experiences of ÜberResearch and Valo: two different companies with very different needs, which nevertheless ended up taking similar approaches to the selection and creation of query languages as data interfaces. From the lessons they've learned -- some more painfully than others -- we'll construct a roadmap for choosing, designing, and implementing a custom query language that lets your users interact with your big, beautiful data in all its glory.

Anjana Sofia Vakil

May 18, 2017

Transcript

  1. Custom Query Languages: Why? How? @AnjanaVakil - J on the Beach 2017
  2. HOLA! I’m @AnjanaVakil, Software Engineer/DSL-builder, ÜberResearch, with input from Tobias Johansson, Technical Lead, Valo
  3. DIMENSIONS (uberresearch.com/dimensions): Scientific research funding data for funders, researchers, and research organizations.
  4. VALO (valo.io): Real-time and historical analytics on data streams. Plus beachy conferences.
  5. WHAT is a Custom Query Language (CQL) ?

  6. CUSTOM QUERY LANGUAGE (CQL) domain-specific

  7. CUSTOM QUERY LANGUAGE (CQL) data retrieval & analysis

  8. CUSTOM QUERY LANGUAGE (CQL) text-based interface

  9. “a common text that acts as both executable software and a description that domain experts can read” (Martin Fowler, Domain Specific Languages)
  10. DIMENSIONS query language: search grants for "malaria" where funder.acronym="NIH" return categories, years, researchers:[surname] aggregate funding sort by funding
  11. VALO query language: from /streams/sensors/air group by sampleTime window of 1 minute select sampleTime, sensor, avg(pollution)
  12. WHY would you want your own CQL ?

  13. MODELING

  14. VALO modeling time: from historical /streams/tenant/collection/name group by timestamp window of 3 seconds every 1 second select timestamp, count(), sum(value) order by timestamp
  15. DECOUPLING: Display from Model, Model from Storage

  16. DIMENSIONS decoupling SOLR

  17. DIMENSIONS decoupling SOLR

  18. DIMENSIONS decoupling SOLR </> Elastic ?

  19. “Trying to describe a domain using a DSL is useful even if the DSL is never implemented. It can be beneficial just as a platform for communication.” (Martin Fowler, Domain Specific Languages)
  20. HOW should you design a CQL ?

  21. WHAT MAKES A GREAT USER INTERFACE? 1. Clarity 2. Concision 3. Familiarity 4. Responsiveness 5. Consistency 6. Aesthetics 7. Efficiency 8. Forgiveness https://www.smashingmagazine.com/user-interface-design-in-modern-web-applications
  22. 2. Concision 3. Familiarity 6. Aesthetics 7. Efficiency: the user can easily express what they want
  23. VALO historical data: from historical /streams/sensors/air group by sampleTime window of 1 minute select sampleTime, sensor, avg(pollution)
  24. 1. Clarity 3. Familiarity 4. Responsiveness 5. Consistency 8. Forgiveness: the user doesn’t often need the docs
  25. DIMENSIONS error messages: Unknown source 'articles'. Did you mean one of these sources? applications, clinical_trials, grants, patents, publications
  26. No matter how good your design, it will change

  27. Expect & embrace change: Get user feedback ASAP (as usual). Design for flexibility.
  28. HOW can you implement a CQL ?

  29. PARSING YOUR CQL (Terence Parr, The Definitive ANTLR 4 Reference)

  30. PARSING YOUR CQL. Use a parser generator: prototype quickly, giving up control over some details of lexing/parsing (e.g. ANTLR, parboiled2). Roll your own: retain full control over lexing/parsing and internal syntactic representation.
  31. PARSING YOUR CQL. Use a parser generator: prototype quickly, giving up control over some details of lexing/parsing (e.g. ANTLR, parboiled2). Roll your own: retain full control over lexing/parsing and internal syntactic representation.
  32. ANTLR (v4), antlr.org: Java-based parser generator with targets for various other languages (e.g. Python, JS, …)
  33. THE ANTLR4 FLOW: grammar, lexer, parser, AST, listener

  34. ANTLR4 GRAMMAR: grammar Query; query: target results EOF; target: 'search' name filter?; results: ('return' name)+; name: [a-z]+;
  35. ANTLR4 LISTENER: # Generated by ANTLR 4.5 from antlr4 import * class QueryListener: def enterTarget(self, ctx): pass def exitTarget(self, ctx): pass
  36. Let’s RECAP !

  37. CUSTOM QUERY LANGUAGES. WHY bother: An executable model of your domain/logic. Decoupling display, model, and storage. HOW to design: Language as interface. Expect & embrace change. HOW to implement: Parser generators can speed up prototyping, at the cost of some control.
  38. GRACIAS! I’m @AnjanaVakil anjana@uberresearch.com Huge thanks to Tobias Johansson & Valo! Presentation template by SlidesCarnival
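
The "roll your own" option from the parsing slides, combined with the forgiving error message from the DIMENSIONS slide, might look roughly like the sketch below. It is an illustration under stated assumptions, not the DIMENSIONS or Valo implementation: it handles only a simplified version of the toy grammar from the ANTLR4 slide (`search <source>` followed by one or more `return <name>` clauses, with no filter), it allows underscores in names so that `clinical_trials` is valid, and the names `QueryError`, `tokenize`, and `parse` are hypothetical. The list of sources is taken from the error-message slide.

```python
import difflib
import re

# Source names as listed on the error-message slide; everything else is hypothetical.
KNOWN_SOURCES = ["applications", "clinical_trials", "grants", "patents", "publications"]


class QueryError(Exception):
    """Raised when a query cannot be parsed or names an unknown source."""


def tokenize(text):
    # The toy language only contains lowercase words, so splitting on
    # whitespace is enough; any other token is rejected early.
    tokens = text.split()
    for tok in tokens:
        if not re.fullmatch(r"[a-z_]+", tok):
            raise QueryError(f"Unexpected token {tok!r}")
    return tokens


def parse(text):
    """Parse `search <source> (return <name>)+` into a plain dict."""
    tokens = tokenize(text)
    if len(tokens) < 2 or tokens[0] != "search":
        raise QueryError("Query must start with 'search <source>'")

    source = tokens[1]
    if source not in KNOWN_SOURCES:
        # Forgiveness: suggest close matches, or fall back to listing every source.
        close = difflib.get_close_matches(source, KNOWN_SOURCES, n=3, cutoff=0.5)
        hint = close or KNOWN_SOURCES
        raise QueryError(
            f"Unknown source {source!r}. Did you mean one of these sources? "
            + ", ".join(hint)
        )

    rest = tokens[2:]
    # The remaining tokens must form ('return', name) pairs, at least one pair.
    if not rest or len(rest) % 2 != 0:
        raise QueryError("Expected one or more 'return <name>' clauses")
    results = []
    for keyword, name in zip(rest[0::2], rest[1::2]):
        if keyword != "return":
            raise QueryError(f"Expected 'return', got {keyword!r}")
        results.append(name)
    return {"source": source, "results": results}
```

Calling `parse("search grants return funding return years")` yields a dict holding the source and the requested result names, while `parse("search articles return years")` raises a `QueryError` whose message starts with "Unknown source 'articles'". A hand-written parser like this keeps full control over error wording, which is the trade-off the parsing slides describe against the faster prototyping of a parser generator.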