Custom Query Languages: Why? How?

J on the Beach 2017 (https://jonthebeach.com/)

Even the best, biggest, beachiest data out there is useless if users can't easily search and analyze it. Under the right circumstances, a custom query language can be a powerful interface to that data, but only if that interface is chosen and developed consciously, with top priority given to creating a fitting domain abstraction, a first-class user experience, and a simple yet flexible implementation that doesn't reinvent the wheel.

These are takeaways from the real-world experiences of ÜberResearch and Valo: two different companies with very different needs, which nevertheless ended up taking similar approaches to the selection and creation of query languages as data interfaces. From the lessons they've learned -- some more painfully than others -- we'll construct a roadmap for choosing, designing, and implementing a custom query language that lets your users interact with your big, beautiful data in all its glory.

Anjana Sofia Vakil

May 18, 2017

Transcript

  1. Custom Query Languages:
    Why? How?
    @AnjanaVakil - J on the Beach 2017

  2. HOLA!
    I’m @AnjanaVakil
    Software Engineer/DSL-builder, ÜberResearch
    with input from Tobias Johansson
    Technical Lead, Valo

  3. DIMENSIONS
    uberresearch.com/dimensions
    Scientific research funding data
    for funders, researchers, and
    research organizations.

  4. VALO
    valo.io
    Real-time and historical
    analytics on data streams.
    Plus beachy conferences.

  5. WHAT is a Custom Query Language (CQL)?

  6. CUSTOM QUERY LANGUAGE (CQL)
    domain-specific

  7. CUSTOM QUERY LANGUAGE (CQL)
    data retrieval
    & analysis

  8. CUSTOM QUERY LANGUAGE (CQL)
    text-based
    interface

  9. a common text that acts as both
    executable software and a description
    that domain experts can read
    Martin Fowler, Domain Specific Languages

  10. DIMENSIONS
    query language
    search grants
    for "malaria"
    where funder.acronym="NIH"
    return categories, years,
    researchers:[surname]
    aggregate funding
    sort by funding

  11. VALO
    query language
    from /streams/sensors/air
    group by sampleTime
    window of 1 minute
    select sampleTime, sensor,
    avg(pollution)

  12. WHY would you want your own CQL?

  13. MODELING

  14. VALO
    modeling time
    from historical
    /streams/tenant/collection/name
    group by timestamp window of 3
    seconds every 1 second
    select timestamp, count(),
    sum(value)
    order by timestamp
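
One plausible reading of "window of 3 seconds every 1 second" is a sliding window: each window spans 3 seconds and a new one starts every second. The Python sketch below assumes that reading, plus numeric timestamps in seconds and no windows starting before 0. The function name and return shape are hypothetical and say nothing about how Valo actually evaluates the query.

```python
from collections import defaultdict


def sliding_window_aggregate(events, size=3, step=1):
    """Aggregate (timestamp, value) pairs into sliding windows.

    Each window covers [start, start + size), and window starts are
    multiples of `step`. Returns (start, count, total) tuples ordered
    by start, mirroring `select timestamp, count(), sum(value)`.
    """
    counts = defaultdict(int)
    totals = defaultdict(float)
    for ts, value in events:
        # Latest window start (a multiple of `step`) that is <= ts.
        start = (ts // step) * step
        # Walk backwards over every window that still contains ts.
        # Assumption: windows never start before 0.
        while start > ts - size and start >= 0:
            counts[start] += 1
            totals[start] += value
            start -= step
    return [(start, counts[start], totals[start]) for start in sorted(counts)]


rows = sliding_window_aggregate([(0, 1.0), (1, 2.0), (4, 5.0)])
```

Because the windows overlap, one event contributes to up to `size / step` result rows, which is why the query states the window length and the step separately.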

  15. DECOUPLING
    Display from Model
    Model from Storage

  16. DIMENSIONS
    decoupling
    SOLR

  18. DIMENSIONS
    decoupling
    SOLR > Elastic?
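
To illustrate decoupling the model from storage, the sketch below keeps a query as a storage-agnostic object and renders it for two different backends. `SearchQuery` and both translator functions are hypothetical names, and the field mapping is an assumption rather than how Dimensions is built. Only the output shapes (Solr's `q`/`fq` parameters and Elasticsearch's `bool` query) follow those systems' documented request formats.

```python
from dataclasses import dataclass, field


@dataclass
class SearchQuery:
    """Storage-agnostic model of a query (hypothetical, for illustration)."""
    source: str
    text: str
    filters: dict = field(default_factory=dict)


def to_solr_params(query):
    """Render the model as Solr request parameters (no escaping is done)."""
    return {
        "q": query.text,
        "fq": [f'{name}:"{value}"' for name, value in query.filters.items()],
    }


def to_elasticsearch_body(query):
    """Render the same model as an Elasticsearch query DSL body."""
    return {
        "query": {
            "bool": {
                "must": [{"query_string": {"query": query.text}}],
                "filter": [
                    {"term": {name: value}}
                    for name, value in query.filters.items()
                ],
            }
        }
    }


query = SearchQuery("grants", "malaria", {"funder.acronym": "NIH"})
solr_request = to_solr_params(query)
elastic_request = to_elasticsearch_body(query)
```

Swapping the storage engine then means writing another translator, while the language and the model that users see stay the same.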

  19. Trying to describe a domain using a
    DSL is useful even if the DSL is never
    implemented. It can be beneficial just
    as a platform for communication.
    Martin Fowler, Domain Specific Languages

  20. HOW should you design a CQL?

  21. WHAT MAKES A GREAT USER INTERFACE?
    1. Clarity
    2. Concision
    3. Familiarity
    4. Responsiveness
    5. Consistency
    6. Aesthetics
    7. Efficiency
    8. Forgiveness
    https://www.smashingmagazine.com/user-interface-design-in-modern-web-applications

  22. 2. Concision
    3. Familiarity
    6. Aesthetics
    7. Efficiency
    the user can easily express what they want

  23. VALO
    historical
    data
    from historical
    /streams/sensors/air
    group by sampleTime
    window of 1 minute
    select sampleTime, sensor,
    avg(pollution)

  24. 1. Clarity
    3. Familiarity
    4. Responsiveness
    5. Consistency
    8. Forgiveness
    the user doesn’t often need the docs

  25. DIMENSIONS
    error messages
    Unknown source 'articles'
    Did you mean one of these sources?
    applications
    clinical_trials
    grants
    patents
    publications
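
A forgiving error like this one can be produced with very little code. The Python sketch below rebuilds the message from the slide. `KNOWN_SOURCES`, `UnknownSourceError` and `check_source` are hypothetical names, and listing every known source (rather than ranking near matches) is an assumption based only on what the slide displays.

```python
# Source names as they appear in the slide's error message.
KNOWN_SOURCES = ["applications", "clinical_trials", "grants", "patents", "publications"]


class UnknownSourceError(ValueError):
    """Raised when a query names a source that does not exist."""


def check_source(name, known=KNOWN_SOURCES):
    """Return `name` if it is a known source, else raise a helpful error."""
    if name in known:
        return name
    lines = [f"Unknown source '{name}'", "Did you mean one of these sources?"]
    lines.extend(sorted(known))
    raise UnknownSourceError("\n".join(lines))


try:
    check_source("articles")
except UnknownSourceError as error:
    message = str(error)  # multi-line text suitable for showing to the user
```

The point is that the message carries the valid options, so the user can fix the query without opening the docs.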

  26. but no matter how good your design,
    it will change

  27. expect & embrace change (as usual)
    Get user feedback ASAP
    Design for flexibility

  28. HOW can you implement a CQL?

  29. PARSING YOUR CQL
    Terence Parr, The Definitive ANTLR 4 Reference

  30. PARSING YOUR CQL
    Use a parser generator
    Prototype quickly, giving up control over some details of lexing/parsing.
    e.g. ANTLR, parboiled2
    Roll your own
    Retain full control over lexing/parsing and internal syntactic representation.

  32. ANTLR (v4)
    antlr.org
    Java-based parser generator
    with targets for various other
    languages (e.g. Python, JS, …)

  33. THE ANTLR4 FLOW
    grammar
    lexer
    parser
    AST listener

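  34. ANTLR4
    GRAMMAR
    grammar Query;
    query: target results EOF;
    target: 'search' name filter?;
    results: ('return' name)+;
    name: NAME;
    NAME: [a-z]+;            // character sets are only allowed in lexer rules
    WS: [ \t\r\n]+ -> skip;  // ignore whitespace between tokens
    // the 'filter' rule is not shown on the slide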
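
For comparison with the parser-generator route, here is a hedged sketch of the "roll your own" alternative: a small hand-written Python parser for the same toy grammar, with the optional `filter` clause left out because the slide does not define it. `parse_query`, `QuerySyntaxError` and the dictionary it returns are hypothetical choices, not part of either company's implementation.

```python
import re

NAME = re.compile(r"[a-z]+")
KEYWORDS = {"search", "return"}


class QuerySyntaxError(Exception):
    """Raised when the input does not match the toy grammar."""


def parse_query(text):
    """Parse `search <name> (return <name>)+` into a plain dict."""
    tokens = text.split()  # whitespace-separated tokens are enough here
    pos = 0

    def expect(word):
        nonlocal pos
        if pos >= len(tokens) or tokens[pos] != word:
            found = tokens[pos] if pos < len(tokens) else "end of input"
            raise QuerySyntaxError(f"expected '{word}' but found '{found}'")
        pos += 1

    def name():
        nonlocal pos
        if pos >= len(tokens):
            raise QuerySyntaxError("expected a name but found end of input")
        token = tokens[pos]
        if token in KEYWORDS or not NAME.fullmatch(token):
            raise QuerySyntaxError(f"expected a name but found '{token}'")
        pos += 1
        return token

    expect("search")
    target = name()
    expect("return")  # at least one 'return' clause is required
    results = [name()]
    while pos < len(tokens):  # further clauses must also be 'return <name>'
        expect("return")
        results.append(name())
    return {"target": target, "results": results}


parsed = parse_query("search grants return categories return years")
```

Every detail (tokenizing, error wording, the shape of the result) is under your control here, at the price of writing and maintaining it all by hand.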

  35. ANTLR4
    LISTENER
    # Generated by ANTLR 4.5
    from antlr4 import *
    class QueryListener:
        def enterTarget(self, ctx):
            pass
        def exitTarget(self, ctx):
            pass
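
The listener pattern itself is easy to see without the ANTLR runtime. The sketch below imitates it in plain Python: a walker visits each tree node and calls `enter<Rule>` and `exit<Rule>` callbacks when the listener defines them. `Node`, `walk` and `TargetCollector` are hypothetical stand-ins. With the real runtime, the tree comes from the generated parser and the traversal is done by the runtime's `ParseTreeWalker`.

```python
class Node:
    """Stand-in for a parse-tree node (hypothetical, not ANTLR's context class)."""

    def __init__(self, rule, text="", children=()):
        self.rule = rule
        self.text = text
        self.children = list(children)


class QueryListener:
    """Base listener with no-op callbacks, as on the slide."""

    def enterTarget(self, ctx):
        pass

    def exitTarget(self, ctx):
        pass


def walk(listener, node):
    """Depth-first walk that fires enter/exit callbacks where they exist."""
    rule = node.rule.capitalize()  # "target" -> "Target"
    getattr(listener, f"enter{rule}", lambda ctx: None)(node)
    for child in node.children:
        walk(listener, child)
    getattr(listener, f"exit{rule}", lambda ctx: None)(node)


class TargetCollector(QueryListener):
    """Example listener: records the text of every `target` node."""

    def __init__(self):
        self.targets = []

    def enterTarget(self, ctx):
        self.targets.append(ctx.text)


tree = Node("query", children=[
    Node("target", "grants"),
    Node("results", children=[Node("name", "categories")]),
])
collector = TargetCollector()
walk(collector, tree)
```

Overriding only the callbacks you care about keeps the translation from syntax to domain objects separate from the grammar.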

  36. Let’s RECAP!

  37. CUSTOM QUERY LANGUAGES
    WHY bother
    An executable model of your domain/logic.
    Decoupling display, model, and storage.
    HOW to design
    Language as interface.
    Expect & embrace change.
    HOW to implement
    Parser generators can speed up prototyping, at the cost of some control.

  38. GRACIAS!
    I’m @AnjanaVakil
    Huge thanks to Tobias Johansson & Valo!
    Presentation template by SlidesCarnival
