Query Optimization: Go more faster better

Presented by Zachary Tong at the Inaugural Elasticsearch Atlanta Meetup.

Elasticsearch Inc

January 15, 2014

Transcript

  1. Copyright Elasticsearch 2013. Copying, publishing and/or distributing without written permission is strictly prohibited
    Query Optimization
    Go more faster better

  2.
    @ZacharyTong
    polyfractal on IRC
    Developing - Support - Training
    ಠ_ಠ
    (amoeba)

  3.
    Filters
    Performing binary decisions since 2010

  4.
    Instead of this…
    {
      "query" : {
        "term" : {
          "my_field" : "value"
        }
      }
    }

  5.
    Do this.
    {
      "query" : {
        "filtered" : {
          "query" : {
            "match_all" : {}
          },
          "filter" : {
            "term" : {
              "my_field" : "value"
            }
          }
        }
      }
    }

  6.
    Filters are fast
    No score is calculated, only inclusion / exclusion

  7.
    Filters are cached
    fast
    What’s faster than fast? Not calculating it again

  8.
    Filters are composable
    cached
    fast
    Cached filters are independent of their original query

  9.
    Filters will short-circuit
    composable
    cached
    fast
    If a document doesn’t match the filter, the query is never evaluated for it

  10.
    Replace:
    Term Query
    Terms Query
    Range Query
    With:
    Term Filter
    Terms Filter
    Range Filter
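
The swap above is mechanical, so it can be wrapped in a small helper. The following is a minimal Python sketch that only builds request bodies, assuming the `filtered` query syntax shown on the earlier slides. `as_filtered` is a hypothetical name (not part of any Elasticsearch client), and the field values are made up for illustration.

```python
import json


def as_filtered(filter_clause, query=None):
    """Wrap a filter clause in a `filtered` query (hypothetical helper).

    `query` defaults to match_all, mirroring the "Do this." slide.
    """
    if query is None:
        query = {"match_all": {}}
    return {"query": {"filtered": {"query": query, "filter": filter_clause}}}


# The same clauses that would otherwise be written as term / terms / range queries.
# Values such as "other_value", 10 and 20 are illustrative only.
term_body = as_filtered({"term": {"my_field": "value"}})
terms_body = as_filtered({"terms": {"my_field": ["value", "other_value"]}})
range_body = as_filtered({"range": {"my_field": {"gte": 10, "lt": 20}}})

print(json.dumps(term_body, indent=2))
```

The printed body has the same shape as the "Do this." slide, with the term clause moved from the query into the filter.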

  11.
    Need to combine filters?
    Which do you use?
    And Filter
    Or Filter
    Not Filter
    Bool Filter

  12.
    Need to combine filters?
    These?
    And Filter
    Or Filter
    Not Filter
    Bool Filter

  13.
    Need to combine filters?
    Nope. Use this:
    Bool Filter

  14.
    Why?!
    See this article:
    http://www.elasticsearch.org/blog/all-about-elasticsearch-filter-bitsets/

  15.
    Bool: everything else
    And/Or/Not: Geo, Script
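
Combining several filter clauses with a bool filter could look roughly like the sketch below. It is plain Python that only assembles the request fragment. `bool_filter`, `other_field` and `"spam"` are hypothetical names chosen for illustration, not taken from the talk.

```python
def bool_filter(must=(), should=(), must_not=()):
    """Combine filter clauses in a `bool` filter (hypothetical helper).

    Empty sections are left out so the fragment stays minimal.
    """
    clause = {}
    if must:
        clause["must"] = list(must)
    if should:
        clause["should"] = list(should)
    if must_not:
        clause["must_not"] = list(must_not)
    return {"bool": clause}


# Keep documents where my_field is "value" and other_field is not "spam".
combined = bool_filter(
    must=[{"term": {"my_field": "value"}}],
    must_not=[{"term": {"other_field": "spam"}}],
)
```

The resulting fragment would go in the `filter` section of a `filtered` query.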

  16.
    Consider “cacheability”
    {
      "query" : {
        "filtered" : {
          "filter" : {
            "range" : {
              "my_field" : {
                "gte" : "now-1h"
              }
            }
          }
        }
      }
    }
    This is going to cause poor performance

  17.
    Now never stops moving
    [Diagram, animated across slides 17-21: Time axis, “Now” marker, Filter Cache]

  22.
    Add a second filter
    {
      "query" : {
        "filtered" : {
          "filter" : {
            "bool" : {
              "must" : [
                {"range" : {"my_field" : {"gte" : "now/d"}}},
                {"range" : {
                  "my_field" : {
                    "gte" : "now-1h"
                  },
                  "_cache" : false
                }}
              ]
            }
          }
        }
      }
    }
    First range: cached, daily granularity
    Second range: uncached, hourly granularity
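
Why the day-rounded bound caches well while the moving one does not can be checked with ordinary datetime arithmetic. The sketch below is plain Python, not Elasticsearch date math: `round_to_day` and `hour_ago` are hypothetical stand-ins for the daily-rounded and "one hour ago" bounds, and the timestamps are made up.

```python
from datetime import datetime, timedelta


def round_to_day(now):
    """Round down to midnight, standing in for the day-rounded bound."""
    return now.replace(hour=0, minute=0, second=0, microsecond=0)


def hour_ago(now):
    """The moving bound: exactly one hour before the request time."""
    return now - timedelta(hours=1)


# Two requests arriving five seconds apart (illustrative timestamps).
first = datetime(2014, 1, 15, 9, 30, 0)
later = first + timedelta(seconds=5)

# The daily bound is identical for both requests, so a cached result can be reused.
assert round_to_day(first) == round_to_day(later)

# The hourly bound differs on every request, so caching it would only churn the cache.
assert hour_ago(first) != hour_ago(later)
```

The cached filter does the coarse cut once per day, and the uncached filter only has to trim what is left.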

  23.
    No cache churn
    [Diagram, animated across slides 23-26: Time axis, “Now” marker, Filter Cache]
    Applies to many filters, not just ranges

  27.
    Top level filter is slow(er)
    {
      "query" : { … },
      "filter" : { … }
    }
    Don’t use this unless you need it
    (only useful with facets)

  28.
    [Diagram: the filtered query determines the documents matching the query; a top level filter then applies only to "hits": [...], and a facet_filter only to "facets": {...}]

  29.
    Queries
    This discussion has become relevant

  30.
    Choose where to pay a computation price
    Query-Time vs Index-Time

  31.
    Functionality      Query-time                       Index-time
    Misspellings       fuzzy query, term suggester      ngrams
    Autocomplete       prefix query, phrase suggester   Completion suggester, shingles
    Leading wildcard   Wildcard                         Reverse filter + prefix query
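
The last row of the table can be illustrated without a cluster. The sketch below is plain Python that mimics the idea: reverse each term at index time, then answer a leading-wildcard search as a prefix match on the reversed terms. `index_reversed` and `leading_wildcard` are hypothetical names, the terms are made up, and the linear scan stands in for the prefix seek a real term dictionary would do.

```python
def index_reversed(terms):
    """Index time: store each term reversed, kept sorted like a term dictionary."""
    return sorted(term[::-1] for term in terms)


def leading_wildcard(reversed_terms, suffix):
    """Query time: a `*suffix` search becomes a prefix match on reversed terms."""
    prefix = suffix[::-1]
    # Un-reverse the matches so callers see the original terms.
    return [term[::-1] for term in reversed_terms if term.startswith(prefix)]


reversed_terms = index_reversed(["searching", "indexing", "cluster", "shard"])

# Equivalent of searching for "*ing".
matches = leading_wildcard(reversed_terms, "ing")
print(matches)
```

The reversal is paid once per term at index time, so the query itself is an ordinary prefix match.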

  32.
    Avoid deep pagination
    {
      "query" : { … },
      "from" : 10000000,
      "size" : 10
    }
    Builds a PriorityQueue 10,000,010 large
    (for each shard in your index)
    (just to return 10 results)
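
The arithmetic behind the numbers above is simple enough to spell out. In the Python sketch below, `queue_entries` is a hypothetical helper and the shard count of 5 is an assumption made only for the example.

```python
def queue_entries(from_, size, shards):
    """Entries collected per shard, and in total, for one paginated request."""
    per_shard = from_ + size  # each shard must rank its own top from+size hits
    return per_shard, per_shard * shards


# The request from the slide, against an index assumed to have 5 shards.
per_shard, total = queue_entries(from_=10_000_000, size=10, shards=5)
print(per_shard)  # 10000010
print(total)      # 50000050
```

The total is what has to be merged before the 10 requested hits can be returned.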

  33.
    GoogleBot
    Destroyer of clusters
    Bots will happily traverse millions of pages

  34.
    Use Count
    GET /index/_search
    {
      "query" : { … },
      "size" : 0
    }
    This is faster:
    GET /index/_search?search_type=count
    {
      "query" : { … }
    }

  35.
    Rescore API
    1. Query/filter to quickly find top N results
    2. Rescore with complex logic to find top 10
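
The two steps above map onto a search request with a `rescore` section. The Python sketch below only builds such a body. `with_rescore` is a hypothetical helper, and the match and match_phrase clauses, the search text and the window size of 100 are illustrative assumptions, not values from the talk.

```python
def with_rescore(query, rescore_query, window_size):
    """Attach a rescore phase to a search body (hypothetical helper).

    `query` finds candidates cheaply; `rescore_query` re-ranks only the
    top `window_size` hits from each shard.
    """
    return {
        "query": query,
        "rescore": {
            "window_size": window_size,
            "query": {"rescore_query": rescore_query},
        },
    }


body = with_rescore(
    # Step 1: a cheap query to find the top N candidates.
    query={"match": {"my_field": "quick brown fox"}},
    # Step 2: a more expensive phrase query applied only to those candidates.
    rescore_query={
        "match_phrase": {"my_field": {"query": "quick brown fox", "slop": 2}}
    },
    window_size=100,
)
```

Keeping the window small is what makes the expensive part affordable: it runs on at most 100 hits per shard instead of on every match.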

  36.
    Common Terms
    Very cool query, makes stop-words obsolete
    See this presentation:
    https://speakerdeck.com/polyfractal/common-terms-query

  37.
    In General
    1. Think about what you want to search
    2. Structure your document to make that easy

  38.
    Scripts
    There’s a python in my elasticsearch server!

  39.
    FOR ALL THAT IS HOLY
    Do not EVER use these in a search script:
    _source.my_field
    _fields.my_field
    These access the disk and are sloooooow.
    You will destroy your performance

  40.
    FOR ALL THAT IS HOLY
    Use this instead:
    doc['my_field']
    Accesses in-memory field data. Fast!
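
As a rough illustration, a script that reads the field through `doc[...]` could be placed in a `script_fields` section like this. The Python sketch below only builds the request fragment. `doc_value_script_field` and the field name `doubled` are hypothetical, and the `.value` accessor and `script_fields` layout are assumed to follow the request format of the Elasticsearch versions contemporary with this talk; scripting changed in later versions.

```python
def doc_value_script_field(name, field, factor):
    """Build a script field that reads `field` via doc[...] (hypothetical helper).

    The script text uses the in-memory field data accessor rather than
    _source or _fields.
    """
    script = "doc['{0}'].value * {1}".format(field, factor)
    return {"script_fields": {name: {"script": script}}}


fragment = doc_value_script_field("doubled", "my_field", 2)
print(fragment["script_fields"]["doubled"]["script"])
```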

  41.
    Use common sense
    In general, scripting is slower than queries.
    Don’t go crazy.
    If you end up with a 10-page script, bake some of that logic into your index

  42.
    Questions?
    ಠ_ಠ
    @ZacharyTong
    polyfractal on IRC
