Upgrade to Pro — share decks privately, control downloads, hide ads and more …

ForwardJS - RethinkDB - Advanced Queries

ForwardJS - RethinkDB - Advanced Queries

Jorge Silva

July 30, 2015
Tweet

More Decks by Jorge Silva

Other Decks in Programming

Transcript

  1. RethinkDB
    Advanced queries
    ForwardJS
    San Francisco, California
    July 30, 2015

    View Slide

  2. Jorge Silva
    @thejsj
    Developer Evangelist @ RethinkDB

    View Slide

  3. Schedule
    1. Map/Reduce and control logic
    2. Joins and subqueries
    3. Indexes
    4. Advanced changefeeds

    View Slide

  4. Part #1
    Map/reduce and control logic

    View Slide

  5. What is map?
    • Transforms each element in a
    sequence by applying a mapping
    function to them

    View Slide

  6. Map
    r.expr([1, 2, 3, 4, 5])
    .map(function (value) {
    return value.mul(2);
    })
    // 2, 4, 6, 8, 10
    Multiply every value by 2

    View Slide

  7. Map
    r.table('cities')
    .map(function (row) {
    // Only get the `name` property
    return row('name')
    });
    // Shortcut
    r.table('cities')('name')
    Only get a specific value

    View Slide

  8. What is reduce?
    • Produce a single value from a
    sequence through repeated
    application of a reduction
    function

    View Slide

  9. Reduce
    r.expr([1, 2, 3, 4, 5])
    .reduce(function (left, right) {
    return left.add(right);
    }) // 15
    // Shortcut
    r.expr([1, 2, 3, 4, 5]).sum()
    Add all values in the array

    View Slide

  10. Map/reduce
    • You can do powerful operations
    by combining map operations
    with reduces operations

    View Slide

  11. Map/reduce
    r.table('cities')
    .map(function (row) {
    // Only get the `name` property
    return row('population')
    })
    .reduce(function (left, right) {
    return left.add(right);
    })
    // Shortcut
    r.table('cities').sum('population')
    Get the total population

    View Slide

  12. Control structures
    • Often times, you need more
    complex logic in your queries,
    such as if statements

    View Slide

  13. Map/reduce
    r.table('cities')('population')
    .reduce(function (left, right) {
    // Only count if more than 1 million
    return r.branch(
    right.gt(1000000),
    left.add(right),
    0
    );
    })
    Get the total population

    View Slide

  14. Install sample data
    https://github.com/thejsj/sample-data

    View Slide

  15. Install sample data
    // Create table
    r.tableCreate('cities')
    // Insert data from JSON files
    r.table('cities')
    .insert(r.json(r.http(
    'https://raw.githubusercontent.com/thejsj/
    sample-data/master/data/urban-areas.json'
    )))
    Get `cities` table

    View Slide

  16. Install sample data
    // Create table
    r.tableCreate('countries')
    // Insert data from JSON files
    r.table('countries')
    .insert(r.json(r.http(
    'https://raw.githubusercontent.com/thejsj/
    sample-data/master/data/countries.json'
    )))
    Get `countries` table

    View Slide

  17. Writing queries
    http://localhost:8080/
    http://rethinkdb.thejsj.com:8080/

    View Slide

  18. Part #2
    Joins and subqueries

    View Slide

  19. Joins
    • RethinkDB allows for joins
    • There are three commands for
    joins: `eqJoin`, `innerJoin`, and
    `outerJoin`

    View Slide

  20. Joins
    r.table('cities')
    .eqJoin('country', r.table('countries'))
    // Returns `left` and `right` properties
    .map(function (row) {
    // Merge `right` into `left`
    return row('left')
    .merge({ 'country': row('right') })
    })
    Get best friend from `people`

    View Slide

  21. Joins
    r.table('cities')
    .eqJoin('country', r.table('countries'))
    // Returns `left` and `right` properties
    .map(function (row) {
    // Merge `right` into `left`
    return row('left')
    .merge({ 'country': row('right') })
    })
    Get `cities` table

    View Slide

  22. Joins
    r.table('cities')
    .eqJoin('country', r.table('countries'))
    // Returns `left` and `right` properties
    .map(function (row) {
    // Merge `right` into `left`
    return row('left')
    .merge({ 'country': row('right') })
    })
    Join each city to its country

    View Slide

  23. Joins
    r.table('cities')
    .eqJoin('country', r.table('countries'))
    // Returns `left` and `right` properties
    .map(function (row) {
    // Merge `right` into `left`
    return row('left')
    .merge({ 'country': row('right') })
    })
    Map every joined row

    View Slide

  24. Joins
    r.table('cities')
    .eqJoin('country', r.table('countries'))
    // Returns `left` and `right` properties
    .map(function (row) {
    // Merge `right` into `left`
    return row('left')
    .merge({ 'country': row('right') })
    })
    Assign `country` property

    View Slide

  25. RethinkDB has something more
    powerful than joins:
    Subqueries!

    View Slide

  26. Subqueries
    • RethinkDB allows for joins and
    subqueries
    • You can execute queries inside
    queries inside queries inside
    queries

    View Slide

  27. Subqueries
    r.table('cities')
    .merge((row) => {
    return {
    'country': r.table('countries')
    .get(row('country'))
    }
    })
    Get `country` for every city

    View Slide

  28. Subqueries
    r.table('cities')
    .merge((row) => {
    return {
    'country': r.table('countries')
    .get(row('country'))
    }
    })
    Get `cities` table

    View Slide

  29. Subqueries
    r.table('cities')
    .merge((row) => {
    return {
    'country': r.table('countries')
    .get(row('country'))
    }
    })
    Merge new object into row

    View Slide

  30. Subqueries
    r.table('cities')
    .merge((row) => {
    return {
    'country': r.table('countries')
    .get(row('country'))
    }
    })
    Return object with `country`

    View Slide

  31. Subqueries
    r.table('cities')
    .merge((row) => {
    return {
    'country': r.table('countries')
    .get(row('country'))
    }
    })
    Get single row from `countries`

    View Slide

  32. Subqueries
    r.table('cities')
    .merge((row) => {
    return {
    'country': r.table('countries')
    .get(row('country'))
    }
    })
    Get row found in `country`

    View Slide

  33. Writing queries
    http://localhost:8080/
    http://rethinkdb.thejsj.com:8080/

    View Slide

  34. Part #3
    Indexes

    View Slide

  35. Indexes
    • Indexes are a way to pre-compute
    the result of a query at write time
    in order to make reading date
    faster
    • There are two types of indexes:
    primary and secondary

    View Slide

  36. Creating an index
    // Cities
    [
    { name: 'San Francisco', state: 'CA' },
    { name: 'Oakland', state: 'CA' },
    { name: 'Chicago', state: 'IL' },
    { name: 'Missoula', state: 'MT' },
    { name: 'Richmond', state: 'VA' }
    ]
    // Create index on `name` property
    r.table('cities').indexCreate('name')

    View Slide

  37. Using an index
    // Non-indexed operation (Slow!)
    r.table('cities')
    .filter(r.row('name').eq('Oakland'))
    // Indexed operation (Fast!)
    r.table('cities')
    .getAll('Oakland', { index: 'name' })
    Operations return same result

    View Slide

  38. Indexed Operations
    • get: get a single document by its
    primary key
    • getAll: get all documents who's
    key matches the key given
    • between: get all documents with a
    key between two values

    View Slide

  39. get
    • The get command is used to get a
    document by its primary index
    • Primary keys must be unique. The
    database is ensured by the
    database.

    View Slide

  40. get
    • In RethinkDB, any property can be
    used as the primary key. This is
    set when creating a table
    • Numbers, strings, booleans,
    datetime objects, and arrays can
    all be use as primaries

    View Slide

  41. Get
    // Cities
    [
    { id: 1, name: 'Oakland', state: 'CA' },
    { id: 2, name: 'Chicago', state: 'IL' },
    { id: 3, name: 'Richmond', state: 'VA' }
    ]
    // Create index on `name` property
    r.tableCreate('cities')
    r.table('cities').get(1)
    get document with id `1`

    View Slide

  42. Get
    // Cities
    [
    { name: 'San Francisco', state: 'CA' },
    { name: 'Oakland', state: 'CA' },
    { name: 'Chicago', state: 'IL' }
    ]
    // Create index on `name` property
    r.tableCreate('cities',
    { primaryKey: 'name' })
    r.table('cities').get('Chicago')
    get documents with name 'Chicago'

    View Slide

  43. getAll
    • The getAll command is used to get
    all documents with a specific
    key(s), given an index
    • Multiple values can be passed

    View Slide

  44. // Cities
    [
    { name: 'San Francisco', state: 'CA' },
    { name: 'Oakland', state: 'CA' },
    { name: 'Chicago', state: 'IL' },
    { name: 'Richmond', state: 'VA' }
    ]
    // Create index on `state` property
    r.table('cities').indexCreate('state')
    // Returns: San Francisco, Oakland
    r.table('cities')
    .getAll('CA', { index: 'state' })
    getAll
    get document with state 'CA'

    View Slide

  45. // Cities
    [
    { name: 'San Francisco', state: 'CA' },
    { name: 'Oakland', state: 'CA' },
    { name: 'Chicago', state: 'IL' },
    { name: 'Richmond', state: 'VA' }
    ]
    // Returns: San Francisco, Oakland, Chicago
    r.table('cities')
    .getAll('CA', 'IL', { index: 'state' })
    getAll
    get documents with state 'CA' or 'IL'

    View Slide

  46. between
    • The between command is used to
    get all documents with a key
    between two values

    View Slide

  47. // Cities
    [
    { name: 'San Francisco', population: 800000 },
    { name: 'Oakland', population: 400000 },
    { name: 'Chicago', population: 900000 },
    { name: 'Richmond', population: 200000 }
    ]
    // Create index on `population` property
    r.table('cities').indexCreate('population')
    // Returns: Oakland, Richmond
    r.table('cities')
    .between(0, 600000, { index: 'population' })
    between
    get documents with pop under 600K

    View Slide

  48. Types of secondary indexes
    • Simple indexes
    • Compound indexes
    • Multi-indexes
    • Indexes based on arbitrary
    expressions

    View Slide

  49. Compound Indexes
    • Compound indexes allow you to
    retrieve documents by multiple
    fields

    View Slide

  50. // Cities
    [
    { name: 'San Francisco', population: 800000 },
    { name: 'Oakland', population: 400000 },
    { name: 'Chicago', population: 900000 },
    { name: 'Richmond', population: 200000 }
    ]
    // Create index `name` and `population`
    r.table('cities').indexCreate('name_pop',
    [ r.row('name'), r.row('population') ])
    // Returns: San Francisco, Richmond
    r.table('cities')
    .between([ 'R' , 100000],
    [ r.maxval, r.maxval ], { index: 'name_pop' })
    Compound index

    View Slide

  51. Multi Indexes
    • With a multi index, a document
    can be indexed using more than
    one key in the same index.
    • For instance, a blog post might
    have multiple tags, and each tag
    might refer to multiple blog posts.

    View Slide

  52. // Cities
    [
    { name: 'San Francisco', highways: [ 80, 280 ]},
    { name: 'Oakland', highways: [ 80, 880 ]},
    { name: 'Chicago', highways: [ 90, 55 ]},
    { name: 'Richmond', highways: [ 95, 295 ]}
    ]
    // Create index for `highways`
    r.table('cities').indexCreate('highways',
    { multi: true })
    // Returns: San Francisco, Oakland
    r.table('cities')
    .getAll(80, { index: 'highways' })
    Multi indexes
    get documents with highway 80

    View Slide

  53. Arbitrary expression
    • You can create an index on an
    arbitrary expression by passing an
    anonymous function to
    indexCreate
    • Basically, any function can be
    used as an index

    View Slide

  54. // Cities
    [
    { name: 'San Francisco', state: 'CA' },
    { name: 'Oakland', state: 'CA' },
    { name: 'Chicago', state: 'IL' },
    { name: 'Richmond', state: 'VA' }
    ]
    // Create an index for cities beginning with 'San'
    r.table('cities').indexCreate('san',
    function (row) {
    return row('name').match("^San").ne(null); })
    // Returns: San Francisco, Oakland
    r.table('cities')
    .getAll(true, { index: 'san' })
    Arbitrary expression

    View Slide

  55. Writing queries
    http://localhost:8080/
    http://rethinkdb.thejsj.com:8080/

    View Slide

  56. Part #4
    Advanced changefeeds

    View Slide

  57. Advanced changefeeds
    • You can attach some terms to a
    query before calling `.changes`
    • Some of these terms are: filter,
    pluck, map, min, max, between,
    orderBy

    View Slide

  58. // Cities
    [
    { name: 'San Francisco', state: 'CA' },
    { name: 'Chicago', state: 'IL' }
    ]
    // Will only run when 'Chicago' is modified
    r.table('cities')
    .filter({ state: 'IL' })
    .changes()
    Advanced changefeeds
    listen for cities in Illinois

    View Slide

  59. // Cities
    [
    { name: 'San Francisco', state: 'CA' },
    { name: 'Chicago', state: 'IL' }
    ]
    // Will only run when 'name' is modified
    r.table('cities')
    .pluck('name')
    .changes()
    Advanced changefeeds
    Listen for when 'name' changes

    View Slide

  60. // Will only run when 'name' is modified and
    // state is 'IL'
    r.table('cities')
    .filter({ state: 'IL' })
    .pluck('name')
    .changes()
    Advanced changefeeds
    Combine both listeners

    View Slide

  61. Advanced changefeeds
    • You can manipulate the result of a
    changefeed by adding terms after
    `.changes`

    View Slide

  62. // Cities
    [
    { name: 'San Francisco', state: 'CA' },
    { name: 'Chicago', state: 'IL' }
    ]
    // Will only return the new_val
    r.table('cities')
    .changes()('new_val')
    Advanced changefeeds
    Only get the update value

    View Slide

  63. // Will append the state
    r.table('cities')
    .changes()
    .filter(r.row('new_val').ne(null))
    .map(function (row) {
    return {
    'state': r.table('states')
    .get(row('state')
    }
    })
    Advanced changefeeds
    Join the state to changefeed

    View Slide

  64. // Return the number of cities in table
    r.table('cities')
    .changes()
    .map(function (row) {
    return r.table('cities').count()
    })
    Advanced changefeeds
    Get count of cities on change

    View Slide

  65. Writing queries
    http://localhost:8080/
    http://rethinkdb.thejsj.com:8080/

    View Slide

  66. Questions
    • RethinkDB website:

    http://rethinkdb.com
    • Install RethinkDB:

    http://rethinkdb.com/install/
    • Email me: [email protected]
    • Tweet: @thejsj, @rethinkdb

    View Slide