Slide 1

Slide 1 text

RethinkDB Advanced queries ForwardJS San Francisco, California July 30, 2015

Slide 2

Slide 2 text

Jorge Silva @thejsj Developer Evangelist @ RethinkDB

Slide 3

Slide 3 text

Schedule 1. Map/Reduce and control logic 2. Joins and subqueries 3. Indexes 4. Advanced changefeeds

Slide 4

Slide 4 text

Part #1 Map/reduce and control logic

Slide 5

Slide 5 text

What is map? • Transforms each element in a sequence by applying a mapping function to them

Slide 6

Slide 6 text

Map r.expr([1, 2, 3, 4, 5]) .map(function (value) { return value.mul(2); }) // 2, 4, 6, 8, 10 Multiply every value by 2

Slide 7

Slide 7 text

Map r.table('cities') .map(function (row) { // Only get the `name` property return row('name') }); // Shortcut r.table('cities')('name') Only get a specific value

Slide 8

Slide 8 text

What is reduce? • Produce a single value from a sequence through repeated application of a reduction function

Slide 9

Slide 9 text

Reduce r.expr([1, 2, 3, 4, 5]) .reduce(function (left, right) { return left.add(right); }) // 15 // Shortcut r.expr([1, 2, 3, 4, 5]).sum() Add all values in the array

Slide 10

Slide 10 text

Map/reduce • You can do powerful operations by combining map operations with reduces operations

Slide 11

Slide 11 text

Map/reduce r.table('cities') .map(function (row) { // Only get the `name` property return row('population') }) .reduce(function (left, right) { return left.add(right); }) // Shortcut r.table('cities').sum('population') Get the total population

Slide 12

Slide 12 text

Control structures • Often times, you need more complex logic in your queries, such as if statements

Slide 13

Slide 13 text

Map/reduce r.table('cities')('population') .reduce(function (left, right) { // Only count if more than 1 million return r.branch( right.gt(1000000), left.add(right), 0 ); }) Get the total population

Slide 14

Slide 14 text

Install sample data https://github.com/thejsj/sample-data

Slide 15

Slide 15 text

Install sample data // Create table r.tableCreate('cities') // Insert data from JSON files r.table('cities') .insert(r.json(r.http( 'https://raw.githubusercontent.com/thejsj/ sample-data/master/data/urban-areas.json' ))) Get `cities` table

Slide 16

Slide 16 text

Install sample data // Create table r.tableCreate('countries') // Insert data from JSON files r.table('countries') .insert(r.json(r.http( 'https://raw.githubusercontent.com/thejsj/ sample-data/master/data/countries.json' ))) Get `countries` table

Slide 17

Slide 17 text

Writing queries http://localhost:8080/ http://rethinkdb.thejsj.com:8080/

Slide 18

Slide 18 text

Part #2 Joins and subqueries

Slide 19

Slide 19 text

Joins • RethinkDB allows for joins • There are three commands for joins: `eqJoin`, `innerJoin`, and `outerJoin`

Slide 20

Slide 20 text

Joins r.table('cities') .eqJoin('country', r.table('countries')) // Returns `left` and `right` properties .map(function (row) { // Merge `right` into `left` return row('left') .merge({ 'country': row('right') }) }) Get best friend from `people`

Slide 21

Slide 21 text

Joins r.table('cities') .eqJoin('country', r.table('countries')) // Returns `left` and `right` properties .map(function (row) { // Merge `right` into `left` return row('left') .merge({ 'country': row('right') }) }) Get `cities` table

Slide 22

Slide 22 text

Joins r.table('cities') .eqJoin('country', r.table('countries')) // Returns `left` and `right` properties .map(function (row) { // Merge `right` into `left` return row('left') .merge({ 'country': row('right') }) }) Join each city to its country

Slide 23

Slide 23 text

Joins r.table('cities') .eqJoin('country', r.table('countries')) // Returns `left` and `right` properties .map(function (row) { // Merge `right` into `left` return row('left') .merge({ 'country': row('right') }) }) Map every joined row

Slide 24

Slide 24 text

Joins r.table('cities') .eqJoin('country', r.table('countries')) // Returns `left` and `right` properties .map(function (row) { // Merge `right` into `left` return row('left') .merge({ 'country': row('right') }) }) Assign `country` property

Slide 25

Slide 25 text

RethinkDB has something more powerful than joins: Subqueries!

Slide 26

Slide 26 text

Subqueries • RethinkDB allows for joins and subqueries • You can execute queries inside queries inside queries inside queries

Slide 27

Slide 27 text

Subqueries r.table('cities') .merge((row) => { return { 'country': r.table('countries') .get(row('country')) } }) Get `country` for every city

Slide 28

Slide 28 text

Subqueries r.table('cities') .merge((row) => { return { 'country': r.table('countries') .get(row('country')) } }) Get `cities` table

Slide 29

Slide 29 text

Subqueries r.table('cities') .merge((row) => { return { 'country': r.table('countries') .get(row('country')) } }) Merge new object into row

Slide 30

Slide 30 text

Subqueries r.table('cities') .merge((row) => { return { 'country': r.table('countries') .get(row('country')) } }) Return object with `country`

Slide 31

Slide 31 text

Subqueries r.table('cities') .merge((row) => { return { 'country': r.table('countries') .get(row('country')) } }) Get single row from `countries`

Slide 32

Slide 32 text

Subqueries r.table('cities') .merge((row) => { return { 'country': r.table('countries') .get(row('country')) } }) Get row found in `country`

Slide 33

Slide 33 text

Writing queries http://localhost:8080/ http://rethinkdb.thejsj.com:8080/

Slide 34

Slide 34 text

Part #3 Indexes

Slide 35

Slide 35 text

Indexes • Indexes are a way to pre-compute the result of a query at write time in order to make reading date faster • There are two types of indexes: primary and secondary

Slide 36

Slide 36 text

Creating an index // Cities [ { name: 'San Francisco', state: 'CA' }, { name: 'Oakland', state: 'CA' }, { name: 'Chicago', state: 'IL' }, { name: 'Missoula', state: 'MT' }, { name: 'Richmond', state: 'VA' } ] // Create index on `name` property r.table('cities').indexCreate('name')

Slide 37

Slide 37 text

Using an index // Non-indexed operation (Slow!) r.table('cities') .filter(r.row('name').eq('Oakland')) // Indexed operation (Fast!) r.table('cities') .getAll('Oakland', { index: 'name' }) Operations return same result

Slide 38

Slide 38 text

Indexed Operations • get: get a single document by its primary key • getAll: get all documents who's key matches the key given • between: get all documents with a key between two values

Slide 39

Slide 39 text

get • The get command is used to get a document by its primary index • Primary keys must be unique. The database is ensured by the database.

Slide 40

Slide 40 text

get • In RethinkDB, any property can be used as the primary key. This is set when creating a table • Numbers, strings, booleans, datetime objects, and arrays can all be use as primaries

Slide 41

Slide 41 text

Get // Cities [ { id: 1, name: 'Oakland', state: 'CA' }, { id: 2, name: 'Chicago', state: 'IL' }, { id: 3, name: 'Richmond', state: 'VA' } ] // Create index on `name` property r.tableCreate('cities') r.table('cities').get(1) get document with id `1`

Slide 42

Slide 42 text

Get // Cities [ { name: 'San Francisco', state: 'CA' }, { name: 'Oakland', state: 'CA' }, { name: 'Chicago', state: 'IL' } ] // Create index on `name` property r.tableCreate('cities', { primaryKey: 'name' }) r.table('cities').get('Chicago') get documents with name 'Chicago'

Slide 43

Slide 43 text

getAll • The getAll command is used to get all documents with a specific key(s), given an index • Multiple values can be passed

Slide 44

Slide 44 text

// Cities [ { name: 'San Francisco', state: 'CA' }, { name: 'Oakland', state: 'CA' }, { name: 'Chicago', state: 'IL' }, { name: 'Richmond', state: 'VA' } ] // Create index on `state` property r.table('cities').indexCreate('state') // Returns: San Francisco, Oakland r.table('cities') .getAll('CA', { index: 'state' }) getAll get document with state 'CA'

Slide 45

Slide 45 text

// Cities [ { name: 'San Francisco', state: 'CA' }, { name: 'Oakland', state: 'CA' }, { name: 'Chicago', state: 'IL' }, { name: 'Richmond', state: 'VA' } ] // Returns: San Francisco, Oakland, Chicago r.table('cities') .getAll('CA', 'IL', { index: 'state' }) getAll get documents with state 'CA' or 'IL'

Slide 46

Slide 46 text

between • The between command is used to get all documents with a key between two values

Slide 47

Slide 47 text

// Cities [ { name: 'San Francisco', population: 800000 }, { name: 'Oakland', population: 400000 }, { name: 'Chicago', population: 900000 }, { name: 'Richmond', population: 200000 } ] // Create index on `population` property r.table('cities').indexCreate('population') // Returns: Oakland, Richmond r.table('cities') .between(0, 600000, { index: 'population' }) between get documents with pop under 600K

Slide 48

Slide 48 text

Types of secondary indexes • Simple indexes • Compound indexes • Multi-indexes • Indexes based on arbitrary expressions

Slide 49

Slide 49 text

Compound Indexes • Compound indexes allow you to retrieve documents by multiple fields

Slide 50

Slide 50 text

// Cities [ { name: 'San Francisco', population: 800000 }, { name: 'Oakland', population: 400000 }, { name: 'Chicago', population: 900000 }, { name: 'Richmond', population: 200000 } ] // Create index `name` and `population` r.table('cities').indexCreate('name_pop', [ r.row('name'), r.row('population') ]) // Returns: San Francisco, Richmond r.table('cities') .between([ 'R' , 100000], [ r.maxval, r.maxval ], { index: 'name_pop' }) Compound index

Slide 51

Slide 51 text

Multi Indexes • With a multi index, a document can be indexed using more than one key in the same index. • For instance, a blog post might have multiple tags, and each tag might refer to multiple blog posts.

Slide 52

Slide 52 text

// Cities [ { name: 'San Francisco', highways: [ 80, 280 ]}, { name: 'Oakland', highways: [ 80, 880 ]}, { name: 'Chicago', highways: [ 90, 55 ]}, { name: 'Richmond', highways: [ 95, 295 ]} ] // Create index for `highways` r.table('cities').indexCreate('highways', { multi: true }) // Returns: San Francisco, Oakland r.table('cities') .getAll(80, { index: 'highways' }) Multi indexes get documents with highway 80

Slide 53

Slide 53 text

Arbitrary expression • You can create an index on an arbitrary expression by passing an anonymous function to indexCreate • Basically, any function can be used as an index

Slide 54

Slide 54 text

// Cities [ { name: 'San Francisco', state: 'CA' }, { name: 'Oakland', state: 'CA' }, { name: 'Chicago', state: 'IL' }, { name: 'Richmond', state: 'VA' } ] // Create an index for cities beginning with 'San' r.table('cities').indexCreate('san', function (row) { return row('name').match("^San").ne(null); }) // Returns: San Francisco, Oakland r.table('cities') .getAll(true, { index: 'san' }) Arbitrary expression

Slide 55

Slide 55 text

Writing queries http://localhost:8080/ http://rethinkdb.thejsj.com:8080/

Slide 56

Slide 56 text

Part #4 Advanced changefeeds

Slide 57

Slide 57 text

Advanced changefeeds • You can attach some terms to a query before calling `.changes` • Some of these terms are: filter, pluck, map, min, max, between, orderBy

Slide 58

Slide 58 text

// Cities [ { name: 'San Francisco', state: 'CA' }, { name: 'Chicago', state: 'IL' } ] // Will only run when 'Chicago' is modified r.table('cities') .filter({ state: 'IL' }) .changes() Advanced changefeeds listen for cities in Illinois

Slide 59

Slide 59 text

// Cities [ { name: 'San Francisco', state: 'CA' }, { name: 'Chicago', state: 'IL' } ] // Will only run when 'name' is modified r.table('cities') .pluck('name') .changes() Advanced changefeeds Listen for when 'name' changes

Slide 60

Slide 60 text

// Will only run when 'name' is modified and // state is 'IL' r.table('cities') .filter({ state: 'IL' }) .pluck('name') .changes() Advanced changefeeds Combine both listeners

Slide 61

Slide 61 text

Advanced changefeeds • You can manipulate the result of a changefeed by adding terms after `.changes`

Slide 62

Slide 62 text

// Cities [ { name: 'San Francisco', state: 'CA' }, { name: 'Chicago', state: 'IL' } ] // Will only return the new_val r.table('cities') .changes()('new_val') Advanced changefeeds Only get the update value

Slide 63

Slide 63 text

// Will append the state r.table('cities') .changes() .filter(r.row('new_val').ne(null)) .map(function (row) { return { 'state': r.table('states') .get(row('state') } }) Advanced changefeeds Join the state to changefeed

Slide 64

Slide 64 text

// Return the number of cities in table r.table('cities') .changes() .map(function (row) { return r.table('cities').count() }) Advanced changefeeds Get count of cities on change

Slide 65

Slide 65 text

Writing queries http://localhost:8080/ http://rethinkdb.thejsj.com:8080/

Slide 66

Slide 66 text

Questions • RethinkDB website:
 http://rethinkdb.com • Install RethinkDB:
 http://rethinkdb.com/install/ • Email me: [email protected] • Tweet: @thejsj, @rethinkdb