Slide 1

Slide 1 text

CouchDB: NoSQL for Scalable Applications
Lorna Mitchell, IBM

Slide 2

Slide 2 text

NoSQL? @lornajane

Slide 5

Slide 5 text

Document Databases
Store collections of schemaless documents

Slide 6

Slide 6 text

CouchDB
Cluster Of Unreliable Commodity Hardware
• HTTP API
• JSON data format
• MapReduce views in JavaScript
• Mango queries are also written in JSON
CouchDB 2.0 has clustering and sharding features

Slide 7

Slide 7 text

CouchDB HTTP API
Hello, CouchDB
  ~ $ curl http://localhost:5984
  {"couchdb":"Welcome","version":"2.0.0",
   "vendor":{"name":"The Apache Software Foundation"}}
List databases
  ~ $ curl http://localhost:5984/_all_dbs
  []

Slide 8

Slide 8 text

CouchDB HTTP API
New database
  ~ $ curl -X PUT http://localhost:5984/shopping
  {"ok":true}
List databases again
  ~ $ curl http://localhost:5984/_all_dbs
  ["shopping"]

Slide 9

Slide 9 text

Curl and Not-Curl
• Love curl? (https://curl.haxx.se/)
  • Try jq (https://stedolan.github.io/jq/)
• Hate curl? Try one of these:
  • http-console https://github.com/cloudhead/http-console
  • HTTPie https://httpie.org/
  • Postman https://www.getpostman.com/

Slide 10

Slide 10 text

Try http-console
Connect:
  $ http-console http://localhost:5984
Set database in path (to avoid typing it lots):
  http://localhost:5984> /shopping
Set JSON header:
  http://localhost:5984/shopping> Content-Type: application/json

Slide 11

Slide 11 text

Try http-console
Create item:
  http://localhost:5984/shopping> PUT /hat
  ... {"colour": "white"}
  HTTP/1.1 201 Created
  Location: http://localhost/shopping/hat
  Etag: "1-12cb52ad7565b6248c4dbe09f1377e2b"
  { ok: true, id: 'hat', rev: '1-12cb52ad7565b6248c4dbe09f1377e2b' }

Slide 12

Slide 12 text

Try http-console
Update item:
  http://localhost:5984/shopping> PUT /hat
  ... {"colour": "blue"}
  HTTP/1.1 409 Conflict
  { error: 'conflict', reason: 'Document update conflict.' }
To update, include the revision ...

Slide 13

Slide 13 text

Try http-console
Update item:
  http://localhost:5984/shopping> PUT /hat
  ... {"_rev": "1-12cb52ad7565b6248c4dbe09f1377e2b", "colour": "blue"}
  HTTP/1.1 201 Created
  Location: http://localhost/shopping/hat
  Etag: "2-15b01d3c5cd2ed475e6ce4cf84b51990"
  { ok: true, id: 'hat', rev: '2-15b01d3c5cd2ed475e6ce4cf84b51990' }
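The 409 and the successful retry in the http-console slides come from CouchDB's rule that an update must name the revision it is based on. Here is a minimal in-memory sketch of that rule. The names putDoc and simpleHash and the Map-based store are invented for illustration, and the revision suffix is a stand-in, not how CouchDB computes its hashes.

```javascript
// Hypothetical in-memory model of the revision check (not CouchDB code).

// Stand-in for the hash part of a revision; CouchDB derives its own differently.
function simpleHash(text) {
  let h = 5381;
  for (let i = 0; i < text.length; i++) {
    h = ((h * 33) ^ text.charCodeAt(i)) >>> 0;
  }
  return h.toString(16);
}

function putDoc(store, id, body) {
  const current = store.get(id);
  const { _rev, ...fields } = body;
  const currentRev = current ? current._rev : undefined;

  // The supplied _rev must match the stored one (or both must be absent).
  if (currentRev !== _rev) {
    return { status: 409, error: "conflict", reason: "Document update conflict." };
  }

  // Revisions look like "<generation>-<hash>"; bump the generation on each write.
  const generation = current ? parseInt(current._rev, 10) + 1 : 1;
  const rev = `${generation}-${simpleHash(JSON.stringify(fields))}`;
  store.set(id, { ...fields, _rev: rev });
  return { status: 201, ok: true, id, rev };
}

const store = new Map();
const created = putDoc(store, "hat", { colour: "white" });
const rejected = putDoc(store, "hat", { colour: "blue" });
const updated = putDoc(store, "hat", { _rev: created.rev, colour: "blue" });
console.log(created.status, rejected.status, updated.status);
```

The second call fails because it carries no _rev; the third succeeds because it passes back the revision returned by the first.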

Slide 14

Slide 14 text

Fauxton
Friendly web interface

Slide 15

Slide 15 text

Fauxton

Slide 16

Slide 16 text

Fauxton

Slide 17

Slide 17 text

Changes Feed
A feed containing all database changes.
  GET /_changes

Slide 18

Slide 18 text

Replication

Slide 19

Slide 19 text

Replication

Slide 20

Slide 20 text

Replication
• Replication can be in either direction, or both
• Can be one-off, or continuous
• Other CouchDB-compatible storage also exists
  • e.g. PouchDB, a JavaScript implementation

Slide 21

Slide 21 text

Conflicts
Change docs in both places, replicate again:
  87bf-bluemix.cloudant.com:443/shopping> GET /hat?conflicts=true
  {
    _id: 'hat',
    _rev: '4-ecbc38075f9a8535c123e523519613b9',
    colour: 'green',
    in_stock: 1,
    _conflicts: [ '3-0bb689d59034fb769d99dcf697ae2de7' ]
  }
CouchDB will always choose the same "winning" doc
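Every replica picks the same winner because the choice depends only on the revisions themselves. As I understand the CouchDB documentation, the revision with the longest history wins, and ties are broken by comparing the _rev strings in ASCII order, highest first. The sketch below assumes that rule, uses the generation number as a stand-in for history length, and ignores deleted revisions. parseRev and pickWinner are hypothetical helper names.

```javascript
// Sketch of deterministic winner selection, under the assumptions stated above.

function parseRev(rev) {
  const dash = rev.indexOf("-");
  return { generation: Number(rev.slice(0, dash)), hash: rev.slice(dash + 1) };
}

function pickWinner(revs) {
  const sorted = revs.slice().sort((a, b) => {
    const ra = parseRev(a);
    const rb = parseRev(b);
    // Higher generation first.
    if (ra.generation !== rb.generation) return rb.generation - ra.generation;
    // Same generation: the higher hash string first.
    if (ra.hash < rb.hash) return 1;
    if (ra.hash > rb.hash) return -1;
    return 0;
  });
  return sorted[0];
}

// The two revisions from the slide, in either order, give the same winner.
const revA = "4-ecbc38075f9a8535c123e523519613b9";
const revB = "3-0bb689d59034fb769d99dcf697ae2de7";
console.log(pickWinner([revA, revB]) === pickWinner([revB, revA]));
```

Because no timestamps or node identities are involved, two databases that hold the same set of revisions agree without talking to each other.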

Slide 22

Slide 22 text

Conflicts
Fetch the "losing" doc(s) with ?rev= parameter
  87bf-bluemix.cloudant.com:443/shopping> GET /hat?rev=3-0bb689d5903
  {
    _id: 'hat',
    _rev: '3-0bb689d59034fb769d99dcf697ae2de7',
    colour: 'blue',
    size: 'L'
  }
CouchDB doesn't store old revisions forever

Slide 23

Slide 23 text

Mango
(sample data: https://www.ibm.com/communities/analytics/watson-analytics-blog/sales-products-sample-data/)

Slide 24

Slide 24 text

Mango: CouchDB Queries
Mango is a MongoDB-like query language, useful for ad-hoc querying.
It is a JSON structure containing:
• Selector: the criteria to match records on
• Fields: which fields to return
• Sort: what order you'd like the results in (use with Skip)
• Limit: how many records (default = 25)

Slide 25

Slide 25 text

Mango: CouchDB Queries

Slide 26

Slide 26 text

Mango: Example Query
Use a query like this with the _find endpoint
  {
    "selector": {
      "Year": {"$eq": "2012"}
    },
    "fields": ["Quarter", "Product line"],
    "limit": 5
  }
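To make the three parts of the query concrete, here is a toy evaluator that applies a selector, fields and limit to an in-memory array. It is only a sketch of the semantics, not Mango itself: it understands $eq and plain-value equality and nothing else, and the sample documents and their Revenue values are made up. The default limit of 25 is the one quoted on the earlier slide.

```javascript
// Toy evaluator for a Mango-shaped query (supports $eq only).

function matches(doc, selector) {
  return Object.entries(selector).every(([field, condition]) => {
    const isEq =
      condition !== null && typeof condition === "object" && "$eq" in condition;
    const expected = isEq ? condition.$eq : condition;
    return doc[field] === expected;
  });
}

function find(docs, query) {
  const limit = query.limit ?? 25;
  return docs
    .filter((doc) => matches(doc, query.selector))
    .slice(0, limit)
    .map((doc) => {
      if (!query.fields) return doc;
      // Keep only the requested fields that the document actually has.
      const picked = {};
      for (const f of query.fields) {
        if (f in doc) picked[f] = doc[f];
      }
      return picked;
    });
}

// Invented sample documents shaped like the sales data set.
const sampleDocs = [
  { Year: "2012", Quarter: "Q1 2012", "Product line": "Mountaineering Equipment", Revenue: 10 },
  { Year: "2013", Quarter: "Q2 2013", "Product line": "Camping Equipment", Revenue: 20 },
  { Year: "2012", Quarter: "Q3 2012", "Product line": "Camping Equipment", Revenue: 30 },
];

const found = find(sampleDocs, {
  selector: { Year: { $eq: "2012" } },
  fields: ["Quarter", "Product line"],
  limit: 5,
});
console.log(found);
```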

Slide 27

Slide 27 text

Mango: Example Query
  $ curl -X POST -H Content-Type:application/json \
    http://localhost:5984/products/_find --data @mango.json
  {"warning":"no matching index found, create an index to optimize q
  "docs":[
  {"Quarter":"Q1 2012","Product line":"Mountaineering Equipment"},
  {"Quarter":"Q1 2012","Product line":"Mountaineering Equipment"},
  {"Quarter":"Q1 2012","Product line":"Mountaineering Equipment"},
  {"Quarter":"Q1 2012","Product line":"Mountaineering Equipment"},
  {"Quarter":"Q1 2012","Product line":"Mountaineering Equipment"}
  ]}

Slide 28

Slide 28 text

Mango: Indexes
Describe the index in JSON, then use the _index endpoint
  {
    "index": {
      "fields": ["Year"]
    },
    "name": "Year"
  }

Slide 29

Slide 29 text

Mango: Indexes
  $ curl -X POST -H Content-Type:application/json \
    http://localhost:5984/products/_index --data @index.json
  {
    "result": "created",
    "id": "_design/e9b54f2ac34b8823ccbe8aaf6f406d464f50f521",
    "name": "Year"
  }
Check which indexes are used by putting _explain where the _find normally goes!

Slide 30

Slide 30 text

Views

Slide 31

Slide 31 text

Views
• Written in JavaScript
• Use MapReduce
• The map results are stored
• Can be used either for filtering, or for aggregation
• Geospatial features also available (not in today's talk, sorry)

Slide 32

Slide 32 text

MapReduce Primer: Map
• Examine each document, "emit" 0+ key/value pairs
• Scales well because each document is independent
• To filter a collection of documents, use map step only
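The map step can be tried outside the database. This sketch runs a map function over an array and collects whatever it emits. runMap is a hypothetical helper, the documents are invented, and emit is passed in as an argument here, whereas inside CouchDB it is provided as a global. Sorting uses plain JavaScript comparison rather than CouchDB's collation rules.

```javascript
// Run a map function over in-memory documents and collect the emitted rows.

function runMap(docs, mapFn) {
  const rows = [];
  for (const doc of docs) {
    // Each document is processed on its own; it may emit zero or more rows.
    const emit = (key, value) => rows.push({ id: doc._id, key, value });
    mapFn(doc, emit);
  }
  // View rows come back ordered by key (simplified ordering here).
  rows.sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));
  return rows;
}

// Invented shopping documents.
const shoppingDocs = [
  { _id: "hat", colour: "blue" },
  { _id: "scarf", colour: "red" },
  { _id: "gloves", colour: "blue" },
];

// Map-only filtering: emit a row only for the documents we want to keep.
const blueRows = runMap(shoppingDocs, (doc, emit) => {
  if (doc.colour === "blue") {
    emit(doc._id, doc.colour);
  }
});
console.log(blueRows);
```

Since no document needs to know about any other, the loop body could run on any number of machines, which is the scaling point made on the slide.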

Slide 33

Slide 33 text

MapReduce Primer: Map

Slide 34

Slide 34 text

MapReduce Primer: Map

Slide 35

Slide 35 text

MapReduce Primer: Map

Slide 36

Slide 36 text

MapReduce Primer: Map

Slide 37

Slide 37 text

MapReduce Primer: Reduce

Slide 38

Slide 38 text

MapReduce Primer: Reduce
• "Reduce" values in batches with the same key
• CouchDB has useful built in functions for most things
• Use reduce step when you want aggregate data
• (SQL equivalent: a query with GROUP BY)
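As a rough picture of what a counting reduce does with grouped keys, the sketch below tallies emitted rows per key, much like SELECT key, COUNT(*) ... GROUP BY key. countByKey is a hypothetical helper and the rows are invented. CouchDB does this work incrementally and in batches, which this single pass does not model.

```javascript
// Count map rows per key, the in-memory analogue of a grouped count.

function countByKey(rows) {
  const counts = new Map();
  for (const row of rows) {
    // JSON-encode the key so array keys can be grouped as well as strings.
    const k = JSON.stringify(row.key);
    counts.set(k, (counts.get(k) || 0) + 1);
  }
  return [...counts].map(([k, value]) => ({ key: JSON.parse(k), value }));
}

// Invented map output: one row per document, keyed by year.
const yearRows = [
  { key: "2012", value: 1 },
  { key: "2012", value: 1 },
  { key: "2013", value: 1 },
];

const perYear = countByKey(yearRows);
console.log(perYear);
```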

Slide 39

Slide 39 text

Views Example

Slide 40

Slide 40 text

Views Example
  function (doc) {
    emit(doc.year, 1);
  }
Reduce: _count
... no really, this is a view
URL: http://localhost:5984/products/_design/myview/_view/year

Slide 41

Slide 41 text

Views Example
Find records sorted by year:
  ?include_docs=true&reduce=false
Find records with a specific year:
  ?key="2012"&include_docs=true&reduce=false
Count number of records per year (uses reduce):
  ?group=true
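Note that the key in the query string is JSON, which is why "2012" keeps its quotes. When building these URLs in code, the quotes also need percent-encoding. A small sketch, assuming the view URL from the previous slide; viewUrl is a hypothetical helper, not part of any CouchDB client library.

```javascript
// Build a view query URL, JSON-encoding the parameters that carry keys.

const JSON_PARAMS = ["key", "startkey", "endkey"];

function viewUrl(base, params) {
  const query = new URLSearchParams();
  for (const [name, value] of Object.entries(params)) {
    const encoded = JSON_PARAMS.includes(name) ? JSON.stringify(value) : String(value);
    query.set(name, encoded);
  }
  return `${base}?${query}`;
}

const viewBase = "http://localhost:5984/products/_design/myview/_view/year";

const byYear = viewUrl(viewBase, { key: "2012", include_docs: true, reduce: false });
const counts = viewUrl(viewBase, { group: true });
console.log(byYear);
console.log(counts);
```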

Slide 42

Slide 42 text

Composite Keys
We can emit an array as the key, e.g. turn the "Quarter" field from "Q3 2012" into ["2012", "3"]
  function (doc) {
    var year = doc.Quarter.substring(3);
    var quarter = doc.Quarter.substring(1, 2);
    emit([year, quarter], null);
  }

Slide 43

Slide 43 text

With ?group=true
  {"rows":[
  {"key":["2012","1"],"value":8617},
  {"key":["2012","2"],"value":8802},
  {"key":["2012","3"],"value":8548},
  {"key":["2012","4"],"value":8161},
  {"key":["2013","1"],"value":8446},
  {"key":["2013","2"],"value":8243},
  {"key":["2013","3"],"value":8466},
  {"key":["2013","4"],"value":7868},
  {"key":["2014","1"],"value":8383},
  {"key":["2014","2"],"value":8140},
  {"key":["2014","3"],"value":4801}
  ]}

Slide 44

Slide 44 text

With ?group_level=1
  {"rows":[
  {"key":["2012"],"value":34128},
  {"key":["2013"],"value":33023},
  {"key":["2014"],"value":21324}
  ]}
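The relationship between the two result sets can be checked by hand: group_level=1 keeps only the first element of each array key and combines the counts that now share a key. The sketch below does that with the quarterly rows from the ?group=true slide. regroup is a hypothetical helper, and summing is the right way to combine only because the values here are counts.

```javascript
// Collapse array keys to their first `level` elements and add up the values.

function regroup(rows, level) {
  const totals = new Map();
  for (const { key, value } of rows) {
    const prefix = JSON.stringify(key.slice(0, level));
    totals.set(prefix, (totals.get(prefix) || 0) + value);
  }
  return [...totals].map(([prefix, value]) => ({ key: JSON.parse(prefix), value }));
}

// Rows copied from the ?group=true output.
const quarterly = [
  { key: ["2012", "1"], value: 8617 },
  { key: ["2012", "2"], value: 8802 },
  { key: ["2012", "3"], value: 8548 },
  { key: ["2012", "4"], value: 8161 },
  { key: ["2013", "1"], value: 8446 },
  { key: ["2013", "2"], value: 8243 },
  { key: ["2013", "3"], value: 8466 },
  { key: ["2013", "4"], value: 7868 },
  { key: ["2014", "1"], value: 8383 },
  { key: ["2014", "2"], value: 8140 },
  { key: ["2014", "3"], value: 4801 },
];

const yearly = regroup(quarterly, 1);
console.log(JSON.stringify(yearly));
// → [{"key":["2012"],"value":34128},{"key":["2013"],"value":33023},{"key":["2014"],"value":21324}]
```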

Slide 45

Slide 45 text

PouchDB: CouchDB Compatible JavaScript

Slide 46

Slide 46 text

PouchDB
• https://pouchdb.com/
• A database that your client-side JavaScript can use
• Can also sync to CouchDB (-compatible) databases
• https://tinyurl.com/fauxton-pouchdb
• UI for PouchDB in your browser

Slide 47

Slide 47 text

PouchDB

Slide 48

Slide 48 text

PouchDB in Action
In index.html:
shopping.js is where my client-side JavaScript lives
Code is here: https://github.com/lornajane/robust-shopping-list

Slide 49

Slide 49 text

PouchDB: Replication
  var db = new PouchDB('shopping');
  var remoteDB = new PouchDB('http://localhost:5984/shopping');
  window.onload = function() {
    // Keep the local and remote databases in sync, retrying when offline
    db.sync(remoteDB, {
      live: true,
      retry: true
    }).on('change', function (change) {
      // Redraw the list whenever a change arrives
      return getItemList().then(function (contents) {
        document.getElementById('itemList').innerHTML = contents;
      });
    }).on('active', function (info) {
      // Redraw when replication resumes
      return getItemList().then(function (contents) {
        document.getElementById('itemList').innerHTML = contents;
      });
    });
  };

Slide 50

Slide 50 text

PouchDB in Action
OfflineFirst for robust web applications

Slide 51

Slide 51 text

CouchDB: Scalable NoSQL
Document database: and so much more

Slide 52

Slide 52 text

Questions?
Resources:
• https://lornajane.net
• https://couchdb.apache.org/
• https://pouchdb.com/
• https://offlinefirst.org
If you liked it, tweet: @CouchDB, @PouchDB and @IBMCloudant (and of course @austinphp and @lornajane)