MongoDB and REST APIs A Match Made in Heaven

Exposing MongoDB over the Internet through a RESTful API is becoming a common pattern, and for very good reasons. A REST API provides a layer of abstraction, isolation and validation above the actual datastore while providing access to all kinds of clients. At the same time MongoDB, with its BSON store, is well suited to serving data over the Internet. But designing and then building a robust and scalable REST API is not an easy task.

EVE is an open source framework allowing anyone to effortlessly expose MongoDB data sources over highly customizable, fully featured RESTful Web Services. Initially developed to solve our own internal use case, EVE has been getting a lot of traction since its open source release in 2012.

In this talk I will go through the reasons that make MongoDB a good match for most RESTful APIs. I will then show some prominent EVE features and illustrate how one can quickly and easily go online with a RESTful MongoDB front-end. I also plan to illustrate some common pitfalls on MongoDB+EVE/REST deployments.

Nicola Iarocci

October 16, 2015

Transcript

  1. Good Morning Percona Live Amsterdam 2015

  2. Who Am I fighting back against impostor syndrome

  3. Nicola Iarocci Co-founder and CTO at CIR 2000

  4. Nicola Iarocci MongoDB Master

  5. Nicola Iarocci Open Source junkie Eve • Cerberus • Events

    • Flask-Sentinel • Eve.NET • Etc.
  6. Nicola Iarocci Consultant Mongo • RESTful Services • Python •

    My Open Source Projects
  7. Nicola Iarocci CoderDojo Coding Clubs for Kids

  8. MongoDB & REST APIs A Match Made in Heaven

  9. Agenda

  10. Agenda 1. Our use case for a RESTful API

  11. Agenda 2. What is a RESTful API and why we

    need it
  12. Agenda 3. Why MongoDB is a good match for RESTful

    Services
  13. Agenda 4. Build and run a MongoDB RESTful Service from

    scratch, live on stage.
  14. Agenda 5. Stories from the field (if there’s any time

    left, which I doubt)
  15. The Case for RESTful Web APIs

  16. Amica 10 invoicing & accounting for Italian small businesses

  17. your old school desktop app

  18. what we started with Client LAN/SQL Database Desktop Application

  19. Goal A remote service that client apps can leverage to

    stay in sync with each other
  20. What we need #1 Must be accessible by any kind

    of client technology
  21. What we need #2 Abstract the data access layer so

    we can update/replace the engine at any time with no impact on clients
  22. What we need #3 An appropriate data storage engine

  23. What we need #4 Easily (re)deployable and scalable multi-micro-service architecture

  24. Where we want to go

    [Diagram. Labels: Clients (iOS, Android, Website, Desktop Client, ?, ?); API (RESTful Web API); Database (“Cloud”)]
  25. Constraints

    • minimum viable product first
    • add features over time
    • frequent database schema updates
    • easily scalable
    • avoid downtime as much as possible
    • cater upfront for a microservices architecture
  26. REST So What Is REST All About?

  27. REST is not a standard

  28. REST is not a protocol

  29. REST is an architectural style for networked applications

  30. Defines a set of simple principles loosely followed by most

    API implementations
  31. “resource” the source of specific information

  32. A web page is not a resource; rather, it is the representation

    of a resource
  33. “global permanent identifier” every resource is uniquely identified. Think of an

    HTTP URI.
  34. #3 standard interface used to exchange representations of resources (think

    the HTTP protocol)
  35. “a set of constraints” separation of concerns, stateless, cacheability, layered

    system, uniform interface, etc.
  36. The Web is built on REST and it is meant to

    be consumed by humans
  37. RESTful APIs are built on REST and are meant to

    be consumed by machines
  38. Representational State Transfer (REST) by Roy Thomas Fielding http://www.ics.uci.edu/~fielding/pubs/dissertation/rest_arch_style.htm

  39. Goals #1 and #2 are met REST layer allows all

    kinds of client technologies and abstracts the data away
  40. MongoDB and REST Or why we picked MongoDB for our

    REST API
  41. JSON transport Most REST services and clients produce and consume

    JSON
  42. JSON-style data store MongoDB stores data as Binary JSON

  43. JSON & RESTful API JSON accepted media type Client JSON

    (BSON) Mongo GET maybe we can push directly to client?
  44. JSON & RESTful API JSON accepted media type Client JSON

    (BSON) Mongo JSON subset of python dict (kinda) API GET almost.
  45. JSON & RESTful API JSON objects Client JSON (BSON) Mongo

    JSON/dict maps to python dict (validation layer) API POST also works when sending data to the database
  46. Similarity with RDBMS makes NoSQL easy to grasp (even for

    a sql head like me)
  47. Terminology

        RDBMS       Mongo
        Database    Database
        Table       Collection
        Row(s)      JSON Document
        Index       Index
        Join        Embedding & Linking
  48. What about Queries? Queries in MongoDB are represented as JSON-style

    objects db.things.find({x: 3, y: "foo"});
  49. Filtering and Sorting native Mongo query syntax Client JSON (BSON)

    Mongo (very) thin parsing & validation layer API Expose the native MongoDB syntax? ?where={"x": 3, "y": "foo"}
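
A minimal sketch of what the "(very) thin parsing & validation layer" on slide 49 could look like, assuming the API whitelists fields and operators before the query reaches MongoDB. The names `parse_where`, `ALLOWED_FIELDS` and `ALLOWED_OPERATORS` are hypothetical; Eve's own parser is not shown in the slides and may work differently.

```python
import json

# Hypothetical whitelists: what clients may filter on, and with which operators.
ALLOWED_FIELDS = {"x", "y"}
ALLOWED_OPERATORS = {"$gt", "$gte", "$lt", "$lte", "$in", "$ne"}


def parse_where(raw):
    """Parse a ?where= value into a dict that can be passed on as a query."""
    try:
        query = json.loads(raw)
    except ValueError as exc:  # json.JSONDecodeError is a ValueError
        raise ValueError("where must be valid JSON") from exc
    if not isinstance(query, dict):
        raise ValueError("where must be a JSON object")
    for field, condition in query.items():
        if field not in ALLOWED_FIELDS:
            raise ValueError("filtering on %r is not allowed" % field)
        if isinstance(condition, dict):
            for operator in condition:
                if operator not in ALLOWED_OPERATORS:
                    raise ValueError("operator %r is not allowed" % operator)
    return query


query = parse_where('{"x": 3, "y": "foo"}')
```

Because the query arrives as JSON and leaves as a dict, the layer only has to reject what it does not want; no translation between query languages is needed.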
  50. JSON all along the pipeline mapping to and from the

    database feels more natural
  51. No need for ORM No need to map objects to

    JSON and vice-versa (win!)
  52. schema-less dynamic documents allow for painless evolution

  53. REST is stateless MongoDB lacks transactions

  54. Ideal API Surface Mongo collection maps to API resource endpoint

    api.example.com/contacts Maps to a Mongo collection
  55. Ideal API Surface Mongo document maps to an API document

    endpoint api.example.com/contacts/4f46445fc88e201858000000 Maps to a collection ObjectID
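
The mapping on slides 54 and 55 can be sketched as a small routing function. This is an illustration only: `route` and `OBJECT_ID` are hypothetical names, and the sketch assumes document ids are 24-character lowercase hexadecimal ObjectId strings like the one on the slide.

```python
import re

# Assumption: an ObjectId is written as 24 lowercase hexadecimal characters.
OBJECT_ID = re.compile(r"^[0-9a-f]{24}$")


def route(path):
    """Map '/contacts' or '/contacts/<ObjectId>' to (collection, document_id)."""
    parts = [part for part in path.split("/") if part]
    if len(parts) == 1:
        # Resource endpoint: the whole collection.
        return parts[0], None
    if len(parts) == 2 and OBJECT_ID.match(parts[1]):
        # Document endpoint: one document in that collection.
        return parts[0], parts[1]
    raise ValueError("unsupported path: %r" % path)


resource = route("/contacts")
item = route("/contacts/4f46445fc88e201858000000")
```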
  56. Goal #3 is met An appropriate data storage engine: MongoDB

  57. Eve REST API for Humans™ Free and Open Source Powered

    by MongoDB and Good Intentions eve
  58. Philosophy effortlessly build and deploy highly customizable, fully featured RESTful

    Web Services
  59. Quickstart

  60. install $ pip install eve

  61. run.py

        from eve import Eve

        app = Eve()

        if __name__ == '__main__':
            app.run()
  62. settings.py

        # just a couple API endpoints with no custom schema or rules.
        DOMAIN = {
            'people': {},
            'works': {},
        }
  63. launch

        $ python run.py
         * Running on http://127.0.0.1:5000/

  64. enjoy HATEOAS AT WORK HERE

        $ curl http://localhost:5000/people
        {
          "_items": [],
          "_links": {
            "self": {"href": "people", "title": "people"},
            "parent": {"href": "/", "title": "home"}
          },
          "_meta": {"max_results": 25, "total": 0, "page": 1}
        }
  65. enjoy CLIENTS CAN EXPLORE THE API PROGRAMMATICALLY

        $ curl http://localhost:5000/people
        {
          "_items": [],
          "_links": {
            "self": {"href": "people", "title": "people"},
            "parent": {"href": "/", "title": "home"}
          },
          "_meta": {"max_results": 25, "total": 0, "page": 1}
        }
  66. enjoy AND EVENTUALLY FILL THE UI

        $ curl http://localhost:5000/people
        {
          "_items": [],
          "_links": {
            "self": {"href": "people", "title": "people"},
            "parent": {"href": "/", "title": "home"}
          },
          "_meta": {"max_results": 25, "total": 0, "page": 1}
        }
  67. enjoy PAGINATION DATA

        $ curl http://localhost:5000/people
        {
          "_items": [],
          "_links": {
            "self": {"href": "people", "title": "people"},
            "parent": {"href": "/", "title": "home"}
          },
          "_meta": {"max_results": 25, "total": 0, "page": 1}
        }
  68. enjoy EMPTY RESOURCE AS WE DID NOT CONNECT A DATASOURCE

        $ curl http://localhost:5000/people
        {
          "_items": [],
          "_links": {
            "self": {"href": "people", "title": "people"},
            "parent": {"href": "/", "title": "home"}
          },
          "_meta": {"max_results": 25, "total": 0, "page": 1}
        }
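
To illustrate how a client can explore the API programmatically, here is a small sketch that turns the `_links` section of the response above into absolute URLs. The helper name `resolve_links` is hypothetical; the response dictionary mirrors the slide output.

```python
from urllib.parse import urljoin


def resolve_links(base_url, response):
    """Return {relation: absolute URL} for every entry in the _links section."""
    return {
        relation: urljoin(base_url, link["href"])
        for relation, link in response.get("_links", {}).items()
    }


# Same payload as the curl output shown on the slides.
response = {
    "_items": [],
    "_links": {
        "self": {"href": "people", "title": "people"},
        "parent": {"href": "/", "title": "home"},
    },
    "_meta": {"max_results": 25, "total": 0, "page": 1},
}

links = resolve_links("http://localhost:5000/", response)
```

A client that only follows these relations does not need to hard-code any endpoint other than the API root.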
  69. settings.py

        # let's connect to a mongo instance
        MONGO_HOST = 'localhost'
        MONGO_PORT = 27017
        MONGO_USERNAME = 'user'
        MONGO_PASSWORD = 'user'
        MONGO_DBNAME = 'percona'
  70. settings.py UNIQUE STRING, MAX LENGTH 50

        # let's also add a few validation rules
        DOMAIN['people']['schema'] = {
            'name': {'type': 'string', 'maxlength': 50, 'unique': True},
            'email': {'type': 'string', 'regex': r'^\S+@\S+$'},
            'location': {
                'type': 'dict',
                'schema': {
                    'address': {'type': 'string'},
                    'city': {'type': 'string'}}},
            'born': {'type': 'datetime'}}
  71. settings.py ONLY ACCEPT VALID EMAILS

        # let's also add a few validation rules
        DOMAIN['people']['schema'] = {
            'name': {'type': 'string', 'maxlength': 50, 'unique': True},
            'email': {'type': 'string', 'regex': r'^\S+@\S+$'},
            'location': {
                'type': 'dict',
                'schema': {
                    'address': {'type': 'string'},
                    'city': {'type': 'string'}}},
            'born': {'type': 'datetime'}}
  72. settings.py THIS REGEX SUCKS! DON’T USE IN PRODUCTION

        # let's also add a few validation rules
        DOMAIN['people']['schema'] = {
            'name': {'type': 'string', 'maxlength': 50, 'unique': True},
            'email': {'type': 'string', 'regex': r'^\S+@\S+$'},
            'location': {
                'type': 'dict',
                'schema': {
                    'address': {'type': 'string'},
                    'city': {'type': 'string'}}},
            'born': {'type': 'datetime'}}
  73. settings.py SUBDOCUMENT WITH 2 STRING FIELDS

        # let's also add a few validation rules
        DOMAIN['people']['schema'] = {
            'name': {'type': 'string', 'maxlength': 50, 'unique': True},
            'email': {'type': 'string', 'regex': r'^\S+@\S+$'},
            'location': {
                'type': 'dict',
                'schema': {
                    'address': {'type': 'string'},
                    'city': {'type': 'string'}}},
            'born': {'type': 'datetime'}}
  74. settings.py ONLY ACCEPT DATETIME VALUES

        # let's also add a few validation rules
        DOMAIN['people']['schema'] = {
            'name': {'type': 'string', 'maxlength': 50, 'unique': True},
            'email': {'type': 'string', 'regex': r'^\S+@\S+$'},
            'location': {
                'type': 'dict',
                'schema': {
                    'address': {'type': 'string'},
                    'city': {'type': 'string'}}},
            'born': {'type': 'datetime'}}
  75. settings.py ADD/CREATE ONE OR MORE ITEMS (POST)

        # allow write access to API endpoints
        # (default is ['GET'] for both settings)

        # /people
        RESOURCE_METHODS = ['GET', 'POST']

        # /people/<id>
        ITEM_METHODS = ['GET', 'PATCH', 'PUT', 'DELETE']
  76. settings.py EDIT ITEM (PATCH)

        # allow write access to API endpoints
        # (default is ['GET'] for both settings)

        # /people
        RESOURCE_METHODS = ['GET', 'POST']

        # /people/<id>
        ITEM_METHODS = ['GET', 'PATCH', 'PUT', 'DELETE']
  77. settings.py REPLACE ITEM (PUT)

        # allow write access to API endpoints
        # (default is ['GET'] for both settings)

        # /people
        RESOURCE_METHODS = ['GET', 'POST']

        # /people/<id>
        ITEM_METHODS = ['GET', 'PATCH', 'PUT', 'DELETE']
  78. settings.py YOU GUESSED IT (DELETE)

        # allow write access to API endpoints
        # (default is ['GET'] for both settings)

        # /people
        RESOURCE_METHODS = ['GET', 'POST']

        # /people/<id>
        ITEM_METHODS = ['GET', 'PATCH', 'PUT', 'DELETE']
  79. settings.py CLIENT UI

        # a few optional config options
        DOMAIN['people'].update({
            'item_title': 'person',
            'cache_control': 'max-age=10,must-revalidate',
            'cache_expires': 10,
            'additional_lookup': {
                'url': r'regex("[\w]+")',
                'field': 'name'}})
  80. settings.py CLIENT CACHE OPTIONS

        # a few optional config options
        DOMAIN['people'].update({
            'item_title': 'person',
            'cache_control': 'max-age=10,must-revalidate',
            'cache_expires': 10,
            'additional_lookup': {
                'url': r'regex("[\w]+")',
                'field': 'name'}})
  81. settings.py ALTERNATE ENDPOINT

        # a few optional config options
        DOMAIN['people'].update({
            'item_title': 'person',
            'cache_control': 'max-age=10,must-revalidate',
            'cache_expires': 10,
            'additional_lookup': {
                'url': r'regex("[\w]+")',
                'field': 'name'}})
  82. Features Overview We are going to focus on MongoDB power-ups

  83. full range of CRUD operations Create/POST Read/GET Update/PATCH and Replace/PUT

    Delete/DELETE
  84. filters, mongo style ?where={"name": "john"}

  85. filters, the python way ?where=name=="john"

  86. sorting ?sort=city,-name SORT BY CITY, THEN NAME DESCENDING

  87. projections ?projection={"avatar": 0} RETURN ALL FIELDS BUT ‘AVATAR’

  88. projections ?projection={"lastname": 1} ONLY RETURN ‘LASTNAME’

  89. pagination ?max_results=20&page=2 MAX 20 RESULTS PER PAGE; PAGE 2
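
The query parameters from slides 84 to 89 can be combined in a single request. The sketch below builds such a URL with the standard library only; `build_url` is a hypothetical helper, not part of Eve, and the endpoint is the local one used in the quickstart.

```python
import json
from urllib.parse import urlencode


def build_url(base, where=None, sort=None, projection=None,
              max_results=None, page=None):
    """Compose a resource URL from the query options shown on the slides."""
    params = {}
    if where is not None:
        params["where"] = json.dumps(where)        # mongo-style filter
    if sort is not None:
        params["sort"] = sort                      # e.g. "city,-name"
    if projection is not None:
        params["projection"] = json.dumps(projection)
    if max_results is not None:
        params["max_results"] = max_results
    if page is not None:
        params["page"] = page
    if not params:
        return base
    # urlencode percent-escapes the JSON so it survives the trip in a URL.
    return base + "?" + urlencode(params)


url = build_url(
    "http://localhost:5000/people",
    where={"name": "john"},
    sort="city,-name",
    projection={"avatar": 0},
    max_results=20,
    page=2,
)
```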

  90. GeoJSON support and validation for all GeoJSON types Point, LineString,

    Polygon, MultiPoint, MultiLineString, MultiPolygon, GeometryCollection
  91. document embedding joins

  92. standard request RAW FOREIGN KEY (DEFAULT)

        $ curl example.com/works/<id>
        {
          "title": "Book Title",
          "description": "book description",
          "author": "52da465a5610320002660f94"
        }
  93. request an embedded document REQUEST EMBEDDED AUTHOR

        $ curl example.com/works/<id>?embedded={"author": 1}
        {
          "title": "Book Title",
          "description": "book description",
          "author": {
            "firstname": "Mark",
            "lastname": "Green"
          }
        }
  94. embedded document EMBEDDED DOCUMENT

        $ curl example.com/works/<id>?embedded={"author": 1}
        {
          "title": "Book Title",
          "description": "book description",
          "author": {
            "firstname": "Mark",
            "lastname": "Green"
          }
        }
        # embedding is configurable on a per-field basis and
        # can be pre-set by the API maintainer
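
Conceptually, document embedding replaces a stored foreign key with the document it points to. The sketch below shows that idea on plain dictionaries; `embed` and the in-memory `AUTHORS` table are hypothetical stand-ins for Eve's data layer, which the slides do not show.

```python
def embed(document, embedded, lookup):
    """Return a copy of document with the requested references resolved.

    embedded mirrors the ?embedded={"author": 1} query parameter;
    lookup(field, key) stands in for a database read of the referenced document.
    """
    result = dict(document)
    for field, enabled in embedded.items():
        if enabled and field in result:
            result[field] = lookup(field, result[field])
    return result


# In-memory stand-in for the authors collection.
AUTHORS = {
    "52da465a5610320002660f94": {"firstname": "Mark", "lastname": "Green"},
}

work = {
    "title": "Book Title",
    "description": "book description",
    "author": "52da465a5610320002660f94",
}

embedded_work = embed(work, {"author": 1}, lambda field, key: AUTHORS[key])
```

The stored document keeps the raw key; only the response is expanded, which is why the default output on slide 92 still shows the ObjectId string.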
  95. bulk inserts insert multiple documents with a single request

  96. request

        $ curl -d '[
            {"firstname": "barack", "lastname": "obama"},
            {"firstname": "mitt", "lastname": "romney"}
          ]' -H 'Content-Type: application/json' <url>
  97. response COHERENCE MODE OFF: ONLY META FIELDS ARE RETURNED

        [
          {
            "_status": "OK",
            "_updated": "Thu, 22 Nov 2012 15:22:27 GMT",
            "_id": "50ae43339fa12500024def5b",
            "_etag": "749093d334ebd05cf7f2b7dbfb7868605578db2c",
            "_links": {"self": {"href": "<url>", "title": "person"}}
          },
          {
            "_status": "OK",
            "_updated": "Thu, 22 Nov 2012 15:22:27 GMT",
            "_id": "50ae43339fa12500024def5c",
            "_etag": "62d356f623c7d9dc864ffa5facc47dced4ba6907",
            "_links": {"self": {"href": "<url>", "title": "person"}}
          }
        ]
  98. response COHERENCE MODE ON: ALL FIELDS RETURNED

        [
          {
            "_status": "OK",
            "_updated": "Thu, 22 Nov 2012 15:22:27 GMT",
            "_id": "50ae43339fa12500024def5b",
            "_etag": "749093d334ebd05cf7f2b7dbfb7868605578db2c",
            "_links": {"self": {"href": "<url>", "title": "person"}},
            "firstname": "barack",
            "lastname": "obama"
          },
          {
            "_status": "OK",
            "_updated": "Thu, 22 Nov 2012 15:22:27 GMT",
            "_id": "50ae43339fa12500024def5c",
            "_etag": "62d356f623c7d9dc864ffa5facc47dced4ba6907",
            "_links": {"self": {"href": "<url>", "title": "person"}},
            "firstname": "mitt",
            "lastname": "romney"
          }
        ]
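
The same bulk insert can be prepared from Python. This sketch only builds the request object with the standard library and does not send it; `bulk_insert_request` is a hypothetical helper and the URL is the local quickstart endpoint, assumed here for illustration.

```python
import json
from urllib.request import Request


def bulk_insert_request(url, documents):
    """Build a POST request that carries a JSON list of documents."""
    body = json.dumps(documents).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


people = [
    {"firstname": "barack", "lastname": "obama"},
    {"firstname": "mitt", "lastname": "romney"},
]

request = bulk_insert_request("http://localhost:5000/people", people)
# urllib.request.urlopen(request) would send it to a running API.
```

Sending a list instead of a single object is what turns an ordinary insert into a bulk insert; the response is then a list with one status entry per document, as on slides 97 and 98.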
  99. document versioning ?version=3 ?version=all ?version=diffs

  100. soft deletes preserve deleted documents and retrieve them with a

    simple ?show_deleted
  101. file storage files are stored in GridFS by default; customizable

    for S3, file system, etc.
  102. data validation powerful and extensible data validation powered by Cerberus

  103. rich validation grammar referential integrity / unique values / defaults

    / regex / etc.
  104. custom data types create your own data types to validate

    against
  105. custom validation logic extend the validation system to cater for

    specific use cases
  106. multi database serve endpoints and/or users from dedicated mongos

  107. index maintenance define sets of resource indexes to be (re)created

    at launch supports sparse, geo2d and background indexes
  108. event hooks plug custom actions in the request/response cycle

  109. run.py CALLBACK FUNCTION

        from eve import Eve

        app = Eve()

        def percona(resource, response):
            documents = response['_items']
            for document in documents:
                document['percona'] = 'is so cool!'

        if __name__ == '__main__':
            app.on_fetched_resource += percona
            app.run()
  110. run.py LOOP ON ALL DOCUMENTS

        from eve import Eve

        app = Eve()

        def percona(resource, response):
            documents = response['_items']
            for document in documents:
                document['percona'] = 'is so cool!'

        if __name__ == '__main__':
            app.on_fetched_resource += percona
            app.run()
  111. run.py INJECT FIELD

        from eve import Eve

        app = Eve()

        def percona(resource, response):
            documents = response['_items']
            for document in documents:
                document['percona'] = 'is so cool!'

        if __name__ == '__main__':
            app.on_fetched_resource += percona
            app.run()
  112. run.py ATTACH CALLBACK TO EVENT HOOK

        from eve import Eve

        app = Eve()

        def percona(resource, response):
            documents = response['_items']
            for document in documents:
                document['percona'] = 'is so cool!'

        if __name__ == '__main__':
            app.on_fetched_resource += percona
            app.run()
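
To show why a callback can be attached with `+=`, here is a minimal stand-alone sketch of an event hook. The `EventHook` class is hypothetical and written only for illustration; it is not Eve's implementation, which the slides do not show.

```python
class EventHook:
    """A tiny event slot: handlers are attached with += and fired by calling."""

    def __init__(self):
        self._handlers = []

    def __iadd__(self, handler):
        # Supports the `hook += callback` syntax used on the slides.
        self._handlers.append(handler)
        return self

    def __call__(self, *args, **kwargs):
        # Fire every attached handler with the same arguments.
        for handler in self._handlers:
            handler(*args, **kwargs)


def percona(resource, response):
    # Same callback as on the slides: inject a field into every document.
    for document in response["_items"]:
        document["percona"] = "is so cool!"


on_fetched_resource = EventHook()
on_fetched_resource += percona

response = {"_items": [{"name": "john"}, {"name": "jane"}]}
on_fetched_resource("people", response)
```

Because the callback mutates the response in place, whatever it injects ends up in the payload that is sent back to the client.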
  113. rate limiting powered by Redis

  114. settings.py ONE REQUEST PER MINUTE (CLIENT)

        # Rate limit on GET requests:
        RATE_LIMIT_GET = (1, 60)
  115. rate limited request CURRENT LIMIT (X-RateLimit-Limit)

        $ curl -i <url>
        HTTP/1.1 200 OK
        X-RateLimit-Limit: 1
        X-RateLimit-Remaining: 0
        X-RateLimit-Reset: 1390486659
  116. rate limited request REMAINING (X-RateLimit-Remaining)

        $ curl -i <url>
        HTTP/1.1 200 OK
        X-RateLimit-Limit: 1
        X-RateLimit-Remaining: 0
        X-RateLimit-Reset: 1390486659
  117. rate limited request TIME TO RESET (X-RateLimit-Reset)

        $ curl -i <url>
        HTTP/1.1 200 OK
        X-RateLimit-Limit: 1
        X-RateLimit-Remaining: 0
        X-RateLimit-Reset: 1390486659
  118. rate limited request OUCH!

        $ curl -i <url>
        HTTP/1.1 429 TOO MANY REQUESTS
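
The behaviour on slides 114 to 118 can be approximated with a fixed-window counter. This is a sketch under stated assumptions: it keeps counters in a Python dict instead of Redis, the class name `FixedWindowLimiter` is hypothetical, and returning the rate-limit headers on the 429 response as well is a choice made here (slide 118 only shows the status line).

```python
import time


class FixedWindowLimiter:
    """Allow `limit` requests per `period` seconds for each client."""

    def __init__(self, limit, period, clock=time.time):
        self.limit = limit
        self.period = period
        self.clock = clock          # injectable so the limiter can be tested
        self._windows = {}          # client -> (window reset time, hits used)

    def hit(self, client):
        """Register a request and return (status code, rate-limit headers)."""
        now = self.clock()
        reset, used = self._windows.get(client, (0, 0))
        if now >= reset:
            # The previous window has expired: open a new one.
            reset, used = now + self.period, 0
        if used >= self.limit:
            status = 429
        else:
            status = 200
            used += 1
        self._windows[client] = (reset, used)
        headers = {
            "X-RateLimit-Limit": self.limit,
            "X-RateLimit-Remaining": self.limit - used,
            "X-RateLimit-Reset": int(reset),
        }
        return status, headers


# Mirrors RATE_LIMIT_GET = (1, 60): one request per minute per client.
limiter = FixedWindowLimiter(limit=1, period=60)
first_status, first_headers = limiter.hit("client-a")
second_status, _ = limiter.hit("client-a")
```

A shared store such as Redis plays the role of `_windows` when several API instances must enforce the same limit.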
  119. operations log log all operations and eventually expose a dedicated

    endpoint
  120. HATEOAS Hypermedia As The Engine Of Application State

  121. XML support

        $ curl -H "Accept: application/xml" -i http://example.com/people
        <resource href="people" title="people">
          <link rel="parent" href="/" title="home" />
          <resource href="people/55fc149138345b0880f07e3d" title="person">
            <_created>Fri, 18 Sep 2015 13:41:37 GMT</_created>
            <_etag>5d057712ce792ebb4100b96aa98bfe9b6693c07b</_etag>
            <_id>55fc149138345b0880f07e3d</_id>
            <_updated>Fri, 18 Sep 2015 13:41:37 GMT</_updated>
            <email>johndoe@gmail.com</email>
            <name>john</name>
          </resource>
        </resource>
  122. conditional requests allow clients to only request non-cached content

  123. If-Modified-Since ONLY RETURN DOCUMENT IF MODIFIED SINCE

        If-Modified-Since: Wed, 05 Dec 2012 09:53:07 GMT
  124. If-None-Match ONLY RETURN DOCUMENT IF ETAG CHANGED

        If-None-Match: 1234567890123456789012345678901234567890
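
A conditional read boils down to one comparison on the server. The sketch below illustrates the If-None-Match case only; `conditional_get` is a hypothetical function, and the 304 Not Modified status is standard HTTP behaviour rather than something shown on the slides.

```python
def conditional_get(document, etag, if_none_match=None):
    """Return (status, body) for a GET that may carry an If-None-Match header."""
    if if_none_match is not None and if_none_match == etag:
        # The client's cached copy is still current: send no body.
        return 304, None
    return 200, document


person = {"name": "john"}
current_etag = "5d057712ce792ebb4100b96aa98bfe9b6693c07b"

fresh = conditional_get(person, current_etag)
cached = conditional_get(person, current_etag, if_none_match=current_etag)
stale = conditional_get(
    person, current_etag,
    if_none_match="1234567890123456789012345678901234567890",
)
```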

  125. data integrity and concurrency no overwriting documents with obsolete versions

  126. missing ETag NO ETAG REJECTED

        # fails, as there is no ETag included with request
        $ curl \
            -X PATCH \
            -i http://example.com/people/521d6840c437dc0002d1203c \
            -H "Content-Type: application/json" \
            -d '{"name": "ronald"}'
        HTTP/1.1 403 FORBIDDEN
  127. ETag mismatch ETAG MISMATCH REJECTED

        # fails, as ETag does not match with server
        $ curl \
            -X PATCH \
            -i http://example.com/people/521d6840c437dc0002d1203c \
            -H "If-Match: 1234567890123456789012345678901234567890" \
            -H "Content-Type: application/json" \
            -d '{"firstname": "ronald"}'
        HTTP/1.1 412 PRECONDITION FAILED
  128. valid ETag ETAG MATCH ACCEPTED

        # success at last! ETag matches with server
        $ curl \
            -X PATCH \
            -i http://example.com/people/50adfa4038345b1049c88a37 \
            -H "If-Match: 80b81f314712932a4d4ea75ab0b76a4eea613012" \
            -H "Content-Type: application/json" \
            -d '{"firstname": "ronald"}'
        HTTP/1.1 200 OK
        # Like most features, ETags can be disabled.
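
The three outcomes on slides 126 to 128 can be sketched as a single check. The status codes follow the slides; everything else is an assumption made for illustration: `compute_etag` and `check_write` are hypothetical names, and hashing the sorted JSON with SHA-1 is just one plausible way to derive an ETag, not necessarily how Eve does it.

```python
import hashlib
import json


def compute_etag(document):
    """Derive a stable fingerprint from the document's content (assumed scheme)."""
    payload = json.dumps(document, sort_keys=True).encode("utf-8")
    return hashlib.sha1(payload).hexdigest()


def check_write(stored, if_match):
    """Return the status a write would get, given the If-Match header value."""
    if if_match is None:
        return 403      # no ETag supplied: rejected (slide 126)
    if if_match != compute_etag(stored):
        return 412      # client edited an obsolete version (slide 127)
    return 200          # ETag matches the stored document (slide 128)


stored = {"firstname": "ron", "lastname": "doe"}

missing = check_write(stored, None)
mismatch = check_write(stored, "1234567890123456789012345678901234567890")
accepted = check_write(stored, compute_etag(stored))
```

Since the ETag changes whenever the document changes, a client holding an old copy can never silently overwrite a newer one.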
  129. custom data layers build your own data layer

  130. authentication and authorization basic / token / hmac / BYO

    / OAuth2 / you name it
  131. and (a lot) more CORS, cache control, API versioning, JSONP,

    Etc.
  132. vibrant community 90+ contributors / 350+ forks / 2500+ GitHub

    stargazers
  133. Eve-Docs automatic documentation for Eve APIs in both HTML and

    JSON CHARLES FLYNN
  134. Eve-Docs

  135. Eve-Elastic Elasticsearch data layer for your Eve-powered API PETR JASEK

  136. Eve-SQLAlchemy SQL data layer for Eve-powered APIs PETR JASEK

  137. Eve-Mongoengine enables MongoEngine data models to be used as Eve

    schema STANISLAV HELLER
  138. Eve.NET HTTP and REST client for Eve-powered APIs PETR JASEK

  139. Eve-OAuth2 leverage Flask-Sentinel to protect your API endpoints with OAuth2

    THOMAS SILEO
  140. REST Layer “golang REST API framework heavily inspired by Python

    Eve” THOMAS SILEO
  141. Goal #4 achieved easy to set up, launch and scale

    up; also a good fit for microservices infrastructure
  142. A look back to initial draft

    [Diagram. Labels: Clients (iOS, Android, Website, Desktop Client, ?, ?); API (RESTful Web API); Database (“Cloud”)]
  143. Clients Multiple MongoDBs Database Adam eve instances API iOS Android

    Website Desktop Client what we have in production
  144. stories from the trenches

    #1. when too much (magic) is too much
    #2. sometimes you don’t want to debug
    #3. so how do I log in to this thing?
  145. Take Aways

  146. Enhance MongoDB with powerful features on top of native engine

    validation, document embedding (joins), referential integrity, document versioning, transformations, rate limiting, etc.
  147. Consider the REST layer as an ideal data access layer

    the story of pymongo 3.0 breaking changes mongo or sql or elastic or …
  148. Consider the REST layer as an ideal data access layer

    the story of Adam dashboards
  149. Consider Microservices leverage Eve features

    • create a network of isolated yet standardized services
    • each service has a dedicated role
    • runs as an eve instance, with its own configuration
    • has its own database(s)
    • callbacks route traffic between services
  150. Clients User-reserved MongoDBs eve-multidb Data Auth eve-oauth2 (flask-sentinel) API iOS

    Android Website Desktop Client Adam 1 Adam eve instance Redis auth tokens, rate limiting Auth/Users MongoDB
  151. Clients service- reserved MongoDBs Data Auth eve-oauth2, flask-sentinel API iOS

    Android Website Desktop Client Adam 2 Adam eve instance Redis auth tokens, rate limiting Services eve instances Auth/Users MongoDB very simplified
  152. thanks nicolaiarocci python-eve.org eve