Save The World And Money With MongoDB Atlas Data Lake

Joe Karlsson
November 02, 2020

Data centers are expensive, and they are not great for the environment either. By 2040, storing digital data is set to create 14% of the world’s greenhouse gas emissions. As a developer, you probably work with a lot of data, and your clusters balloon and become more expensive every day. Now is the time to be a hero: save the world and your wallet.

In this live coding session, I will show you how to archive your cold MongoDB data automatically to an AWS S3 bucket using Serverless Triggers. I will also demonstrate how to keep querying this archived data using MongoDB Data Lake with zero downtime.

You will walk away from this session with a clear understanding of data lakes and their features and capabilities. Join this session and be equipped to save the world.

Transcript

  1. Save The World with MongoDB Data Lake. Joe Karlsson | Developer Advocate | @JoeKarlsson1. joekarlsson.dev/MDB-Data-Lake
  2. { name: “Joe Karlsson”, company: “MongoDB”, title: [ “Developer Advocate”, “Software Engineer” ], twitter: “@JoeKarlsson1”, twitch: “joe_karlsson”, tiktok: “joekarlsson”, website: “joekarlsson.com”, opinions: “my own”, links: “joekarlsson.dev/MDB-Data-Lake” } joekarlsson.dev/iot-kitty-bf04b
  3. Agenda: Online Archive. Demo: Archive with $out. Demo: Archive with a Realm function. MongoDB Atlas Data Lake.
  4. Atlas Data Lake - Features and Benefits: Query your S3 and MongoDB Atlas data in-place and in its native format using the MongoDB Query Language (MQL). Work with rich data easily & intuitively. Leverage a serverless & scalable query service. Easy to use with your favorite tools. Integrated with the MongoDB Cloud Platform. Eliminate cost & complexity of data movement.
  5. Archiving with pymongo & $out

        pipeline_s3 = [
            {'$match': {'date': {'$gte': date_start, '$lt': date_stop}}},
            {
                '$out': {
                    's3': {
                        'bucket': 'cold-data-mongodb',
                        'region': 'eu-west-1',
                        'filename': date_start.isoformat('T', 'milliseconds') + 'Z-' + date_stop.isoformat('T', 'milliseconds') + 'Z',
                        'format': {'name': 'json', 'maxFileSize': '200MiB'}
                    }
                }
            }
        ]
        iot_data_lake.aggregate(pipeline_s3)

  6. To Sum Up: We can archive in S3. Saving the World. Saving a lot of money. We have access to ALL the data.
  7. Want $100 in FREE MongoDB Atlas credits? Use code JoeK100. joekarlsson.dev/free-atlas-credits
  8. Additional Resources: joekarlsson.dev/MDB-Data-Lake
    [Docs] MongoDB Atlas Data Lake Documentation: https://docs.mongodb.com/datalake/
    [Docs] Archiving a MongoDB Cluster: https://docs.atlas.mongodb.com/online-archive/manage-online-archive/
    [DevHub Post] MongoDB Data Lake Setup Tutorial: https://developer.mongodb.com/how-to/atlas-data-lake-setup
    [GitHub] Save The World And Money With MongoDB Data Lake: https://github.com/JoeKarlsson/mongodb-datalake-save-the-world
    [GitHub] MongoDB IoT Sample Data Generator: https://github.com/joekarlsson/IoT-generator-mongodb
  9. { name: “Joe Karlsson”, company: “MongoDB”, title: [ “Developer Advocate”, “Software Engineer” ], twitter: “@JoeKarlsson1”, twitch: “joe_karlsson”, tiktok: “joekarlsson”, website: “joekarlsson.com”, links: “joekarlsson.dev/MDB-Data-Lake” }
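
The `$out`-to-S3 pipeline from slide 5 can be wrapped in a small helper so that each archive window is built the same way. The following is a minimal sketch, not the code from the talk: the function names `build_archive_pipeline` and `archive_window` are hypothetical, the bucket and region defaults are simply the values shown on the slide, and the datetimes are assumed to be naive values already in UTC (otherwise the appended `Z` would be misleading). The `collection` argument is assumed to be a pymongo collection opened through an Atlas Data Lake connection, since `$out` to S3 is a Data Lake feature rather than a regular cluster stage.

```python
from datetime import datetime, timedelta


def build_archive_pipeline(date_start, date_stop,
                           bucket="cold-data-mongodb", region="eu-west-1"):
    """Build the $match + $out stages for one [date_start, date_stop) window.

    Hypothetical helper. Both datetimes are assumed to be naive and in UTC,
    so appending 'Z' to the ISO string is accurate.
    """
    # Name the S3 object after the window it covers, as on slide 5.
    filename = (
        date_start.isoformat("T", "milliseconds") + "Z-"
        + date_stop.isoformat("T", "milliseconds") + "Z"
    )
    return [
        # Select only documents whose 'date' falls inside the window.
        {"$match": {"date": {"$gte": date_start, "$lt": date_stop}}},
        # Write the matched documents to S3 as JSON files of at most 200 MiB.
        {
            "$out": {
                "s3": {
                    "bucket": bucket,
                    "region": region,
                    "filename": filename,
                    "format": {"name": "json", "maxFileSize": "200MiB"},
                }
            }
        },
    ]


def archive_window(collection, date_start, date_stop):
    """Run the archive pipeline against `collection` and return the pipeline.

    Assumption: `collection` behaves like a pymongo Collection reached via an
    Atlas Data Lake connection string; nothing here connects on its own.
    """
    pipeline = build_archive_pipeline(date_start, date_stop)
    collection.aggregate(pipeline)
    return pipeline
```

To archive one day, call `archive_window(iot_data_lake, start, start + timedelta(days=1))` with `start` set to the beginning of that day. Keeping the pipeline builder separate from the call to `aggregate` makes it possible to inspect or test the stages without touching a live deployment.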