An introduction to MongoDB

Mike Frampton
September 12, 2013

A short introduction to MongoDB: what is it and how does it work?
How can it be used with Hadoop to process big data?

Transcript

  1. MongoDB
    • What is it?
    • Features
    • Tools
    • Use with Hadoop
    • Hadoop Tools
    www.semtech-solutions.co.nz [email protected]
  2. MongoDB – What is it?
    • Document-oriented NoSQL database
    • BSON ( Binary JSON ) data format
    • Released as open source / free
    • Can be used as a distributed database
    • Has load balancing
    • Has replication
    • Written in C++
    • Server licensed under the GNU AGPL v3.0, drivers under the Apache License ( as of 2013 )
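To make "Binary JSON" concrete, here is a minimal sketch of how a very small document could be serialized following the public BSON layout (a length prefix, typed elements, a terminating NUL byte). The function name `encode_bson` is hypothetical and only strings and 32-bit integers are handled; a real driver covers many more types.

```python
import struct


def encode_bson(document):
    """Encode a flat dict of str/int values as BSON (tiny subset, for illustration)."""
    body = b""
    for key, value in document.items():
        name = key.encode("utf-8") + b"\x00"  # element name is a NUL-terminated string
        if isinstance(value, str):
            data = value.encode("utf-8") + b"\x00"
            # type 0x02: int32 byte length (counting the trailing NUL), then the bytes
            body += b"\x02" + name + struct.pack("<i", len(data)) + data
        elif isinstance(value, int) and not isinstance(value, bool):
            # type 0x10: 32-bit little-endian signed integer
            body += b"\x10" + name + struct.pack("<i", value)
        else:
            raise TypeError(f"unsupported type in this sketch: {type(value).__name__}")
    # document = int32 total size + elements + terminating NUL
    total_size = 4 + len(body) + 1
    return struct.pack("<i", total_size) + body + b"\x00"


encoded = encode_bson({"hello": "world"})
print(encoded.hex())
```

The leading four bytes hold the total document size, which lets a reader skip over a document without parsing its fields.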
  3. MongoDB – Features
    • Queries
      – By field
      – By regular expression
      – User-defined JavaScript functions
      – By range
    • Indexes
      – Primary and secondary
      – Any document field
    • Replication
      – Master can replicate to multiple slaves
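The query styles listed above (by field, by regular expression, by range) can be sketched without a running server. The filter documents below use the shape of MongoDB query filters (`$regex`, `$gte`, `$lt`), but `matches` and `find` are hypothetical in-memory stand-ins written for this illustration, not driver functions, and the sample data is invented.

```python
import re

# Invented sample collection: a list of documents (plain dicts).
people = [
    {"name": "Alice", "age": 34},
    {"name": "Bob", "age": 17},
    {"name": "Anna", "age": 52},
]


def matches(doc, query):
    """Return True if doc satisfies every condition in the filter document."""
    for field, condition in query.items():
        value = doc.get(field)
        if isinstance(condition, dict):
            for op, operand in condition.items():
                if op == "$regex":
                    ok = isinstance(value, str) and re.search(operand, value) is not None
                elif op == "$gte":
                    ok = value is not None and value >= operand
                elif op == "$lt":
                    ok = value is not None and value < operand
                else:
                    raise ValueError(f"operator not covered by this sketch: {op}")
                if not ok:
                    return False
        elif value != condition:  # plain value means equality on that field
            return False
    return True


def find(collection, query):
    """Linear scan; a real database would consult an index where one exists."""
    return [doc for doc in collection if matches(doc, query)]


by_field = find(people, {"name": "Bob"})
by_regex = find(people, {"name": {"$regex": "^A"}})
by_range = find(people, {"age": {"$gte": 18, "$lt": 50}})
print(by_field, by_regex, by_range)
```

With a driver, the same filter dictionaries would be passed to the collection's query method instead of this helper.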
  4. MongoDB – Features
    • Load balancing
      – Data split across multiple shards
      – DB scales using shards
      – New machines can be added to a running database
    • Map reduce can be used for aggregation
    • File storage via GridFS
      – Load-balanced file system
      – File system with replication
      – Functions available for file manipulation
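The sharding and map reduce ideas above can be sketched together in plain Python. This is a simplification under stated assumptions: documents are routed by a hash of an assumed shard key (`customer`), whereas MongoDB assigns chunks of shard key values to shards; the shard count, sample data and all function names are invented for the illustration.

```python
import zlib
from collections import defaultdict

NUM_SHARDS = 3  # assumed shard count for the illustration


def shard_for(key, num_shards=NUM_SHARDS):
    """Pick a shard from a stable hash of the shard key value."""
    return zlib.crc32(str(key).encode("utf-8")) % num_shards


# Invented sample documents.
orders = [
    {"customer": "alice", "total": 20},
    {"customer": "bob", "total": 5},
    {"customer": "alice", "total": 7},
    {"customer": "carol", "total": 12},
]

# Split the data across shards by the assumed shard key "customer".
shards = defaultdict(list)
for order in orders:
    shards[shard_for(order["customer"])].append(order)


def map_order(doc):
    """Map step: emit (key, value) pairs for one document."""
    yield doc["customer"], doc["total"]


def reduce_totals(key, values):
    """Reduce step: combine all values emitted for one key."""
    return sum(values)


def map_reduce(documents, mapper, reducer):
    """Group mapped values by key, then reduce each group."""
    grouped = defaultdict(list)
    for doc in documents:
        for key, value in mapper(doc):
            grouped[key].append(value)
    return {key: reducer(key, values) for key, values in grouped.items()}


all_docs = [doc for shard in shards.values() for doc in shard]
totals = map_reduce(all_docs, map_order, reduce_totals)
print(totals)
```

Because every order for a given customer hashes to the same shard, adding a shard only changes where documents live, not the aggregated totals.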
  5. MongoDB – Tools
    • Mongo – a db access shell and admin tool
    • Mongostat – a status tool similar to vmstat
    • Mongotop – read/write activity per collection, like the Unix top command
    • Mongosniff – low-level traffic sniffing
    • Mongoimport – import JSON, CSV or TSV data
    • Mongoexport – export tool ( the counterpart of Mongoimport )
    • Mongodump – dump database contents
    • Mongorestore – reload database dumps
  6. MongoDB – With Hadoop
    • Hadoop connector available from GitHub
    • Allows Hadoop I/O
    • Compiles with the SBT build tool
    • Supports Hadoop
      – 0.20/0.20.x
      – 1.0/1.0.x
      – 1.1/1.1.x
      – 0.21/0.21.x
      – CDH3
      – CDH4
  7. MongoDB – Attributes
    The image on the left shows how Hadoop and its tools are used with MongoDB via a connector. The image on the right shows MongoDB attributes.
  8. MongoDB – Hadoop Tools
    • The Hadoop connector supports
      – Map Reduce
      – Pig
      – Hadoop streaming
      – Flume
      – Hive
      – Hive BSON file access
    • MongoDB can use HDFS for storage
  9. MongoDB – Architecture
    • A db server
      – Has many databases
    • A database
      – Has many collections
    • A collection
      – Has many documents
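The server, database, collection, document hierarchy can be pictured as nested containers. The sketch below models it with plain Python dicts and lists; the database names, collection names, fields and the helper `count_documents` are all invented for illustration and say nothing about how MongoDB stores data internally.

```python
# server -> databases -> collections -> documents (all names are invented)
server = {
    "shop": {                                   # a database
        "customers": [                          # a collection
            {"_id": 1, "name": "Alice"},        # documents in one collection
            {"_id": 2, "name": "Bob", "email": "bob@example.com"},
        ],
        "orders": [
            {"_id": 1, "customer_id": 1, "total": 20},
        ],
    },
    "logs": {
        "events": [],                           # an empty collection
    },
}


def count_documents(srv):
    """Count documents per (database, collection) pair."""
    return {
        (db_name, coll_name): len(docs)
        for db_name, collections in srv.items()
        for coll_name, docs in collections.items()
    }


database_names = sorted(server)
collection_names = sorted(server["shop"])
counts = count_documents(server)
print(database_names, collection_names, counts)
```

Note that the two customer documents carry different fields, which a document-oriented collection allows.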
  10. Contact Us
    • Feel free to contact us at
      – www.semtech-solutions.co.nz
      – [email protected]
    • We offer IT project consultancy
    • We are happy to hear about your problems
    • You pay only for the hours you need to solve your problems