mongodb
October 13, 2011

Learning by doing - running a mongoDB, the hard way - Sandro Grundmann

Mongo Munich 2011

If you can install it, you can operate it; everybody starts from scratch, so let's run it (24x7). And yes, the 500 GB MongoDB runs now. No, it wasn't as easy as I thought; I had a lot to learn and I'm still learning. It was nice to see the MongoDB versions developing and becoming the stable enterprise DB that I promised, just in time. And I will tell you the whole story ;-)

Transcript

  1. Learning by doing - running a mongoDB, the hard way

    10.10.2011 – 10gen Mongo Munich, Sandro Grundmann
  2. Agenda

    • About myself and *um
    • mongoDB
      (1) Install and Run
      (2) GridFS
      (3) replica sets
      (4) Scalability and performance
      (5) Monitoring
      What’s left?
    • Conclusion/Recommendation
  3. About myself

    • Dipl. Ing. for Information Technology (BA)
    • IT since 2002
      – IT Service & Support for Volkswagen AG IT, Wolfsburg, 5 years
      – Systems Engineer, Web Operations, Aperto AG Internet Agency, Berlin, 2 years
      – System Architect Web Operations at *um since 2010
    • Speaker at Chemnitzer Linux Tage
    • Trainer for
      – Linux basics
      – Linux Professional Institute Certificate (LPIC-1)
  4. About The unbelievable Machine Company

    • Managed hosting provider for business customers relying on their web infrastructure
    • Headquarters and data center in Berlin
    • Tier4 data center, >500 m²
    • Team of 20 colleagues
    • Founded in 2008 by Ravin Mehta
    • More than 80 business customers and about 300 webservers
      – Focused on operating high performance websites and running new web technologies
  5. Ready for our Workflow - *umServiceDesk

    [Diagram; labels: Website · Monitoring / Reporting 24x7 · ServiceDesk measures internal + external · Metric · Customer · Ticket System · *umServiceDesk, 24x7 · Change Management · Incident Management · Problem Management]
  6. *um

    • mongoDB
      – requirements
      (1) Install and Run
      (2) GridFS
      (3) replica sets
      (4) Scalability and performance
      (5) Monitoring
      What’s left?
    • Recommendation
  7. (Lesson 1) Install and Run

    • Don’t use the mongoDB package from your distribution
      – Ubuntu 10.04 LTS: mongoDB 1.2.2
      – CentOS/RHEL 5.6: none
    • Use the 10gen repository
      – Version 2 (in our setup Version 1.8)
      – Ubuntu
        • http://www.mongoDB.org/display/DOCS/Ubuntu+and+Debian+packages
      – Fedora/CentOS/RHEL
        • http://www.mongoDB.org/display/DOCS/CentOS+and+Fedora+Packages
  8. (Lesson 2) GridFS

    • When using GridFS, don’t use a webserver without raw access support, e.g. Apache
    • Instead use one of these two
      – Valid webservers, native GridFS access via plugin
        • NGINX
        • Lighttpd
    • Alternatives
      – Apache is not directly supported
        • But it can be used by passing requests on to another NGINX
    • Support GridFS via plugin in your application
      – Our client chose a custom webserver: Tornado (Python) via application
  9. (Lesson 3) replica sets

    • Learning about node types, about server states

    Node types
      Standard
        • Primary
        • Secondary
        • Quorum
      Passive
        • Secondary
        • Quorum
      Arbiter
        • Quorum
      Hidden
        • Since mongoDB 1.8
        • Hold data, not visible directly
        • Quorum ?

    Server states
      Primary
        • Master
      Secondary
        • Slave
      Recovering
        • Getting back in sync
        • After sync will become slave
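
The node types on this slide correspond to per-member options in a replica set configuration. The following is only a minimal sketch, assuming the member options `arbiterOnly`, `hidden` and `priority`; the helper `nodeType` and the example hosts are hypothetical and not part of the talk. The four members are there to show one of each type, not as a recommended layout.

```javascript
// Hypothetical helper: classify a replica set member document by its options.
// Assumes the member options arbiterOnly, hidden and priority.
function nodeType(member) {
  if (member.arbiterOnly) return 'arbiter';    // votes only, holds no data
  if (member.hidden) return 'hidden';          // holds data, not visible to clients
  if (member.priority === 0) return 'passive'; // holds data, never becomes primary
  return 'standard';                           // holds data, may become primary
}

// Example hosts are placeholders.
const members = [
  {_id: 0, host: 'db1.example:27017'},
  {_id: 1, host: 'db2.example:27017', priority: 0},
  {_id: 2, host: 'db3.example:27017', priority: 0, hidden: true},
  {_id: 3, host: 'db4.example:27017', arbiterOnly: true},
];

const types = members.map(nodeType);
console.log(types.join(', '));
```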
  10. (Lesson 3) Used replica set scheme

    • 2 data node replica set
    • added 3rd node for quorum
    • Types of nodes
      • Standard for 2 data nodes (every node can be used as master)
      • Arbiter for quorum node (no data, never master)
    • Config example for testing:

      config = {_id: 'testrepl', members: [
        {_id: 0, host: 'localhost:27017'},
        {_id: 1, host: 'localhost:27018'},
        {_id: 2, host: 'localhost:27019', arbiterOnly: true}]
      }
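
Why the third node is worth adding: a primary is only elected when a majority of the voting members agree. A rough sketch of that arithmetic, assuming one vote per member; the helper names are hypothetical.

```javascript
// Votes needed to elect a primary: a strict majority of all voting members.
function majority(votingMembers) {
  return Math.floor(votingMembers / 2) + 1;
}

// How many voting members may fail while a primary can still be elected.
function tolerableFailures(votingMembers) {
  return votingMembers - majority(votingMembers);
}

for (const n of [2, 3]) {
  console.log(`${n} voters: majority ${majority(n)}, tolerates ${tolerableFailures(n)} failure(s)`);
}
```

With two data nodes alone, losing either node leaves no majority; adding the arbiter as a third voter lets the set survive one failure.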
  11. (Lesson 4) Scalability and performance

    • Design the collection architecture from the start
      – Not every collection needs a replica set
      – Use various instances of mongoDB and replica sets
    • Plan the data growth with your client/developer
      – E.g. most setups start with a bunch of data on a virtual machine, but once it grows much bigger (we have 500 GB now), running the mongoDB instance on dedicated machines is strongly recommended
    • OS level settings
      – Filesystems: mongoDB benefits from xfs and ext4 features
      – RAID level: we recommend and use RAID10
  12. (Lesson 4) Scalability and performance

    • Maintenance work
      – Use mongoDB defragmentation tools periodically
        • V1.8: repair (blocks your instance, needs twice the data space capacity)
        • V2.0: compact job (runs online)
    • Backup/restore
      – Think about your backup/restore time, use journal and hidden node
      – Test your restore
      – Think about indexes
    • Scaling options
      – Vertical – use more powerful hardware
      – Horizontal
        • use data slaves for read-ops
        • use shards for write-ops
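
The disk space and restore time points can be estimated up front with simple arithmetic. This is a back-of-the-envelope sketch only: the factor of two comes from the slide, while the disk size and the sustained throughput of 100 MB/s are assumed example values, and both helper names are hypothetical.

```javascript
// Rule of thumb from the slide: repair needs about twice the data size on disk.
function repairFits(dataGB, diskGB) {
  return diskGB >= 2 * dataGB;
}

// Rough restore duration in hours for an assumed sustained throughput.
function restoreHours(dataGB, mbPerSecond) {
  const seconds = (dataGB * 1024) / mbPerSecond;
  return seconds / 3600;
}

// 500 GB of data on an (assumed) 800 GB volume, restored at an assumed 100 MB/s.
console.log(repairFits(500, 800));
console.log(restoreHours(500, 100).toFixed(1));
```

The estimate covers raw data transfer only; rebuilding indexes after a restore takes additional time, which is one reason to test the restore rather than trust the calculation.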
  13. (Lesson 5) 24x7 running & monitoring

    • For long term key performance indicators
      – munin with 10gen munin plugin
      – Standard munin plugins for server health

    Notice: you can also use the new 10gen MMS
  14. (Lesson 5) 24x7 running & monitoring

    • For health monitoring and alerting
      – Nagios with our own scripts
      – Currently we have no alerting for replica sets
  15. (Lesson 5) 24x7 running & monitoring

    • For troubleshooting
      – mongostat (on client, working at a servicedesk issue)
      – DBProfiler/SyslogNG for centralized logs
        • % Index misses
        • Authentication issues
        • Page faults
  16. What’s left?

    • Upgrade to mongoDB 2
    • Auto Sharding
      – Understanding MapReduce
      – Building a MapReduce scheme for customer data
    • Better backup/restore configuration
    • Improve monitoring
      – replica sets
      – monitor replication lag
      – Use native 10gen nagios plugin
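
Monitoring replication lag could look roughly like the check below. It is a sketch under assumptions, not the scripts used in the talk: it expects an object shaped like the output of `rs.status()` (fields `members[].stateStr` and `members[].optimeDate`), uses the standard Nagios plugin exit codes, and runs here on invented sample data. The function name and thresholds are hypothetical.

```javascript
// Nagios plugin exit codes.
const OK = 0, WARNING = 1, CRITICAL = 2, UNKNOWN = 3;

// status: object shaped like the output of rs.status() (assumed fields:
// members[].name, members[].stateStr, members[].optimeDate).
function checkReplicationLag(status, warnSeconds, critSeconds) {
  const primary = status.members.find(m => m.stateStr === 'PRIMARY');
  if (!primary) return {code: CRITICAL, message: 'no primary'};
  const secondaries = status.members.filter(m => m.stateStr === 'SECONDARY');
  if (secondaries.length === 0) return {code: UNKNOWN, message: 'no secondary to compare'};
  // Lag: how far each secondary's last applied operation trails the primary's.
  const lags = secondaries.map(m => (primary.optimeDate - m.optimeDate) / 1000);
  const worst = Math.max(...lags);
  const code = worst >= critSeconds ? CRITICAL : worst >= warnSeconds ? WARNING : OK;
  return {code, message: `max replication lag ${worst}s`};
}

// Sample data; hosts and timestamps are invented.
const sample = {members: [
  {name: 'db1:27017', stateStr: 'PRIMARY', optimeDate: new Date('2011-10-10T12:00:30Z')},
  {name: 'db2:27017', stateStr: 'SECONDARY', optimeDate: new Date('2011-10-10T12:00:00Z')},
  {name: 'db3:27017', stateStr: 'ARBITER'},
]};

const result = checkReplicationLag(sample, 10, 60);
console.log(result.message);
```

In a real plugin the returned code would be passed to the process exit status so that Nagios can alert on it.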
  17. Our conclusion/recommendation

    • mongoDB is no rocket science
    • Use the latest stable version (10gen repository)
    • Integrate it properly into your monitoring
    • Make a failover test/restore test
    • Don’t recommend NoSQL as a SQL replacement
    • The strengths of NoSQL are scalability and being lightweight; if you need this, use it!
    • LAMP is no longer the standard, don’t be afraid to try something new
  18. Comparison to other NoSQL products we use

    • mongoDB does fit well
      – More powerful replication settings than CouchDB
      – Horizontal scaling architectures learned from MySQL apply
      – Often a better solution than MySQL with Memcached setup
      – More lightweight and easier setup than hadoop
  19. Thank you!

    Sandro Grundmann
    [email protected]
    www.unbelievable-machine.com

    Want to join us? Yes, we hire! http://www.unbelievable-machine.com/um/jobs/
    Slides available at: http://www.unbelievable-machine.com/um/blog/