Nathen Harvey
December 09, 2011

MongoDB @ CustomInk: Adoption, Operations, & Community

Presentation given at MongoSV, 2011.

Transcript

  1. Hello!
     • Nathen Harvey
       – Web Operations at CustomInk
       – MongoDB Master
       – [email protected]
       – @nathenharvey
     • Organize MongoDC Meetups
       – @MongoDC
       – http://www.meetup.com/Washington-DC-MongoDB-Users-Group/
       – 250+ Members
       – Stop by next time you’re in the DC area!
  2. Thanks for attending!
     • Here’s what we’ll cover
       – Introduction to CustomInk
       – Why we chose MongoDB
       – Adoption challenges
       – How we use MongoDB
       – How we deploy and monitor MongoDB
       – Tips for organizing a monthly MongoDB User Group
     • Please interrupt with questions!
  3. CustomInk Technology
     • Started as a Java shop
     • 2006 – Began migrating applications to Ruby on Rails
     • Primarily running in a datacenter
     • Have 30+ applications in production
       – ECommerce
       – Order Fulfillment
       – Supply Chain Management
       – CRM
  4. Oracle
     • Was the “right” choice in 2000
     • Primary data store
     • Works well in production but…
       – Difficult to run on developers’ laptops
       – Expensive
       – Complex
       – Not very portable
  5. MongoDB and MySQL
     • Easy to run on developers’ laptops
     • Inexpensive
     • Not overly complex
     • Very portable
       – Easy to run locally, in the data center, or in the cloud
  6. Why we Chose MongoDB
     • Did I mention we’re a Rails shop? Couple of stereotypes about Rails developers:
       – Prefer new technology
       – Prefer agile, iterative development practices
       – Prefer open-source, community-driven products
       – Believe “work” should be fun
  7. Why we Chose MongoDB
     • New technology
     • Easy for developers to get started
     • Easy for developers to run locally
     • Easy to go from dev to test to prod
     • Easy to make quick changes
     • Easy to operate in production
     • Makes / keeps developers happy
  8. Introducing MongoDB
     • Developers very excited about MongoDB
     • Managers hesitant to adopt the new technology
  9. No experience with MongoDB
     • Our operations team doesn’t have any experience with running this in production
     • It’s cool, they’re smart and the developers are more than willing to help out
  10. Unstable Product
      • It seems a new version of MongoDB is released every few months
      • We deploy application changes frequently so it’s in line with our own development practices
      • Upgrading MongoDB != Upgrading Oracle
        – Zero downtime upgrades with MongoDB
  11. Unproven Solution
      • Nobody I know is using MongoDB
      • Many sites you know are using it!
  12. Unnecessary
      • I’m not sure how introducing a new database technology makes our customers’ experience better
      • Application development becomes faster and easier
  13. Few experts
      • We have access to experts in Oracle and MySQL. How will we find MongoDB experts?
      • There’s a strong community around MongoDB
      • 10gen offers professional support
      • We’ll get involved in the local MongoDB community
  14. Reporting
      • We rely on our metrics, data, and reports. Will our reporting team be able to work with MongoDB?
      • Look, crickets can tap dance!
  15. Strategy for Adopting MongoDB
      • Start small, with something not in the order flow
      • Gain experience
      • Build on our success
  16. How we Use MongoDB Today
      • 3 classes of applications
        – Logging
          • Application logs
          • Website activity logging
        – Content Management
          • Product Catalog
        – Supply Chain Management
          • Printer / Vendor Network
          • Supplier Network
  17. Logging – Why MongoDB?
      • Introduce MongoDB to the company
      • Gain development & operational experience
      • Textbook use case
      • Stay out of order flow
      • Contribute to the community
      • Safe place to start
  18. Central Logger - Safe
      • Application logs aren’t in the order flow
      • Applications continue to write to local log files
  19. Central Logger - Experience
      • Uses the Ruby driver for MongoDB
      • Capped Collection
      • Replica Set
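The slide names a capped collection without showing what it does. The sketch below is an in-memory illustration of the behavior that makes capped collections a good fit for logs: a fixed capacity, insertion order preserved, and the oldest entries overwritten once the collection is full. The `CappedLog` class, the `storefront` app name, and the log messages are hypothetical, and the cap here is counted in documents for simplicity; a real capped collection is sized in bytes when it is created with `db.createCollection(name, {capped: true, size: ...})`.

```javascript
// In-memory sketch of capped-collection behavior (not the MongoDB API).
class CappedLog {
  constructor(maxDocs) {
    this.maxDocs = maxDocs; // hypothetical cap, counted in documents
    this.docs = [];
  }

  insert(doc) {
    this.docs.push(doc);
    if (this.docs.length > this.maxDocs) {
      this.docs.shift(); // the oldest entry makes room for the newest
    }
  }

  // Return entries in insertion order, oldest first.
  find() {
    return [...this.docs];
  }
}

const log = new CappedLog(3);
for (const message of ["boot", "GET /", "GET /products", "POST /cart"]) {
  log.insert({ app: "storefront", message });
}

// Only the three most recent entries remain.
const remaining = log.find().map((doc) => doc.message);
console.log(remaining);
```

Because old entries age out on their own, a logger built this way needs no separate cleanup job.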
  20. Central Logger - Community
      • Forked mongo_db_logger from Phil Burrows
      • Open sourced both components
        – https://github.com/customink/central_logger/
        – https://github.com/customink/central_log_viewer
  21. Logging – Other applications
      • Capture and report client-side JavaScript events
      • Capture and report on artwork usage
  22. Product Catalog
      • Managing our product catalog was becoming increasingly difficult
      • Merchandising our products on the website has some significant limitations
      • Data structure is rigid and has real issues
  23. Holy Data!
      • Our product catalog is full of holy data
        – Relational structure
        – Poor schema decisions
  24. Holy Data – Fields with Special Powers
      • DTB_DUE
      • DTE_LOG
      • DTM_LOG
      • DT_EXTRA8
      • EXTRA1
      • PSIZEN
      • V3_FILENAME
      • X4
      • X7
      • X_7
      • X_10
  25. Product Catalog
      • Time for a re-write!
      • Data suggests this is a good fit for documents
      • Product imagery could give us experience with GridFS
      • Rapid prototyping and iterations required
  26. Mental Exercise
      • Goal: Show ordered list of sizes for each product
        Ultra Cotton T: S, M, L, XL, XXL
  27. Ultra Cotton T: S, M, L, XL, XXL
      • Legacy implementation
      • select * from product where id = 4600;

        ID    NAME            SIZE_1  SIZE_2  SIZE_3  SIZE_4  SIZE_5
        4600  Ultra Cotton T  S       M       L       XL      XXL
  28. Ultra Cotton T: S, M, L, XL, XXL
      • select p.name, s.name from products p, sizes s, product_sizes ps where p.id = ps.product_id and s.id = ps.size_id and p.id = 4600 order by ps.priority;

        products:
        ID    NAME
        4600  Ultra Cotton T

        sizes:
        ID  NAME
        33  S
        34  M
        35  L
        36  XL
        37  XXL

        product_sizes:
        PRODUCT_ID  SIZE_ID  PRIORITY
        4600        33       1
        4600        34       2
        4600        35       3
        4600        36       4
        4600        37       5
  29. Ultra Cotton T: S, M, L, XL, XXL
      name: “Ultra Cotton T”, sizes: [“S”, “M”, “L”, “XL”, “XXL”], id: 4600
      db.products.save({name: "Gildan Ultra Cotton-T", sizes: ["S", "M", "L", "XL", "XXL"], id: 4600})
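The mental exercise in slides 26 to 29 can be replayed in plain JavaScript, with no database involved. The rows below are copied from the slides; the variable and function names are illustrative and do not come from the deck. Each function produces the ordered size list from one of the three data shapes, which makes it easier to see how much work each shape pushes onto the reader of the data.

```javascript
// Slide 27: legacy shape, one column per size.
const legacyRow = {
  ID: 4600, NAME: "Ultra Cotton T",
  SIZE_1: "S", SIZE_2: "M", SIZE_3: "L", SIZE_4: "XL", SIZE_5: "XXL",
};

function sizesFromLegacy(row) {
  // Order is encoded in the column names, so it has to be parsed back out.
  return Object.keys(row)
    .filter((key) => key.startsWith("SIZE_"))
    .sort((a, b) => Number(a.slice(5)) - Number(b.slice(5)))
    .map((key) => row[key])
    .filter((size) => size != null);
}

// Slide 28: normalized shape, three tables and a join.
const sizes = [
  { id: 33, name: "S" }, { id: 34, name: "M" }, { id: 35, name: "L" },
  { id: 36, name: "XL" }, { id: 37, name: "XXL" },
];
const productSizes = [
  { product_id: 4600, size_id: 33, priority: 1 },
  { product_id: 4600, size_id: 34, priority: 2 },
  { product_id: 4600, size_id: 35, priority: 3 },
  { product_id: 4600, size_id: 36, priority: 4 },
  { product_id: 4600, size_id: 37, priority: 5 },
];

function sizesFromJoin(productId) {
  const nameById = new Map(sizes.map((size) => [size.id, size.name]));
  return productSizes
    .filter((row) => row.product_id === productId)
    .sort((a, b) => a.priority - b.priority)
    .map((row) => nameById.get(row.size_id));
}

// Slide 29: document shape, the array already carries the order.
const productDoc = {
  name: "Gildan Ultra Cotton-T",
  sizes: ["S", "M", "L", "XL", "XXL"],
  id: 4600,
};

function sizesFromDocument(doc) {
  return doc.sizes;
}

console.log(sizesFromLegacy(legacyRow).join(", "));
console.log(sizesFromJoin(4600).join(", "));
console.log(sizesFromDocument(productDoc).join(", "));
```

All three calls yield the same list; only the document shape returns it without parsing column names or joining tables.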
  30. Application Architecture
      • Nginx, Rails, and MongoDB
      • Utilizes GridFS for storing files (product imagery)
      • Uses Mongoid as the ODM
  31. Participants in our Supply Chain
      • Customers
      • Blank apparel suppliers
      • Printers (screen and digital)
      • Production material suppliers
      • Packaging suppliers
      • Delivery providers
  32. Supplier Service
      • Supplier information
      • Blanks inventory data (availability & pricing)
      • Automates ordering of blank shirts
      • Ordering logistics
  33. How we Use MongoDB Today
      • 3 classes of applications
        – Logging
        – Content Management
        – Supply Chain Management
  34. How We Deploy MongoDB
      • Replica sets
        – 2 nodes
        – 1 arbiter
      • No need for sharding (yet)
      • Capped Collection for the Central Logger
      • We manage our infrastructure with Chef
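The slide does not say why the arbiter is there, but the usual reason is election arithmetic: a replica set can only elect a primary when a strict majority of its voting members can reach each other. The sketch below works through that arithmetic for the layout on the slide. The function names are hypothetical, and this is the counting rule only, not a model of the full election protocol.

```javascript
// Votes needed to elect a primary: a strict majority of all voting members.
function majority(votingMembers) {
  return Math.floor(votingMembers / 2) + 1;
}

// A primary can be elected only if the reachable members form a majority.
function canElectPrimary(votingMembers, reachableMembers) {
  return reachableMembers >= majority(votingMembers);
}

// Layout from the slide: 2 data nodes + 1 arbiter = 3 voters.
// If one data node fails, the surviving node and the arbiter still hold 2 of 3 votes.
const withArbiter = canElectPrimary(3, 2);

// Same failure without the arbiter: 2 voters, 1 reachable, majority is 2.
const withoutArbiter = canElectPrimary(2, 1);

console.log({ withArbiter, withoutArbiter });
```

The arbiter stores no data; it only supplies the third vote, so losing a single data node does not leave the set without a primary.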
  35. Chef – Automated Deployment
      • Chef is an open source systems integration framework built to bring the benefits of configuration management to your entire infrastructure
      • Infrastructure as code
      • We use Chef to manage our infrastructure and deploy MongoDB
  36. Monitor-driven Deployment
      • When deploying a new server, we start by creating the Nagios checks
      • These are basically unit tests
      • Server is made live when all monitors are green
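The "checks as unit tests" idea can be sketched as a gate that only opens when every check passes. Everything below is a hypothetical illustration: the check names, the `state` object standing in for a real server, and the helper functions are assumptions, not CustomInk's actual Nagios configuration. The one borrowed convention is the Nagios plugin exit status, where 0 means OK and 2 means CRITICAL.

```javascript
// Nagios plugin exit statuses (0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN).
const NAGIOS_OK = 0;
const NAGIOS_CRITICAL = 2;

// Hypothetical stand-in for the server being provisioned.
const state = { mongodRunning: true, replicaSetMember: false };

// Checks are written first, like failing tests, before the server is built out.
const checks = [
  { name: "mongod process", run: () => (state.mongodRunning ? NAGIOS_OK : NAGIOS_CRITICAL) },
  { name: "replica set membership", run: () => (state.replicaSetMember ? NAGIOS_OK : NAGIOS_CRITICAL) },
];

function runChecks(checkList) {
  return checkList.map((check) => ({ name: check.name, status: check.run() }));
}

// The server goes live only when every monitor is green.
function readyForTraffic(results) {
  return results.every((result) => result.status === NAGIOS_OK);
}

const before = readyForTraffic(runChecks(checks)); // one check is still red
state.replicaSetMember = true;                      // provisioning finishes
const after = readyForTraffic(runChecks(checks));   // all checks are green

console.log({ before, after });
```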
  37. How We Monitor MongoDB
      • On-host monitoring with Monit
      • In data-center monitoring with Nagios
      • “External” monitoring with MMS
  38. Lessons Learned – Schema Design
      • Schema free != design free
      • Should you create this as an embedded document?
      • 4 talks today on schema design
  39. Lessons Learned – Map Reduce
      • Unless you’re sharding, map reduce is probably slower and more complex than what you’re used to
      • Check out Chris Westin’s talk on the new aggregation framework
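To give a feel for the extra machinery the slide is warning about, the sketch below counts log entries per level twice: once in the map/emit/reduce style and once with a plain grouping loop. It runs entirely in memory and is only an illustration of the programming model; the `mapReduce` helper, the sample `logs`, and the field names are hypothetical and are not taken from the deck or from the MongoDB API.

```javascript
// Minimal in-memory map-reduce: map emits (key, value) pairs, reduce folds each key's values.
function mapReduce(docs, map, reduce) {
  const groups = new Map();
  const emit = (key, value) => {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  };
  for (const doc of docs) map(doc, emit);

  const result = {};
  for (const [key, values] of groups) result[key] = reduce(key, values);
  return result;
}

// Hypothetical log entries.
const logs = [
  { level: "info", message: "GET /" },
  { level: "error", message: "timeout" },
  { level: "info", message: "GET /products" },
  { level: "info", message: "POST /cart" },
];

// Map-reduce style: two callbacks plus the grouping step in between.
const viaMapReduce = mapReduce(
  logs,
  (doc, emit) => emit(doc.level, 1),
  (key, values) => values.reduce((sum, value) => sum + value, 0)
);

// The same count as a single loop.
const viaLoop = {};
for (const doc of logs) {
  viaLoop[doc.level] = (viaLoop[doc.level] || 0) + 1;
}

console.log(viaMapReduce, viaLoop);
```

Both approaches give the same counts; the map-reduce version simply asks for more moving parts to get there.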
  40. Lessons Learned - GridFS
      • Use nginx-gridfs module when serving up GridFS files
      • https://github.com/mdirolf/nginx-gridfs
  41. Lessons Learned - ODM
      • Experience with MongoMapper & Mongoid
      • BSON to Object to JSON
        – Lots of extra object allocation & garbage collection
      • Use the native drivers whenever you can
  42. Lessons Pending - Reporting
      • Reporting is our biggest outstanding challenge
      • We use a number of different reporting tools but Crystal Reports is our “enterprise” solution
      • What strategies are you employing?
  43. Organizing a Monthly User Group
      • Why organize a user group?
        – Get to know people from other organizations
        – Hear about how others are solving similar problems
        – Expose your company, office, and expertise to other local developers
        – Increase your company’s profile in local tech circles
  44. Monthly User Group Tips
      • Register a Twitter account
        – Ask attendees for their Twitter account
        – Create lists of attendees
        – Tweet about events and news
  45. Monthly User Group Tips
      • Give away FREE stuff
        – Food
        – Beer
        – Quick reference cards
        – Coffee mugs
  46. MongoDC Meetup – Join Us!
      • Please follow @mongodc and let us know next time you’re in the DC area!
      • http://www.meetup.com/Washington-DC-MongoDB-Users-Group
  47. Thank You!
      • What questions do you have?
      • Nathen Harvey
        – Web Operations at CustomInk
        – MongoDB Master
        – [email protected]
        – @nathenharvey
      • Organize MongoDC Meetups
        – @MongoDC
        – http://www.meetup.com/Washington-DC-MongoDB-Users-Group/
        – 250+ Members
        – Stop by next time you’re in the DC area!