mongodb
January 03, 2012

Scaling and Flexibility: Two case studies on leveraging the versatility of MongoDB - Erik Kellener, UnitedFuture

MongoSV 2011

Two very different MongoDB implementations, both with substantial outcomes: 1) scaling MongoDB to calculate football team picks and rankings in real time, as the results unfold; 2) leveraging MongoDB's flexible schema to reduce complexity and improve performance for XML data feeds. For both examples, we review the architecture deployed on EC2 and discuss why MongoDB was a crucial component of the solution.

Transcript

  1. MongoSV: Scaling & Flexibility (UNITED FUTURE)

    Who are we? Tuesday, January 3, 2012
  2. Who are we?

    UNITED FUTURE CREATES EXPERIENCES THAT COMPEL CONSUMERS TO ACT AND BELIEVE.
  3. Who are we?

    WEB, MOBILE, BROADCAST, DIGITAL RETAIL, SOCIAL MEDIA
  4. Case Studies

    CGD (College GameDay) FACEBOOK APP
    • Football pool - tied to CGD
    • Enlist & challenge friends
    • Rank with celebrities

    YOKOHAMA TIRE SITE
  5. Case Studies

    CGD FACEBOOK APP
    • Football pool - tied to CGD
    • Enlist & challenge friends
    • Rank with celebrities
  6. CGD Facebook App
  7. CGD Facebook App

    Each week, users enter their team picks for Saturday games.
  8. CGD Facebook App

    Each week, users enter their team picks for Saturday games.
    Stats are reported by week and by season.
  9. CGD Facebook App

    Each week, users enter their team picks for Saturday games.
    Stats are reported by week and by season.
    Rankings are compared and reported by:
    • FB friends
    • Celebrities
    • Custom groups (“Bob’s Football Pool”)
    • Overall participants
  10. CGD :: Distribution Stack
  11. CGD :: Distribution Stack

    AWS Architecture (diagram)
    • Application stack: PHP, CodeIgniter, Doctrine, APC, (PECL) MongoDB
    • Application systems: ELB in front of EC2 (LG) instances, each with EBS, running Linux and Cherokee; Auto Scale (max 9)
    • Replica set: three EC2 (LG) instances, each with EBS, running Linux and MongoDB; members labeled 1 - Mongod, 2 - Mongod, 3 - Mongod, with Arbiter labels
  14. CGD :: ODM :: Doctrine

    • Doctrine: Object Document Mapper (ODM)
    • Great abstraction layer.
    • Specification doesn’t require a separate file (e.g. YAML, XML)
    • Transactional write-behind
    • Powerful persistence behaviors (prePersist, postPersist)

    Example: Document Specification

        <?php
        namespace Documents;

        use Doctrine\Common\Collections\ArrayCollection;

        /** @Document(db="saturdayselections", collection="users") */
        class User
        {
            /** @Id */
            private $id;
            /** @Date */
            private $created;
            /** @String @Index(order="asc") */
            private $name;
            /** @EmbedOne(targetDocument="FacebookUser") */
            private $facebook;
            /** @EmbedMany(targetDocument="UserPick") */
            private $picks;
            /** @EmbedOne(targetDocument="OverallResult") */
            private $overall;
            /** @EmbedMany(targetDocument="WeeklyResult") */
            private $weekly;

            public function __construct($name)
            {
                $this->created = new \DateTime();
                $this->picks = array();
            }

            public function set_facebook(FacebookUser $fb)
            {
                $this->facebook = $fb;
            }
            ..

    Example: Doctrine Query

        ...
        $ids = array();
        $result = $this->doctrine->createQuery('Documents\User')
            ->select('id')
            ->field('facebook.id')->in((array) $facebook_ids)
            ->limit($limit)
            ->hydrate(FALSE)
            ->getQuery()
            ->execute()
            ->toArray();
        ...

    Example: Corresponding MongoDB query (against the "users" collection declared above)

        ...
        db.users.find({"facebook.id" : {"$in" : ["102929", "20293903"]}}).limit(5);
        ...
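
Both the Doctrine query and the shell query above come down to one filter document: `$in` applied to the dotted path `facebook.id`. The following is a minimal sketch of that idea in Python (not the deck's PHP), so it runs without a database. `build_friend_filter`, `get_path`, and `matches` are hypothetical helpers, "Ann Lee" is made-up sample data, and `matches` imitates only this one operator on plain dicts, not MongoDB's matching rules.

```python
def build_friend_filter(facebook_ids):
    """Build the filter document for users whose facebook.id is in the list."""
    return {"facebook.id": {"$in": [str(i) for i in facebook_ids]}}


def get_path(doc, dotted):
    """Follow a dotted path such as 'facebook.id' through nested dicts."""
    current = doc
    for key in dotted.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def matches(doc, filter_doc):
    """Tiny stand-in for the server: supports only {path: {"$in": [...]}}."""
    return all(
        get_path(doc, path) in cond["$in"] for path, cond in filter_doc.items()
    )


# Made-up sample users; only the facebook.id field matters here.
users = [
    {"name": "Bob Jones", "facebook": {"id": "102929"}},
    {"name": "Ann Lee", "facebook": {"id": "555"}},
]

flt = build_friend_filter(["102929", "20293903"])
found = [u["name"] for u in users if matches(u, flt)][:5]  # [:5] mirrors .limit(5)
print(found)
```

The same `flt` dictionary is the shape a driver would receive as its query argument; the ids are kept as strings because the slide's query quotes them.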
  15. CGD :: Data Schema :: Users Collection

    Design considerations:
    • User-centered model
    • “picks”: embedded array.
    • “weekly”: embedded subdoc (initialize first).

    Sample document:

        {
          "_id": { "$oid": "4e6aca9e4c70d78774000001" },
          "facebook": { "id": "1411140872", "name": "Bob Jones" },
          "is_celeb": false,
          "name": "Bob Jones",
          "overall": { "correct": 6, "incorrect": 2, "percentage": 0, "points": 6, "total": 8 },
          "picks": [
            { "week": 2, "game_id": "4e66b34d60e217b505000003", "team": "Notre Dame",
              "outcome": 0, "points": 0, "created": "Fri, 09 Sep 2011 19:27:49 GMT -07:00" },
            { "created": "Fri, 09 Sep 2011 19:27:49 GMT -07:00", "game_id": "4e6aa7c24c70d78c6f000001",
              "outcome": 1, "points": 1, "team": "Auburn", "week": 2 },
            ...
          ],
          "weekly": {
            "2": { "correct": 6, "incorrect": 2, "points": 6, "total": 8, "week": 2 }
          }
        }

    Models (from the diagram):
    • users: id, name, facebook (FacebookUser), picks (UserPick), overall (OverallResult), weekly (WeeklyResult)
    • FacebookUser: id, name
    • UserPick (has_many_and_belongs_to_many :teams): team, game_id, week, outcome, points
    • OverallResult (has_many_and_belongs_to_many :teams): correct, incorrect, total, percentage
    • WeeklyResult: correct, incorrect, total, week
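
The “weekly” and “overall” subdocuments in the sample look like tallies over the embedded “picks” array. Here is a minimal Python sketch of that tally. It assumes `outcome` 1 means a correct pick and 0 an incorrect one, which is inferred from the sample and not stated on the slide; `summarize` and `weekly_results` are hypothetical names.

```python
def summarize(picks):
    """Tally correct/incorrect/points/total over a list of pick subdocuments."""
    correct = sum(1 for p in picks if p["outcome"] == 1)  # assumption: 1 == correct
    total = len(picks)
    return {
        "correct": correct,
        "incorrect": total - correct,
        "points": sum(p["points"] for p in picks),
        "total": total,
    }


def weekly_results(picks):
    """Group picks by week; keys are strings, as in the sample document."""
    by_week = {}
    for p in picks:
        by_week.setdefault(str(p["week"]), []).append(p)
    return {week: dict(summarize(ps), week=int(week)) for week, ps in by_week.items()}


# The two picks shown on the slide, trimmed to the fields used here.
picks = [
    {"week": 2, "team": "Notre Dame", "outcome": 0, "points": 0},
    {"week": 2, "team": "Auburn", "outcome": 1, "points": 1},
]

print(weekly_results(picks))
print(summarize(picks))
```

Keeping the tallies inside the user document means a ranking page can read them directly, without re-scanning every pick.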
  16. CGD :: Data Schema :: Groups & Game Collection

    Design considerations:
    • “Groups” embeds members by status (active, banned, pending).
    • “Groups” contains an array of tags for keyword searching.
    • “Game” embeds Home & Away teams along with stats & ranks.

    games

        {
          "_id": { "$oid": "4e583e9e8da1f7a81b000000" },
          "away": { "image": "", "name": "Miami (FL)", "rank": 0, "record": "0-0", "thumbnail": "" },
          "channel": "ESPN/ESPN3.com",
          "combined_score": 0,
          "home": { "image": "", "name": "Maryland", "rank": 0, "record": "0-0", "thumbnail": "" },
          "order": 0,
          "points": 1,
          "time": "Mon, 05 Sep 2011 17:00:00 GMT -07:00",
          "week": 1,
          "winner": "Maryland"
        }

    groups

        {
          "_id": { "$oid": "4e612f624c70d77703000002" },
          "name": "Mark's Game Day Buds!",
          "status": "private",
          "admins": [ "4e612e924c70d77404000001" ],
          "members": {
            "active": [ "4e612e924c70d77404000001", "4e6123e924c70d77404000001", "4e612e924c70d77435000001" ],
            "banned": [],
            "pending": [],
            "total_active": 1
          },
          "system_tags": [ "mark's", "game", "day", "buds!" ],
          "about": "Join at your own risk",
          "is_featured": false
        }

    Models (from the diagram):
    • Game: id, week, winner, home (Team), away (Team)
    • Team: name, rank, record
    • Groups: id, name, status, tags (array[]), members (GroupMembers)
    • GroupMembers: active (array[]), banned (array[]), pending (array[])
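
The `system_tags` array in the groups sample matches the group name lowercased and split on whitespace. Below is a minimal Python sketch of that derivation and of a keyword lookup against it. The slide does not show how the tags are actually produced, so `system_tags` and `find_groups` are hypothetical helpers illustrating one way to get the same values.

```python
def system_tags(name):
    """Lowercase the group name and split it on whitespace."""
    return name.lower().split()


def find_groups(groups, keyword):
    """Return the names of groups whose tag array contains the keyword."""
    needle = keyword.lower()
    return [g["name"] for g in groups if needle in g["system_tags"]]


group = {"name": "Mark's Game Day Buds!"}
group["system_tags"] = system_tags(group["name"])

print(group["system_tags"])           # ["mark's", 'game', 'day', 'buds!']
print(find_groups([group], "Game"))   # ["Mark's Game Day Buds!"]
```

Storing the tags as an array lets an equality match on one keyword hit any element of the array, which is what makes the keyword search on the slide a simple query.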
  17. College GameDay Facebook App :: Key learnings
  18. College GameDay Facebook App :: Key learnings

    • If using EC2, ensure replica set members run in multiple availability zones.
    • Don’t worry about data normalization; embed subdocs when possible.
    • Use arrays of subdocs when:
      • You need the highest flexibility in queries.
      • Filtering on the client doesn’t pose a problem.
      • Updates are not limited by the positional operator.
    • Improve query performance by initializing data (size/type) upon initial insert.
    • An ODM will make your life much easier.
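
One way to read the "initialize data upon initial insert" advice together with the keyed “weekly” subdocument: create every week's counters when the user is first saved, then adjust them later with `$inc` on a dotted path, so no positional operator is needed. The Python sketch below only builds the documents. `new_weekly` and `score_update` are hypothetical helpers, and the list of weeks is an assumption.

```python
def new_weekly(weeks):
    """Pre-initialized counters for each week, keyed by week number as a string."""
    return {
        str(w): {"correct": 0, "incorrect": 0, "points": 0, "total": 0, "week": w}
        for w in weeks
    }


def score_update(week, won, points):
    """Update document that bumps one week's counters via dotted paths."""
    prefix = f"weekly.{week}."
    return {
        "$inc": {
            prefix + ("correct" if won else "incorrect"): 1,
            prefix + "total": 1,
            prefix + "points": points if won else 0,
        }
    }


user = {"name": "Bob Jones", "picks": [], "weekly": new_weekly([1, 2, 3])}
print(user["weekly"]["2"])
print(score_update(2, True, 1))
```

Because each week is addressed by key (`weekly.2.correct`), the update does not depend on an element's position in an array.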
  19. Case Studies

    CGD FACEBOOK APP

    YOKOHAMA TIRE SITE
    • Product research
    • Consumer focused
    • Mix of CMS content and feeds
  20. Case Studies

    YOKOHAMA TIRE SITE
    • Product research
    • Consumer focused
    • Mix of CMS content and feeds
  21. Yokohama Tire Site: Case Study #2: Yokohama Tire

    • Content Mgmt System
    • Car Catalog
    • Product/Tire Catalog
    • Dealer Locator
  22. Yokohama Site :: Distribution and Data

    DISTRIBUTION
    • Running in EC2 (similar to CGD setup)
    • Application stack: PHP, CodeIgniter, Doctrine, MySQL, (PECL) MongoDB, ExpressionEngine

    TIRE FEED (NIGHTLY)

        <?xml version="1.0"?>
        <!--Created by PRO_PTS_PRT_SELECTOR-->
        <!--Create date: 2011/01/18-->
        <PART_LIST>
          <PART>
            <PART_NO>10191</PART_NO>
            <PROD_LINE>HPT</PROD_LINE>
            <TRD_DES>SPORT</TRD_DES>
            <SIZE>275/45R19</SIZE>
            <UTQG>280/AA/A</UTQG>
            <COMPOUND/>
            <PLY_RATING/>
            <PLY/>
            <LOAD_INDEX>108</LOAD_INDEX>
            <SPEED_RATING>Y</SPEED_RATING>
            <SPEED_RT_NUMERIC>12</SPEED_RT_NUMERIC>
            <TREAD_DEPTH>10</TREAD_DEPTH>
            <SDWL_TRA>BW</SDWL_TRA>
            <SHIP_WT>35.05</SHIP_WT>
            <OEM_PART>1</OEM_PART>
            <RIM_SIZE>19</RIM_SIZE>
            <ASPECT_RATIO>45</ASPECT_RATIO>
            <SECT_WDTH>275</SECT_WDTH>
            <STS>A</STS>
            <BRAND>Y</BRAND>
            <LOAD_RANGE>XL</LOAD_RANGE>
            <TREAD_TYPE/>
            <PUBLISHED_PART_DESC>275/45R19 108Y XL</PUBLISHED_PART_DESC>
            <MARKETING_TREAD_NAME>ADVAN SPORT</MARKETING_TREAD_NAME>
            <LAST_OE_YEAR>2011</LAST_OE_YEAR>
            <OE_APPLICATIONS>
              <OE_APP>
                <OE_VEH_MAKE>PORSCHE</OE_VEH_MAKE>
                <OE_VEH_MODEL>CAYENNE</OE_VEH_MODEL>
                <OE_YEAR>2011</OE_YEAR>
              </OE_APP>
              <OE_APP>
                <OE_VEH_MAKE>VOLKSWAGEN</OE_VEH_MAKE>
                <OE_VEH_MODEL>TOUAREG</OE_VEH_MODEL>
                <OE_YEAR>2010</OE_YEAR>
              </OE_APP>
            </OE_APPLICATIONS>
          </PART>
        </PART_LIST>
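
A feed like this maps naturally onto nested documents because the importer does not need to know the field list in advance. The following Python sketch (using the standard library, not the site's PHP importer) converts each `<PART>` into a dict; `to_doc` is a hypothetical helper, the feed is trimmed to a few fields, and lowercasing the tag names and leaving values as strings are assumptions.

```python
import xml.etree.ElementTree as ET

# A trimmed copy of the feed shown on the slide.
FEED = """<PART_LIST>
  <PART>
    <PART_NO>10191</PART_NO>
    <SIZE>275/45R19</SIZE>
    <COMPOUND/>
    <OE_APPLICATIONS>
      <OE_APP><OE_VEH_MAKE>PORSCHE</OE_VEH_MAKE><OE_VEH_MODEL>CAYENNE</OE_VEH_MODEL></OE_APP>
      <OE_APP><OE_VEH_MAKE>VOLKSWAGEN</OE_VEH_MAKE><OE_VEH_MODEL>TOUAREG</OE_VEH_MODEL></OE_APP>
    </OE_APPLICATIONS>
  </PART>
</PART_LIST>"""


def to_doc(elem):
    """Convert an element to its text (leaf) or to a dict of its children."""
    children = list(elem)
    if not children:
        text = (elem.text or "").strip()
        return text or None  # empty tags such as <COMPOUND/> become None
    doc = {}
    for child in children:
        key = child.tag.lower()
        value = to_doc(child)
        if key in doc:
            # Repeated tag (e.g. OE_APP): collect the values in a list.
            if not isinstance(doc[key], list):
                doc[key] = [doc[key]]
            doc[key].append(value)
        else:
            doc[key] = value
    return doc


parts = [to_doc(part) for part in ET.fromstring(FEED).findall("PART")]
print(parts[0]["part_no"], parts[0]["oe_applications"]["oe_app"][1]["oe_veh_model"])
```

If the feed gains or drops a tag, `to_doc` needs no change; the resulting document simply carries a different set of keys. A real importer would likely also cast numeric fields, which this sketch leaves out.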
  23. Yokohama Site :: Data :: Tires Collection (Product/Tire Catalog)

    Design considerations:
    • “tires” is a nightly XML feed of products.
    • Feed structure may change without notice.
    • “tireclass” is a separately maintained CSV for classifying “tires”.

    tires

        {
          "_id" : ObjectId( "4ed73c37cf7751f859000116" ),
          "alt_rim_size" : "8.25",
          "aspect_ratio" : 75,
          "category" : {
            "vehicle_type" : [ "MEDIUM TRUCK" ],
            "style" : [],
            "position" : [ "All-Position", "Steer" ],
            "application" : [ "Regional Haul", "Pick-up & Delivery" ],
            "business_type" : "commercial"
          },
          "commercial" : [
            { "type" : "medium truck", "position" : "all-position", "application" : "regional haul" },
            { "type" : "medium truck", "position" : "all-position", "application" : "pick-up & delivery" }
          ],
          "load_rev_mile" : 517,
          "load_stat_rad" : 18.8,
          "marketing_tread_name" : "103ZR",
          "weight" : 117.5,
          "width" : 295
        }

    Models (from the diagram):
    • tires: id, name, size, category (vehicle_type array, style array, application array)
    • tire_classification (tireclass): name, vehicle_type (array), application (array), style (array)

    Example: querying tires after tireclass has been merged

        ...
        $query = $this->doctrine->createQuery('Documents\Tire')
            ->field('category.vehicle_type')->in((array) $vehicle_type)
            ->field('category.business_type')->equals($type);
        $this->_exclude_odd_tires($query);
        $result = $query->getQuery()->execute();
        ...
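
The slide says the tireclass CSV is merged into each tire but does not show how. Here is a minimal Python sketch of one possible merge. The CSV column names, the `|` separator for multi-valued cells, and the join on `marketing_tread_name` are all assumptions made for illustration; `load_classes` and `merge_category` are hypothetical helpers.

```python
import csv
import io

# Hypothetical layout for the separately maintained classification CSV.
TIRECLASS_CSV = """name,vehicle_type,position,application,business_type
103ZR,MEDIUM TRUCK,All-Position|Steer,Regional Haul|Pick-up & Delivery,commercial
"""


def split_cell(cell):
    """Split a multi-valued cell on '|' and drop empty entries."""
    return [value for value in cell.split("|") if value]


def load_classes(text):
    """Index the classification rows by tread name."""
    classes = {}
    for row in csv.DictReader(io.StringIO(text)):
        classes[row["name"]] = {
            "vehicle_type": split_cell(row["vehicle_type"]),
            "position": split_cell(row["position"]),
            "application": split_cell(row["application"]),
            "business_type": row["business_type"],
        }
    return classes


def merge_category(tire, classes):
    """Return a copy of the tire with its classification embedded as 'category'."""
    merged = dict(tire)
    merged["category"] = classes.get(tire.get("marketing_tread_name"), {})
    return merged


tire = {"marketing_tread_name": "103ZR", "width": 295}
merged = merge_category(tire, load_classes(TIRECLASS_CSV))
print(merged["category"])
```

Embedding the classification at import time is what lets the catalog query filter on `category.vehicle_type` and `category.business_type` without a second lookup.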
  24. Yokohama Site :: Data :: Dealers Collection (Dealer Locator)

    Design considerations:
    • Nightly XML feed of dealers.
    • Structure may change without notice.
    • New dealers are geocoded on import.
    • Leverages MongoDB spatial queries.

    dealer_locations

        {
          "_id" : ObjectId( "4d75834fcf77513f7700000a" ),
          "location" : "348 W Main St, Branford, CT, 06405",
          "coordinates" : { "latitude" : 41.279467, "longitude" : -72.850324 }
        }

    Models (from the diagram):
    • dealer_locations: id, location, coordinates
    • coordinates: lat, lon

    Example: geospatial query for lat/lon on dealers

        <?php
        namespace Documents;

        /**
         * @Document(db="yokohama", collection="dealer_locations")
         */
        class DealerLocation
        {
            ...
            $bus_unit = $search_commercial ? self::COMMERCIAL_CODE : self::CONSUMER_CODE;
            $query = $this->doctrine->createQuery('Documents\Dealer')
                ->field('coordinates')->withinCenter($center_lat, $center_lon, $radius)
                ->field('bus_unit')->equals($bus_unit);
            ...
            return $query->sort('distance', 'asc')
                ->limit(26)
                ->getQuery()
                ->execute();
        }
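
To make the "within a radius, nearest first, at most 26" behavior of the dealer query concrete, here is a client-side Python sketch. It is not how MongoDB evaluates the query: it uses the haversine great-circle formula on plain dicts, whereas the server works from a geospatial index. `haversine_km` and `nearest_dealers` are hypothetical helpers, and the second dealer and the search center are made-up sample values.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_dealers(dealers, lat, lon, radius_km, limit=26):
    """Dealers within radius_km of (lat, lon), closest first, capped at limit."""
    scored = []
    for dealer in dealers:
        c = dealer["coordinates"]
        dist = haversine_km(lat, lon, c["latitude"], c["longitude"])
        if dist <= radius_km:
            scored.append((dist, dealer))
    scored.sort(key=lambda pair: pair[0])
    return [dealer for _, dealer in scored[:limit]]


dealers = [
    # From the slide.
    {"location": "348 W Main St, Branford, CT, 06405",
     "coordinates": {"latitude": 41.279467, "longitude": -72.850324}},
    # Made-up far-away dealer for contrast.
    {"location": "hypothetical dealer, Los Angeles, CA",
     "coordinates": {"latitude": 34.05, "longitude": -118.24}},
]

# Made-up search center a few kilometers west of the Branford dealer.
nearby = nearest_dealers(dealers, 41.3083, -72.9279, radius_km=50)
print([d["location"] for d in nearby])
```

The default `limit=26` mirrors the `->limit(26)` in the Doctrine query.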
  25. Yokohama Site :: Data :: Cars Collection (Car Catalog)

    Design considerations:
    • Separately maintained CSV.
    • Heavy remapping of fields.
    • Imported into MongoDB.

    cars

        {
          "_id" : ObjectId( "4e8a6a4fe1d9456d24000009" ),
          "car_tire_id" : 67557,
          "make" : "Audi",
          "model" : "Q7",
          "year" : 2007,
          "option" : "4.2 (20 Inch Option)",
          "fit" : "A",
          "frb" : "B",
          "load_index" : 110,
          "load_desc" : "RF",
          "size" : "275/45R20",
          "size_desc" : "275/45R20/RF 110Y",
          "width" : 275,
          "aspect_ratio" : 45,
          "rim_size" : 20,
          "speed_rating" : "Y",
          "speed_rating_numeric" : 12
        }

    Model (from the diagram):
    • cars: id, make, model, year, size_desc

    Example: document class with a PrePersist callback

        <?php
        namespace Documents;

        /**
         * @Document(db="yokohama", collection="cars")
         * @HasLifecycleCallbacks
         */
        class Car
        {
            /** @Id */
            private $id;
            /** @String @Index(order="asc") */
            private $make;
            /** @String @Index(order="asc") */
            private $model;
            /** @Int @Index(order="asc") */
            private $year;
            /** @String */
            private $size_desc;
            /** @Int @Index(order="asc") */
            private $width;
            /** @String @Index(order="asc") */
            private $speed_rating;

            /** @PrePersist */
            public function parse_speed_rating()
            {
                // @TODO handle this style -> LT235/85R16/E
                if (empty($this->speed_rating)) {
                    $slash_pos = strpos($this->size_desc, '/');
                    $speed_rating = substr($this->size_desc, $slash_pos + 3, 1);
                    // alpha letter but not R
                    if (preg_match('/[A-QS-Z]/i', $speed_rating)) {
                        $this->set_speed_rating($speed_rating);
                    }
                }
                $this->set_speed_rating_numeric();
                return;
            }
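
The extraction rule inside `parse_speed_rating` is easier to check in isolation: take the single character three positions after the first slash and accept it only if it is a letter other than R. The Python sketch below ports just that rule. It is not the Doctrine callback itself, the function name is reused only for readability, the sample sizes other than the two from the slide are made up, and returning `None` when there is no slash is an added guard that the PHP does not have.

```python
import re


def parse_speed_rating(size_desc):
    """Return the speed-rating letter embedded in a size string, or None."""
    slash_pos = size_desc.find("/")
    if slash_pos == -1:
        return None  # added guard: no slash, nothing to parse
    # One character, three positions after the slash (e.g. the Z in 225/50ZR16).
    candidate = size_desc[slash_pos + 3 : slash_pos + 4]
    # Any letter except R, which marks radial construction rather than a rating.
    if re.fullmatch(r"[A-QS-Z]", candidate, re.IGNORECASE):
        return candidate
    return None


for size in ["225/50ZR16", "275/45R20/RF 110Y", "LT235/85R16/E"]:
    print(size, "->", parse_speed_rating(size))
```

For the Audi sample (`275/45R20/RF 110Y`) the rule finds `R` and therefore extracts nothing, which is consistent with the rating being supplied separately in that document. The `LT235/85R16/E` style from the `@TODO` comment is likewise left unparsed.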
  26. Yokohama Site :: Data :: Key learnings

    • MongoDB’s flexible schema is a good fit for XML data imports whose structure may change; doing the same with an RDBMS would be far more complex.
    • When dealing with “unscrubbed” data, Doctrine comes to the rescue with “PrePersist” behaviors.
    • MongoDB’s geospatial support is handy for location-based data.
    • And yes, MongoDB rocks!!