MongoDB DC 2012: Why We Chose MongoDB to Put Big Data "On The Map"

WHY WE CHOSE MONGODB TO PUT BIG-DATA ‘ON THE MAP’
JUNE 2012
@nknize  +Nicholas Knize

TST PRODUCTS
“The 3D UDOP allows near real time visibility of all SOUTHCOM Directorates information in one location…this capability allows for unprecedented situational awareness and information sharing”
- Gen. Doug Frasier
ACCOMPLISHING THE IMPOSSIBLE

ISPATIAL OVERVIEW
•  Expose enterprise data in a geo-temporal user-defined environment
•  Provide a flexible and scalable spatial indexing framework for heterogeneous data
•  Visualize spatially referenced data on 3D globe & 2D maps
•  Manage real-time data feeds and mobile messaging
•  View data over geo-rectified imagery with 3D terrain
•  Support mission planning and simulation
•  Provide real-time collaboration and sharing

BIG DATA STORAGE CHARACTERISTICS
Desired Data Store Characteristics for ‘Big Data’
•  Horizontally scalable: large volume / elastic
•  Vertically scalable: heterogeneous data types (“Data Stack”)
•  Smartly Distributed: reduce the distance bits must travel
•  Fault Tolerant: replication strategy and consistency model
•  High Availability: node recovery
•  Fast: reads or writes (can’t always have both)

NOSQL OPTIONS
Subset of Evaluated NoSQL Options
•  Cassandra
   –  Nice Bring Your Own Index (BYOI) design
   –  …but Java, Java, Java… Memory management can be an issue
   –  Adding new nodes can be a pain (Token Changes, nodetool)
   –  Key-Value store…good for simple data models
•  HBase
   –  Nice BigTable model
   –  Theory grounded heavily in C.A.P., inflexible trade-offs
   –  Complicated setup and maintenance
•  CouchDB
   –  Provides some GeoSpatial functionality
   –  HEAVILY dependent on Map-Reduce model (complicated design)
   –  Erlang based: poor multi-threaded heap management

WHY TST LIKES MONGODB
Why MongoDB for Thermopylae?
•  Documents based on JavaScript Object Notation (JSON): a GeoJSON match made in heaven!
•  C++: No Garbage Collection Overhead! Efficient memory management design reduces disk swapping and paging
•  Disk storage is memory mapped, enabling fast swapping when necessary
•  Built in auto-failover with replica sets and fast recovery with journaling
•  Tunable Consistency: consistency defined at application layer
•  Schema Flexible: friendly properties of SQL enable easy port
•  Provided initial spatial indexing support: point based, limited!
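To make the JSON/GeoJSON fit concrete, here is a minimal sketch of a GeoJSON Feature held as a plain Python dictionary. The `type` / `geometry` / `coordinates` layout follows GeoJSON; the coordinate values and everything under `properties` are invented for illustration and do not come from the talk.

```python
import json

# A GeoJSON Feature is already a JSON document, so a document store can
# hold it without any mapping layer. All values below are hypothetical.
feature = {
    "type": "Feature",
    "geometry": {
        "type": "Point",
        # GeoJSON orders coordinates as [longitude, latitude].
        "coordinates": [-77.0365, 38.8977],
    },
    "properties": {"name": "sample venue", "source": "example"},
}


def to_document(geojson_feature):
    """Round-trip through JSON text to show that the structure survives intact."""
    return json.loads(json.dumps(geojson_feature))


document = to_document(feature)
print(document["geometry"]["coordinates"])
```

Because the stored document and the exchange format have the same shape, no object-relational translation step is needed before indexing the geometry.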

MONGODB SPATIAL INDEXER
...The Spatial Indexer wasn’t quite right
•  MongoDB (like nearly all relational DBs) uses a b-Tree
   –  Data structure for storing sorted data in log time
   –  Great for indexing numerical and text documents (attribute data)
   –  Cannot store multi-dimension (>2) data: NOT COMPLEX GEOMETRY FRIENDLY

DIMENSIONALITY REDUCTION
How does MongoDB solve the dimensionality problem?
•  Space Filling (Z) Curve
   –  A continuous line that intersects every point in a two-dimensional plane
•  Use Geohash to represent lat/lon values
   –  Interleave the bits of a lat/long pair
   –  Base32 encode the result
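The two Geohash steps above (interleave the bits, then Base32 encode) can be sketched as follows. This follows the publicly documented Geohash scheme, which starts with a longitude bit; it is not MongoDB's internal code, and the function name and `precision` parameter are assumptions made for the example.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # Geohash alphabet (no a, i, l, o)


def geohash_encode(lat, lon, precision=8):
    """Encode a lat/lon pair as a Geohash string of `precision` characters."""
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    bits = []
    use_lon = True  # bits alternate: longitude first, then latitude
    while len(bits) < precision * 5:
        rng, value = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2.0
        if value >= mid:
            bits.append(1)
            rng[0] = mid  # keep the upper half of the interval
        else:
            bits.append(0)
            rng[1] = mid  # keep the lower half of the interval
        use_lon = not use_lon

    # Pack each group of five interleaved bits into one Base32 character.
    chars = []
    for i in range(0, len(bits), 5):
        index = 0
        for bit in bits[i:i + 5]:
            index = (index << 1) | bit
        chars.append(BASE32[index])
    return "".join(chars)


print(geohash_encode(57.64911, 10.40744, precision=5))
```

Each extra character adds five bits, so a longer hash names a smaller cell and a shared prefix means two points fall in the same larger cell.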

GEOHASH BTREE ISSUES
Issues with the Geohash b-Tree approach
•  Neighbors aren’t so close!
   –  Neighboring points on the Geoid may end up on opposite ends of the plane
   –  Impacts search efficiency
•  What about Geometry?
   –  Doesn’t support > 2D
   –  Mongo uses Multi-Location documents, which really just index multiple points that link back to a single document
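The "neighbors aren't so close" point can be illustrated with a small Z-order sketch. It assumes integer grid cells instead of real lat/lon values, and the helper names are hypothetical; the bit interleaving is the same idea Geohash relies on.

```python
def interleave_bits(x, y, bits=16):
    """Morton (Z-order) value: x bits in even positions, y bits in odd positions."""
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i)
        z |= ((y >> i) & 1) << (2 * i + 1)
    return z


def z_gap(cell_a, cell_b):
    """Distance between two cells along the one-dimensional curve."""
    return abs(interleave_bits(*cell_a) - interleave_bits(*cell_b))


# Two pairs of horizontally adjacent cells on an 8x8 grid.
near_pair = ((2, 0), (3, 0))    # both cells sit in the same quadrant
split_pair = ((3, 3), (4, 3))   # the pair straddles the middle of the grid

print(z_gap(*near_pair), z_gap(*split_pair))
```

Both pairs are one cell apart in space, yet the pair that crosses a quadrant boundary lands much further apart in the sorted key order, so a range scan over the b-Tree has to cover keys that are not spatially relevant.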

MULTI-LOCATION CLIPPING
Mongo Multi-location Document Clipping Issues (within search doesn’t always work w/ multi-location)
[Figure: four cases (Case 1 to Case 4) of a Multi-Location Document (aka Polygon) against a Search Polygon; two cases are labeled Success! and two Fail!]
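One plausible reading of the failing cases is sketched below: if a polygon is stored only as its vertices, a point-based test misses any search area that overlaps the shape without covering a vertex. The exact four cases on the slide are not recoverable from the transcript, so the shapes, boxes, and function names here are assumptions.

```python
def point_in_box(point, box):
    x, y = point
    min_x, min_y, max_x, max_y = box
    return min_x <= x <= max_x and min_y <= y <= max_y


def vertex_only_match(vertices, box):
    """Point-based test: match only if some indexed vertex lies in the box."""
    return any(point_in_box(v, box) for v in vertices)


def mbr_of(vertices):
    """Minimum bounding rectangle of a vertex list."""
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    return (min(xs), min(ys), max(xs), max(ys))


def boxes_intersect(a, b):
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


square = [(0, 0), (10, 0), (10, 10), (0, 10)]  # polygon stored as its corner points
corner_box = (-1, -1, 1, 1)                    # search area covering one vertex
inner_box = (4, 4, 6, 6)                       # search area inside the square, no vertex

print(vertex_only_match(square, corner_box), vertex_only_match(square, inner_box))
print(boxes_intersect(mbr_of(square), inner_box))
```

The bounding-rectangle test reports the overlap that the vertex-only test misses, which is the motivation for indexing extents rather than points.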

SOLUTIONS TO GEOHASH PROBLEM
Potential Solutions
•  Constrain the system to single point searches
   –  Multi-dimension support will be exponentially complex (won’t scale)
•  Interpolate points along the edge of the shape
   –  Multi-dimension support will be exponentially complex (won’t scale)
•  Customize the spatial indexer
   –  Selected approach

CUSTOM TUNED SPATIAL INDEXER
Thermopylae Custom Tuned MongoDB for Geo
TST Leverages Guttman’s 1984 Research in R/R* Trees
•  R-Trees organize any-dimensional data by representing the data as a minimum bounding box.
•  Each node bounds its children. A node can have many objects in it (max: m, min: ceil(m/2))
•  Splits and merges optimized by minimizing overlaps
•  The leaves point to the actual objects (stored on disk probably)
•  Height balanced: search is always O(log n)

RTREE THEORY
Spatial Indexing at Scale with R-Trees
Spatial data represented as minimum bounding rectangles (2-dimension), cubes (3-dimension), hexadecant (4-dimension)
Index represented as: <I, DiskLoc> where:
   I = (I_0, I_1, …, I_n) : n = number of dimensions
   Each I is a set in the form of [min,max] describing MBR range along a dimension
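The `<I, DiskLoc>` entry described above can be sketched as a small data structure. The class and function names are hypothetical, and a plain integer stands in for `DiskLoc`, whose real layout is not given in the slides.

```python
from dataclasses import dataclass
from typing import Tuple

Interval = Tuple[float, float]  # (min, max) along one dimension


@dataclass(frozen=True)
class IndexEntry:
    intervals: Tuple[Interval, ...]  # I = (I_0, ..., I_n), one interval per dimension
    disk_loc: int                    # assumption: an integer stands in for DiskLoc

    def intersects(self, other):
        """Two MBRs intersect only if their intervals overlap in every dimension."""
        if len(self.intervals) != len(other.intervals):
            raise ValueError("entries must have the same number of dimensions")
        return all(
            a_min <= b_max and b_min <= a_max
            for (a_min, a_max), (b_min, b_max) in zip(self.intervals, other.intervals)
        )


def bounding_entry(points, disk_loc):
    """Build the minimum bounding entry for a set of same-dimension points."""
    per_dimension = zip(*points)
    return IndexEntry(tuple((min(d), max(d)) for d in per_dimension), disk_loc)


entry = bounding_entry([(0, 0, 5), (2, 3, 1)], disk_loc=42)
print(entry.intervals)
```

Because the entry is just a tuple of intervals, the same code handles rectangles, cubes, and higher-dimensional boxes without change.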

R*-TREE INDEX OBJECTIVES
R*-Tree Spatial Index Example
•  Sample insertion result for 4th order tree
•  Objectives:
   1.  Minimize area
   2.  Minimize overlaps
   3.  Minimize margins
   4.  Maximize inner node utilization
[Figure: tree diagram with nodes labeled a through p]
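The first three objectives are measurable quantities on rectangles. The sketch below shows one way to compute them in two dimensions, with rectangles as `(min_x, min_y, max_x, max_y)` tuples; the function names are hypothetical, and margin is taken as the perimeter, as in the R*-tree literature.

```python
def area(rect):
    min_x, min_y, max_x, max_y = rect
    return (max_x - min_x) * (max_y - min_y)


def margin(rect):
    """Sum of the edge lengths, i.e. the perimeter in two dimensions."""
    min_x, min_y, max_x, max_y = rect
    return 2 * ((max_x - min_x) + (max_y - min_y))


def overlap(a, b):
    """Area shared by two rectangles, or 0 if they do not overlap."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return width * height if width > 0 and height > 0 else 0


square = (0, 0, 4, 4)
sliver = (0, 0, 16, 1)  # same area as the square, much larger margin

print(area(square), area(sliver), margin(square), margin(sliver))
```

Two boxes with equal area can differ sharply in margin, which is why the margin objective pushes node rectangles toward square shapes.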

R*-TREE INSERT EXAMPLE
Insert
•  Similar to insertion into B+-tree but may insert into any leaf; leaf splits in case capacity exceeded.
   –  Which leaf to insert into?
   –  How to split a node?

Insert: Leaf Selection
•  Follow a path from root to leaf.
•  At each node move into subtree whose MBR area increases least with addition of new rectangle.
[Figure: root with subtrees m, n, o, p]

Insert: Leaf Selection
•  Insert into m.

Insert: Leaf Selection
•  Insert into n.

Insert: Leaf Selection
•  Insert into o.

Insert: Leaf Selection
•  Insert into p.
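The leaf-selection rule on these slides (descend into the subtree whose MBR grows least) can be sketched as below. The `Node` class, the rectangle coordinates, and the tie-break on smaller area are assumptions made for the example; only the labels m, n, o, p come from the slides.

```python
class Node:
    """Hypothetical in-memory node; a node without children is a leaf."""

    def __init__(self, name, mbr, children=()):
        self.name = name
        self.mbr = mbr  # (min_x, min_y, max_x, max_y)
        self.children = list(children)


def area(rect):
    return (rect[2] - rect[0]) * (rect[3] - rect[1])


def enlarged(rect, new_rect):
    """Smallest rectangle covering both inputs."""
    return (min(rect[0], new_rect[0]), min(rect[1], new_rect[1]),
            max(rect[2], new_rect[2]), max(rect[3], new_rect[3]))


def enlargement(rect, new_rect):
    return area(enlarged(rect, new_rect)) - area(rect)


def choose_leaf(root, new_rect):
    """Walk from the root to a leaf, always taking the least-enlargement child."""
    node = root
    while node.children:
        node = min(
            node.children,
            # Ties on enlargement are broken by the smaller existing area.
            key=lambda child: (enlargement(child.mbr, new_rect), area(child.mbr)),
        )
    return node


root = Node("root", (0, 0, 9, 9), [
    Node("m", (0, 0, 4, 4)), Node("n", (5, 0, 9, 4)),
    Node("o", (0, 5, 4, 9)), Node("p", (5, 5, 9, 9)),
])
print(choose_leaf(root, (3, 1, 5, 2)).name)
```

The four "Insert into …" slides correspond to evaluating each candidate in turn; the code folds that comparison into a single `min` over the children.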

Query
•  Start at root
•  Find all overlapping MBRs
•  Search subtrees recursively
[Figure: tree diagram with nodes labeled a through p]

Query
•  Search m.
[Figure: tree diagram with nodes labeled a through p]
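The three query steps (start at root, find overlapping MBRs, recurse) can be sketched as a short recursive search. The node layout and sample rectangles are hypothetical; the labels only echo the letters used in the figures.

```python
def intersects(a, b):
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


class Node:
    """Hypothetical node: inner nodes hold children, leaves hold (label, rect) entries."""

    def __init__(self, mbr, children=(), entries=()):
        self.mbr = mbr
        self.children = list(children)
        self.entries = list(entries)


def search(node, query):
    """Return the labels of all entries whose rectangle intersects the query."""
    if not intersects(node.mbr, query):
        return []  # prune: nothing below this node can match
    hits = [label for label, rect in node.entries if intersects(rect, query)]
    for child in node.children:
        hits.extend(search(child, query))
    return hits


m = Node((0, 0, 4, 4), entries=[("a", (0, 0, 1, 1)), ("b", (2, 2, 4, 4))])
n = Node((5, 0, 9, 4), entries=[("d", (5, 0, 6, 1)), ("e", (7, 2, 9, 4))])
root = Node((0, 0, 9, 4), children=[m, n])

print(search(root, (3, 3, 8, 3.5)))
```

Unlike a b-Tree lookup, more than one subtree may have to be visited, because sibling MBRs are allowed to overlap the same query region.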

R*-TREE MONGODB IMPLEMENTATION
R*-Tree Leverages B-Tree Base Data Structures (buckets)

GEO-SHARDING
Geo-Sharding (in work)
Scalable Distributed R* Tree (SD-r*Tree)
Balanced binary tree, distributed on a set of servers:
•  Each internal node has exactly two children
•  Each leaf node stores a subset of the indexed dataset
•  At each node, the heights of the subtrees differ by at most one
•  Each server stores one data node and one “routing” node

SD-r*Tree DATA STRUCTURE
SD-r*Tree Data Structure Illustration
•  di = Data Node (Chunk)
•  ri = Coverage Node
Leveraged work from Litwin, Mouza, Rigaux 2007
[Figure: data nodes d0, d1, d2 and coverage nodes r1, r2 with their spatial coverage over regions a through e]

SD-r*TREE STRUCTURE DISTRIBUTION
SD-r*Tree Structure Distribution
[Figure: nodes d0, d1, d2, r1, r2 distributed across GeoShard 1, GeoShard 2, and GeoShard 3, with mongos]

GEO-SHARDING ALTERNATIVE
GeoSharding Alternative: 3D / 4D Hilbert Scanning Order

BEYOND 4-DIMENSIONS
Next Steps: Beyond 4-Dimensions - X-Tree (Berchtold, Keim, Kriegel, 1996)
•  Avoid MBR overlaps
•  Avoid node splits (main cause for high overlap)
•  Introduce new node structure: Supernodes, large directory nodes of variable size
[Figure: X-Tree with Normal Internal Nodes, Supernodes, and Data Nodes]

X-TREE PERFORMANCE
X-Tree Performance Results (Berchtold, Keim, Kriegel, 1996)

CONCLUSION
T-Sciences Custom Tuned Spatial Indexer
•  Optimized Spatial Search: finds intersecting MBR and recurses into those nodes
•  Optimized Spatial Inserts: uses the Hilbert Value of MBR centroid to guide search
   –  28% reduction in number of nodes touched
•  Optimized Deletes: leverages R* split/merge approach for rebalancing tree when nodes become over/under-full
•  Low maintenance: leverages MongoDB’s automatic data compaction and partitioning
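The "Hilbert Value of MBR centroid" mentioned above can be computed with the standard iterative Hilbert-curve mapping. How the custom indexer uses that value to guide inserts is not described in the slides, so this sketch only covers the computation; the function names, the `WORLD` bounds, and the grid `order` are assumptions.

```python
def hilbert_index(order, x, y):
    """Position of grid cell (x, y) along a Hilbert curve over a 2**order grid."""
    n = 1 << order
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate or flip the quadrant so the next level sees a canonical orientation.
        if ry == 0:
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s >>= 1
    return d


def mbr_hilbert_value(mbr, bounds, order=16):
    """Hilbert value of the centroid of mbr, scaled into a grid covering bounds."""
    n = 1 << order
    cx = (mbr[0] + mbr[2]) / 2.0
    cy = (mbr[1] + mbr[3]) / 2.0
    gx = min(n - 1, int((cx - bounds[0]) / (bounds[2] - bounds[0]) * n))
    gy = min(n - 1, int((cy - bounds[1]) / (bounds[3] - bounds[1]) * n))
    return hilbert_index(order, gx, gy)


WORLD = (-180.0, -90.0, 180.0, 90.0)  # (min_lon, min_lat, max_lon, max_lat)

print(mbr_hilbert_value((0.0, 0.0, 90.0, 45.0), WORLD, order=2))
```

Unlike the Z curve, consecutive Hilbert values always belong to adjacent cells, so sorting MBRs by this value tends to keep spatial neighbors together.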

EXAMPLE
Example Use Case: OSINT (Foursquare Data)
•  Sample Foursquare data set mashed with Government Intel Data
•  1 million Geo Document test (points and polys)
•  4 server replica set
•  ~350ms query response
•  ~300% improvement over PostGIS

FIND US
Community Support
•  Thermopylae contributes fixes to the codebase
   –  http://github.com/mongodb
•  TST will work with 10gen to fold into the baseline
•  Active developer collaboration
   –  IRC: #mongodb freenode.net

THANK YOU
Questions?
Nicholas Knize
[email protected]

Backup

WHO ARE THESE GUYS?
Thermopylae Sciences & Technology: Who are we?
•  Advanced technology w/ 160+ employees
•  Core customers in national security, venues and events, military and police, and city planning
•  Partnered with Google and imagery providers
•  Long term relationship focused: TS/SCI Staff
TST + 10gen + Google = Game-changing approach
[Logo: ENTERPRISE PARTNER]

GOVERNMENT CUSTOMERS
Key Customers - Government
•  US Dept of State Bureau of Diplomatic Security
   –  Build and support 30 TB Google Earth Globe with multi-terabytes of individual globes sent to embassies throughout the world. Integrated Google Earth and iSpatial framework.
•  US Army Intelligence Security Command
   –  Provide expertise in managing technology integration: prime contractor providing operations, intelligence, and IT support worldwide. Partners include IBM, Lockheed Martin, Google, MIT, Carnegie Mellon. Integrated Google Earth and iSpatial framework.
•  US Southern Command
   –  Coordinate Intelligence management systems spatial data collection, indexing, and distribution. Integrated Google Earth, iSpatial, and iHarvest.
   –  Index large volume imagery and expose it for different services (Air Force, Navy, Army, Marines, Coast Guard)

COMMERCIAL CUSTOMERS
Key Customers - Commercial
[Logos: Cleveland Cavaliers, USGIF, Las Vegas Motor Speedway, Baltimore Grand Prix]
iSpatial framework serves thousands of mobile devices

RDBMS STRENGTHS
•  Battle tested, Battle proven: Relational Model dates back to 1969
•  Plethora of Relational Experience: Full-Time DBAs, Training & Certs
•  Company Backed: Safe choice for business / mission critical systems
•  Fewer Alternatives: Non-relational is a 5 year old know-it-all
•  Mostly Standardized: SQL ISO/IEC 9075 Accepted Standard
•  Theoretically Sound: Based on 100 years of First-Order Logic theory

ACID THEORY
Relational on ACID
•  Atomicity: If one fails, we all fail!
•  Consistency: All data constraints (normalized schema), cascades, triggers, etc. must be met before transaction succeeds. (LATENCY)
•  Isolation: Synchronization, no operation can see a transaction that hasn’t yet completed
•  Durability: Once a transaction is committed it will remain committed even in power loss, crashes, or other hardware errors.

RDBMS WEAKNESSES
•  Writes are accomplished using in-place update on disk (crazy disk swapping rate)
•  Table joins, updates, and large queries quickly outgrow disk cache, requiring many random disk seeks (performance bottleneck!!)
•  Strict consistency requirements impact scalability (e.g. Postgres uses Multiversion Consistency, commonly resulting in stale data)
•  As data centers grow, the probability of node failure (due to Disk Writes, Consistency, and Atomic operations) increases

WHY NOSQL?
Why NoSQL?!? (CAVEATS)
•  Use the right tool for the job
•  Understand your needs!
•  Relational is not always bad
[Figure: Engineering with Constraints vs. Unbounded Engineering]
