
Pairing MongoDB with NAND Flash for Real Time Results

mongodb
January 03, 2012

MongoSV 2011

Today's real-time world leaves no time to wait. With more data to process and less time to process it, pairing MongoDB with Fusion-io flash memory solutions is a winning combination. Meet two customers, Kontera and Aggregate Knowledge, and learn how you can harness the power of MongoDB along with the speed of NAND flash memory today!

Transcript

  1. December 9, 2011
  2. PAIRING MONGODB WITH NAND FLASH FOR REAL TIME RESULTS

    Speaking today:
    •  Greg Pendler, Production and Operations Manager
    •  Dale Russell, Technical Consultant
    •  Brian Knox, Service Delivery Data Architect
  3. Kontera: Real-time Analytics in a Flash

    Greg Pendler, December 2011
  4. WRITE BOTTLENECK FOR NEW PAGE ANALYSIS

    [Diagram: three numbered steps around MongoDB. Labels: Reading Content Page; New Page Detected; Content matching and persistent matching recipe; Appropriate context-based ads delivered quickly]
  5. THE IMPORTANCE OF FAST RESPONSE

    •  Internet traffic spikes
    •  Seconds can mean many lost click-through opportunities
    [Chart: Traffic against Time after posting; time axis marked 1 hour, 1 day, 1 week, 1 month, 1 year]
  6. HIGH PERFORMANCE CONTENT DELIVERY

    [Diagram of layers: Internet; Application Layer; Caching Layer (Memcache); Database/Analysis Layer (MongoDB)]
  7. INITIAL ATTEMPTS

    [Diagram labels: iSCSI initiator; iSCSI target; 6 local 15K SAS drives; Dell R710, 72GB Memory]
  8. MISSION-CRITICAL DATA NEEDS ENTERPRISE QUALITY SOLID-STATE

    Implemented a competing “enterprise” solid-state product before Fusion-io
    •  50% failure rate in 90 days
    •  SSD even died during migration to Fusion-io
    •  ioDrives running around 4 months without a hiccup
  9. SPEEDING WRITES WITH FUSION-IO

    •  Store the 360GB database on 640GB ioDrives (gives room to grow)
    •  3 slaves in different geographic locations for redundancy
  10. KONTERA SYNAPSE ANALYTICS SYSTEM

    [Diagram]
    •  Location 1: MongoDB Master on a 640GB ioDrive; MongoDB Slave on a 640GB ioDrive
    •  Location 2: MongoDB Slave on a 640GB ioDrive; MongoDB Slave on a 640GB ioDrive
  11. THE RESULTS

    •  Queries that used to take hours now take seconds
    •  No fear of memcached layer interruption
       –  Fusion-powered MongoDB handled a reboot of the memcached layer flawlessly
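
The memcached reboot described above can be pictured as a cache-aside read path going cold. The following is a minimal sketch, assuming the application checks the cache first and falls back to the database on a miss. The slides do not show Kontera's actual code, so `DictCache`, `DictStore`, and `get_page_analysis` are hypothetical stand-ins, not memcached or MongoDB client calls.

```python
class DictCache:
    """Hypothetical stand-in for the memcached layer."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value

    def reboot(self):
        # A reboot empties the cache, so every key becomes a miss.
        self._data.clear()


class DictStore:
    """Hypothetical stand-in for the MongoDB layer; counts reads it serves."""

    def __init__(self, docs):
        self._docs = docs
        self.reads = 0

    def find_one(self, key):
        self.reads += 1
        return self._docs.get(key)


def get_page_analysis(url, cache, store):
    """Cache-aside read: try the cache, fall back to the store, repopulate."""
    cached = cache.get(url)
    if cached is not None:
        return cached
    doc = store.find_one(url)
    if doc is not None:
        cache.set(url, doc)
    return doc


url = "http://example.com/a"
store = DictStore({url: {"topics": ["flash", "databases"]}})
cache = DictCache()

first = get_page_analysis(url, cache, store)    # miss: served by the store
second = get_page_analysis(url, cache, store)   # hit: served by the cache
reads_before_reboot = store.reads

cache.reboot()                                  # cache goes cold
third = get_page_analysis(url, cache, store)    # miss again: store absorbs it
reads_after_reboot = store.reads
```

After a reboot every request lands on the database until the cache warms up again, which is why the speed of the storage under the database decides whether the interruption is noticeable.
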
  12. Analytics + Attribution = Actionable Insights

    MongoDB with Fusion-io: Go Vertical…
    Brian Knox, Dale Russell, December 2011
  13. GETTING TO REAL TIME STREAMING

    •  How do you ingest, store, and report?
    •  With lots of data, if you wait to process it afterward, you are too late
    •  Rsyslog + ZeroMQ + AK secret sauce
    •  Convert syslog into an event message routing system
    •  ZeroMQ takes events and turns them into objects for the object rating system
    •  90% of decisions about what you want to do with the data happen before you get to the database
  14. PUB-SUB ZEROMQ INTEGRATION

    THIS IS COOL! Explore a tap into the event stream.
    •  New algorithm? Put ZeroMQ on the front and start listening.
    •  Need a few minutes worth of events as they come in? Connect and take what you need.
    NO MORE! Going off to the log server, finding the logs, parsing them, breaking them up, etc.
    TAP AND GO!
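
The "tap and go" idea can be sketched without any messaging library. The example below is an in-process stand-in for the publish-subscribe pattern, not ZeroMQ itself and not Aggregate Knowledge's code: `EventStream`, `tap`, `untap`, and `drain` are hypothetical names. It shows the property the slide relies on: a consumer attaches whenever it likes, sees only the events published while it is attached, and the publisher never needs to know who is listening.

```python
import queue


class EventStream:
    """In-process stand-in for a publisher that fans events out to all taps."""

    def __init__(self):
        self._taps = []

    def tap(self):
        """Attach a new subscriber and return its private queue."""
        q = queue.Queue()
        self._taps.append(q)
        return q

    def untap(self, q):
        """Detach a subscriber; it receives nothing published afterwards."""
        self._taps.remove(q)

    def publish(self, event):
        # Every attached subscriber gets its own copy of the event.
        for q in self._taps:
            q.put(event)


def drain(q):
    """Collect everything currently waiting in a subscriber queue."""
    events = []
    while True:
        try:
            events.append(q.get_nowait())
        except queue.Empty:
            return events


stream = EventStream()
stream.publish("e1")            # nobody is listening yet; the event is dropped

new_algorithm = stream.tap()    # "New algorithm? ... start listening."
stream.publish("e2")
stream.publish("e3")

sample = stream.tap()           # "Connect and take what you need."
stream.publish("e4")
stream.untap(sample)
stream.publish("e5")

algorithm_events = drain(new_algorithm)
sample_events = drain(sample)
```

With a real message bus the taps would be separate processes connecting over a socket, but the subscribe-late, miss-earlier-events behaviour is the same idea.
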
  15. BASICS TO A BILLION

    1.  Events are generated at the edge by incoming HTTP requests
    2.  Events are sent from the edge servers over Rsyslog
    3.  Events are forwarded over Rsyslog to an event ventilator (another Rsyslog instance)
    4.  The ventilator balances work across a cluster of "enrichment" processes
    5.  The enrichment processes pull in additional information about the event from a MongoDB cluster running on Fusion-io
    6.  The enriched events are sent to a "summarizer" (the AK secret sauce) that does all sorts of real-time aggregation and correlation
    7.  The secret sauce data structures generated by the summarizer are sent to a modified PostgreSQL database
    8.  The front end can generate reports on the fly (no need for batch) in great detail off the distilled data
    Searching 1 Billion Documents
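
Steps 4 to 6 above can be sketched in a few lines. This is a toy illustration under stated assumptions, not the production system: `ENRICHMENT_STORE` is a plain dict standing in for the MongoDB cluster, `ventilate` uses simple round-robin balancing, and `Summarizer` counts unique users per dimension only because the deck mentions unique user stats elsewhere; the real summarizer is described only as "secret sauce". All names and the event fields are hypothetical.

```python
from collections import defaultdict
from itertools import cycle

# Hypothetical stand-in for the enrichment store (MongoDB on Fusion-io).
ENRICHMENT_STORE = {
    "u1": {"segment": "sports"},
    "u2": {"segment": "news"},
    "u3": {"segment": "sports"},
}


def ventilate(events, n_workers):
    """Step 4: balance events round-robin across enrichment workers."""
    batches = [[] for _ in range(n_workers)]
    for worker, event in zip(cycle(range(n_workers)), events):
        batches[worker].append(event)
    return batches


def enrich(event, store):
    """Step 5: merge out-of-band data about the user into the event."""
    extra = store.get(event["user"], {})
    return {**event, **extra}


class Summarizer:
    """Step 6: streaming aggregation, here unique users per dimension."""

    def __init__(self):
        self._users = defaultdict(set)

    def add(self, event):
        key = (event["campaign"], event.get("segment", "unknown"))
        self._users[key].add(event["user"])

    def report(self):
        return {key: len(users) for key, users in self._users.items()}


events = [
    {"user": "u1", "campaign": "c1"},
    {"user": "u2", "campaign": "c1"},
    {"user": "u1", "campaign": "c1"},   # repeat visit, same unique user
    {"user": "u3", "campaign": "c1"},
    {"user": "u4", "campaign": "c2"},   # not in the enrichment store
]

summarizer = Summarizer()
batches = ventilate(events, n_workers=2)
for batch in batches:
    for event in batch:
        summarizer.add(enrich(event, ENRICHMENT_STORE))

report = summarizer.report()
```

Every event costs one lookup against the enrichment store before it is summarized, so the latency of that lookup bounds how fast the whole stream can move.
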
  16. ARCHITECTURE CHART

    [Diagram components: HTTP Servers; Event Router; Enrichers; Enrichment Store; Summarizer; Report Database; Report Servers]
  17. MONGODB CLUSTER

    [Diagram: node labels S1M to S16M, S1S to S16S, and S1A to S16A (the S1A to S16A group appears twice), together with LDR, CONFIGDB, 02_OLTP, and UI_OLTP]
  18. WHY FUSION-IO?

    Before Fusion-io:
    •  Reporting infrastructure was a large PostgreSQL data warehouse: events came in, got shoved into the db, and reports were generated by a batch report system
    •  Doing unique user stats across large numbers of dimensions could take hours
    Why Fusion-io?
    •  Fusion-io gives low latency when correlating in-band data with out-of-band data sources on the fly, which allows streaming data enrichment
    •  Possibility to run streaming algorithms against the incoming pre-enriched data
    •  Reporting databases can run on relatively cheap hardware on local disks
    Example: A report that took 40 hours to run can now be run in a few seconds
    •  Reports now run this quickly off Postgres on relatively commodity hardware, whereas before they needed a large data warehouse on an expensive SAN
  19. TEST HARDWARE AND PARAMETERS

    •  HP DL580R07 CTO Chassis
    •  HP X7542 DL580 G7 2P Fusion-io kit
    •  HP 8GB (1x8GB) Dual Rank x4 PC3-10600 (DDR3-1333) Registered CAS-9 Memory Kit (total = 256GB)
    •  HP 146GB 15K 6G 2.5 SAS DP HDD x 8
    •  Local disk test
       –  Six 15K RPM drives in RAID-10
    •  Fusion-io test
       –  Two 640GB ioDrive Duos
    •  1 billion records, long integer pkey
    •  4 shards
    •  Randomly generate ID to search
    •  Find and pull back the records
    •  32 processes
    •  15 minute test time
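
The test loop described above (generate a random ID, find the record, repeat for a fixed window) can be sketched as follows. This is an assumed reconstruction of the method, not the presenters' harness: `run_lookup_test` is a hypothetical name, a dict stands in for the sharded MongoDB collection, and the sketch runs a single worker for a fraction of a second where the real test ran 32 processes for 15 minutes.

```python
import random
import time


def run_lookup_test(find, max_id, duration_s, seed=0):
    """Issue random-ID lookups for duration_s seconds.

    Returns (transactions, transactions_per_second).
    """
    rng = random.Random(seed)
    transactions = 0
    start = time.perf_counter()
    deadline = start + duration_s
    while time.perf_counter() < deadline:
        record_id = rng.randrange(max_id)   # randomly generate ID to search
        find(record_id)                     # find and pull back the record
        transactions += 1
    elapsed = time.perf_counter() - start
    return transactions, transactions / elapsed


# Hypothetical in-memory stand-in for the 1-billion-record collection.
records = {i: {"_id": i, "payload": i * 2} for i in range(10_000)}

transactions, tps = run_lookup_test(records.get, max_id=10_000, duration_s=0.2)
```

To approximate the slide's setup, `find` would be a lookup by primary key against the database under test, and one copy of the loop would run in each of the 32 worker processes with the per-process counts summed at the end.
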
  20. LOCAL DISK LOOKUP: LESS THAN 1 MILLION

    •  734,687 transactions
    •  Average TPS 816
    •  Poor latency and response time
    •  Not even 1 million transactions in the 15-minute test window
    [Charts: Response Time; Transactions per Second]
  21. FUSION-IO LOOKUP: 57 MILLION

    •  57,307,403 transactions
    •  Average TPS 63,674
    •  Near memory speed
    •  Ramps to full rate in about 30 sec
    •  Fast
    •  Stable and predictable
    •  100x scale
    [Charts: Response Time; Transactions per Second]
  22. ACHIEVING REAL TIME RESULTS

    •  78x average throughput increase using Fusion-io vs. traditional storage
    •  Fusion-io enabled sub-millisecond search times on 1 billion documents
    •  Fusion-io allows us to scale vertically before having to go horizontal
    •  Fusion-io gives us predictable scalability with reliable response times
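
The headline figures can be checked from the transaction counts reported for the two 15-minute runs. The throughput and speedup lines below use only numbers from the slides. The latency lines are an added estimate using Little's law, and they assume each of the 32 processes always had exactly one lookup in flight, which the slides do not state.

```python
TEST_SECONDS = 15 * 60          # 15-minute test window
PROCESSES = 32                  # concurrent client processes

local_tx = 734_687              # local disk transactions
fusion_tx = 57_307_403          # Fusion-io transactions

local_tps = local_tx / TEST_SECONDS       # about 816.3
fusion_tps = fusion_tx / TEST_SECONDS     # about 63,674.9
speedup = fusion_tx / local_tx            # about 78.0

# Little's law: average latency = requests in flight / throughput.
# Assumption: all 32 processes keep one request outstanding at all times.
local_latency_ms = PROCESSES / local_tps * 1000     # about 39.2 ms
fusion_latency_ms = PROCESSES / fusion_tps * 1000   # about 0.50 ms
```

Under that assumption the estimate is consistent with the sub-millisecond search times claimed for Fusion-io, and it puts the local disk run at roughly 39 ms per lookup.
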