Mobile Eye for the Big Data Guy

My talk at the BigDataCon track at JAX 2012 in Mainz, Germany.

Darach Ennis

April 18, 2012

Transcript

  1. About me?

    Copyright Push Technology 2012 [email protected]
    • 15 years in Distributed Systems
    • Favorite Prog Lang: Erlang
    • Alumnus of: Motorola, IONA Technologies, Betfair (Sports Exchange & Poker), JP Morgan Chase (Front Office Credit & Debit derivatives), StreamBase (CEP – Solutions Architect), Push Technology (Stream Oriented Messaging)
    • Trinity College Dublin Comp Sci + M.Sc in Networks & Distributed Systems (DSG group)
    • Responds to: Guinness, Whisky
  2. About Push Technology

    • UK Software Company Founded in 2006
    • Offices in London and Maidenhead
    • 2006-2009 100% growth, 2010-2012 400% growth
    • Global sales, support & coverage through partners
    • Flagship product – Diffusion – Low latency stream oriented messaging
      - Capital Markets: Buy Side, Sell Side, Brokers, Exchanges, Market Data providers
      - Media, Social, Online and Personalised Advertising
      - Online Betting, Gaming, MMOG
  3. Agenda

    Twitter: @darachennis
    Mobile Big Data Corp. Streams Big Data Share TV Photos Content Media Betting Trading Target Ads Sensors Forensics Mining Supply Chain Conversations What if? Messaging WWW Mobile Internet of Things CEP NoSql OldSql WebSockets REST MMOG IoT NewSql SCALE!!! Speed! Access! Timeliness Intelligence Interactions Usability Caching
  4. Mobile: Dumb, Smart, Super?

    Kepler
    Tegra 4
    • ETA Q1 2013
    • 28nm ARM A15
    • Quad+1 1.8GHz
    • 2x Tegra 3
    • 10x Tegra 2
    Sources:
    - AnandTech - http://www.anandtech.com/show/5703/jenhsuns-email-to-nvidia-employees-on-a-successful-kepler-launch
    - Toms Hardware - http://www.tomshardware.com/news/nvidia-tegra-4-wayne-arm-a15,15261.html
  5. Mobile: The Last mile? HARD

    [Chart: Latency (ms), measured vs expected, log scale, Sep-10 to Feb-11, with linear trend line y = -1.1667x + 45703, R² = 0.02496]
    [Chart: Download Bandwidth (Mbps), measured vs expected, Sep-10 to Feb-11]
    Buffer Bloat
    Network Saturation
    Battery!
  6. THE INTERNET… JUST GOT?

    A long time ago: 1992 – ‘Surfing the internet’ - Coined by Jean Polly
    About now - 119.3M Mobile Users. 2011? 97.3M Smart Phone? 106.7M
    Passive Active
  7. … Mobile. Internet. 2012.

    • 242.6M Mobile Users by EOY
    • 72.8M Shoppers, 184.3M by EOY
    • 54.8M Tablet Users [Up 62.8%, 41.9M iPad]
    • 45.6M eReader Users
    • 169.3M Video viewers, 54.6M Mobile
    Source – HubSpot Blog - http://bit.ly/wRx5au
  8. Draw Something

    • Week 1 – 1M users after 9 days, 6 Servers, 50 Drawings per second
    • Week 6 – 11M users, 90 Servers, 3K Drawings per second, 2B total drawings
      – Sold to Zynga for an estimated $210 million US dollars
    • Guesstimate: 42 TB of drawing data after 6 weeks or ~580GB per server
      Assumption: 10KB average size of drawing data
    Big Mobile Data – Big as in (ELASTIC?) SCALE
    Source: http://www.mactalk.com.au/content/how-social-gaming-app-draw-something-blew-up-infographic-2245/
    1ms
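As a rough cross-check of the guesstimate on this slide, the sketch below multiplies the quoted 2B total drawings by the assumed 10KB average size and spreads the result over the 90 servers. The function names and the decimal units (1 KB = 1,000 bytes) are assumptions made for illustration. These inputs alone give about 20 TB in total and about 222 GB per server, so the 42 TB figure on the slide presumably rests on inputs that the transcript does not capture.

```python
KB = 1_000
GB = 1_000_000_000
TB = 1_000_000_000_000


def storage_estimate(total_drawings, avg_drawing_bytes, servers):
    """Return (total bytes, bytes per server) for evenly spread drawings."""
    total_bytes = total_drawings * avg_drawing_bytes
    return total_bytes, total_bytes / servers


def implied_avg_bytes(total_bytes, total_drawings):
    """Average drawing size that a quoted storage total would imply."""
    return total_bytes / total_drawings


# Figures quoted on the slide: 2B drawings, 10KB each, 90 servers.
total, per_server = storage_estimate(2_000_000_000, 10 * KB, 90)
print(f"total: {total / TB:.1f} TB, per server: {per_server / GB:.0f} GB")

# What average size would the quoted 42 TB correspond to at 2B drawings?
print(f"implied average: {implied_avg_bytes(42 * TB, 2_000_000_000) / KB:.0f} KB")
```

Changing any single input (drawing count, average size, server count) shows how sensitive this kind of capacity estimate is to its assumptions.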
  9. Trade Something

    • 3M North America Equity Market Data per second
    • 200K European Equity Market Data per second
    • ~2K Spot Currency Market prices per second
    Streaming Mobile Data – Speed <100us
  10. BIG DATA … smarts (people, things, interactions) at scale?

    “Big data is data that exceeds the processing capacity of conventional database systems. The data is too big, moves too fast, or doesn’t fit the strictures of your database architectures. To gain value from this data, you must choose an alternative way to process it.”
    Source: Edd Dumbill, Forbes - http://onforb.es/A6YE5o
  11. What is Big Data?

    • Volume, Velocity, Variety?
      – … Whatever!
    • Bigness + Structure
      Bigness: Volume, Velocity, Size
      Structure: Variety, Variability, Complexity
    • Other?
      – Linked? Polystructured? Volatility? Veracity? … Whatever!
    Source: Bigness + Structure. Curt Monash - http://bit.ly/zE2flF
    Whatever? Colin Clark - http://bit.ly/rVPOjC
  12. A US Cap Market second?

    • 174 microseconds round trip time rules out High Frequency Trading applications. Not on the critical path!
    Source: Me, former life @StreamBase
    • http://slidesha.re/guZOVe
  13. Throughput vs Latency

    • Calculate array of 1 million Black Scholes Merton put and call option prices
    • GPU – calc 1M put & 1M call prices in a single batch, on a single host thread
    • CPU – calc 1 put & 1 call in a single step on a single thread
    • GPU wins w.r.t. Throughput hands down. CPU wins w.r.t. Latency hands down.
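To make the two workloads concrete, here is a minimal CPU-only sketch of the calculation the slide refers to: the standard Black-Scholes-Merton closed form for a European call and put without dividends. The function names and the sample inputs are illustrative assumptions, not the benchmark code behind the slide. The single-contract function corresponds to the "1 put & 1 call in a single step" case, and the batch helper simply loops over it where a GPU would price the whole array at once.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bsm_call_put(spot, strike, rate, vol, years):
    """Price one European call and put (no dividends) in a single step."""
    sqrt_t = math.sqrt(years)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol * vol) * years) / (vol * sqrt_t)
    d2 = d1 - vol * sqrt_t
    discount = math.exp(-rate * years)
    call = spot * norm_cdf(d1) - strike * discount * norm_cdf(d2)
    put = strike * discount * norm_cdf(-d2) - spot * norm_cdf(-d1)
    return call, put


def bsm_batch(contracts):
    """Price many contracts; each item is (spot, strike, rate, vol, years)."""
    return [bsm_call_put(*contract) for contract in contracts]


# Latency-oriented use: one contract, one step.
call, put = bsm_call_put(100.0, 100.0, 0.05, 0.2, 1.0)

# Throughput-oriented use: a batch of contracts across strikes.
prices = bsm_batch([(100.0, 90.0 + k, 0.05, 0.2, 1.0) for k in range(20)])
```

The trade-off on the slide is about where the fixed cost sits: a batch amortises setup and transfer overhead over many prices, while a single-step calculation returns one answer with minimal delay.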
  14. Keep-Alive vs AJAX vs Comet vs Streaming

    [Diagram: Polling (:() vs Streaming (:)), message sequences Rx1 to Rx4 and Tx1 to Tx4]
    [Charts: Size (KB), Avg Rate (Kbps), Stdev and Peak Rate per transport (Iframe, XHR, Flash, WebSockets, Silverlight, Native), shown as percentages on a log scale and as a percentage of the maximum]
    Synchronous Periodic Delayed Fat HTTP REST WebSockets Sockets Pub/Sub Pagination Fragmentation Snapshot/Delta Compression Resources
  15. STREAMING

    Messaging? It’s about Smart Data Distribution
    CEP? It’s about sensing & responding to opportunities & threats just in time
    Stream Oriented Messaging
  16. What is CEP (or ESP)?

    “Complex event processing is a new technology for extracting information from distributed message-based systems”
    Dr. David Luckham & Brian Frasca, Program Analysis and Verification Group, Computer Systems Lab, Stanford University, August 18, 1998
    Source: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.56.876
    4 aspects:
    1. A DSL
    2. Continuous Query
    3. Windows
    4. Combinators
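Two of the four aspects listed on this slide, continuous queries and windows, can be illustrated with a small sketch. It keeps a sliding window over the most recent events and re-evaluates a query each time an event arrives. The class and function names and the threshold rule are assumptions chosen for illustration; real CEP engines express this in their own DSL.

```python
from collections import deque


class SlidingWindowAverage:
    """Count-based sliding window that keeps the last `size` values."""

    def __init__(self, size):
        self.window = deque(maxlen=size)

    def on_event(self, value):
        """Add one event and return the current window average."""
        self.window.append(value)
        return sum(self.window) / len(self.window)


def alerts(values, size, threshold):
    """Continuous query: yield (value, average) whenever the windowed
    average exceeds the threshold."""
    query = SlidingWindowAverage(size)
    for value in values:
        average = query.on_event(value)
        if average > threshold:
            yield value, average


# The query is evaluated per event, not per batch.
fired = list(alerts([1, 2, 3, 10, 10], size=3, threshold=4))
```

The window turns an unbounded stream into a bounded working set, which is what lets the query run continuously with fixed memory.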
  17. SOM - Publish/Subscribe

    A/V Struc Unstruc Semi
    A B C D
    Data Sources, Publisher Services, Edge Publishers, Endpoints
    /Services/VideoOnDemand
    /Services/LiveBroadcasts
    /Services/NewsPress
    /Services/NewsSocial
    /Services/MarketData
    /Services/MarketAnalytics
    /Services/AnalystReports
    /Services/AnalystReports
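The slash-separated names on this slide suggest a hierarchical topic space. The sketch below shows one simple way publish/subscribe over such a hierarchy can work: a subscriber registers a topic prefix and receives every message published at or below it. This is an in-process toy with hypothetical names (`TopicBroker`, `subscribe`, `publish`); it is not Diffusion's API, and the `EURUSD` sub-topic is invented for the example.

```python
from collections import defaultdict


class TopicBroker:
    """Toy in-process broker with prefix subscriptions on topic paths."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic_prefix, callback):
        """Register callback(topic, message) for a topic and its children."""
        self._subscribers[topic_prefix.rstrip("/")].append(callback)

    def publish(self, topic, message):
        """Deliver to subscribers of the topic and of each ancestor.
        Returns the number of deliveries made."""
        parts = topic.strip("/").split("/")
        delivered = 0
        for depth in range(1, len(parts) + 1):
            prefix = "/" + "/".join(parts[:depth])
            for callback in self._subscribers.get(prefix, []):
                callback(topic, message)
                delivered += 1
        return delivered


broker = TopicBroker()
received = []
broker.subscribe("/Services", lambda t, m: received.append(("all", t)))
broker.subscribe("/Services/MarketData", lambda t, m: received.append(("md", t)))

count_md = broker.publish("/Services/MarketData/EURUSD", {"bid": 1.3101})
count_news = broker.publish("/Services/NewsPress", "headline")
```

Subscribing at a branch rather than at each leaf keeps the client code independent of how many topics the publisher creates underneath it.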
  18. Any Device

    WebSockets Flash diffusion.js native native mobile or enterprise out of the box on the web and mobile web } Bespoke cascading - Have your cake and eat it Cascading
  19. Smarts

    Diffusion
    Connectors
    Inbound Thread Pool
    Outbound Thread Pool
    Security :- Authorisation & Permissions
    Client Management, Topic Management, Message Management, State Management
    Publishers: System and user defined
    System Monitoring, Web Service, User Services, Media Service
    QoS, SLA, Telemetry, Compression, Throttling, Publish/Subscribe, Topic Aliasing, Hierarchic Topic Space, Snapshot, Delta
    + with: bang for the byte
    - without: blah blah bloat
    optional, essential
    Runtime, Facilities, Features
  20. Rich, Open Protocol

    Fragmented
    Rich Client APIs
    Performance: Ping Pong
    Guaranteed Delivery: ACKs
    Services: Command
    Subscription Management: Subscribe Unsubscribe
    Security: Credentials
    LifeCycle: Close Request
    Query: Fetch
    Messaging: Send Send Ack Required
    (Auto) Subscription Management: Subscribe Unsubscribe
    Security: Credentials Rejected
    LifeCycle: Topic Status Abort Notification
    ACK Paged Fragmented
    Query: Fetch Response
    Messaging: Topic Load Delta
    Client Server
    Fragmented ACK Paged Fragmented
  21. *Any* device

    • Erlang Port of Client Protocol? 2 days.
    • Clients are ‘dumb renderers’
    • Arduino, Nanode, JeeLink – Via MQTT, 2 days.
  22. Use Cases

    • Draw Something & Trade Something
      – Structured Data
        • Drawing: Vector
        • Prices, Orders, Trades: FIX Protocol
  23. Use Cases

    NoSQL – Docs, Values
    Push: Snapshots/Deltas
    FATty Skinny Big Data
    • FAT data
      – Why not just send a patch?
    • Slim data
      – Diffusion is data agnostic
      – JSON
      – Binary
      – Structured
        • Records & Fields
      – Unstructured
      – Video/Audio on Demand
      – Live Video/Audio
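The "why not just send a patch?" idea on this slide can be sketched as a snapshot followed by deltas: the first message carries the full record, and later messages carry only the fields that changed. The delta layout used here (a `set` mapping and a `del` list) and the function names are assumptions for illustration, not Diffusion's wire format.

```python
def make_delta(old, new):
    """Describe how to turn the `old` record into the `new` one."""
    changed = {k: v for k, v in new.items() if k not in old or old[k] != v}
    removed = [k for k in old if k not in new]
    return {"set": changed, "del": removed}


def apply_delta(snapshot, delta):
    """Return a new record with the delta applied to the snapshot."""
    updated = dict(snapshot)
    updated.update(delta["set"])
    for key in delta["del"]:
        updated.pop(key, None)
    return updated


# The publisher sends the snapshot once, then only the differences.
snapshot = {"symbol": "EURUSD", "bid": 1.3101, "ask": 1.3103, "venue": "A"}
latest = {"symbol": "EURUSD", "bid": 1.3102, "ask": 1.3103}

delta = make_delta(snapshot, latest)
rebuilt = apply_delta(snapshot, delta)
```

For a wide record where one field ticks at a time, the delta stays a few bytes while the snapshot does not, which matters most on the constrained mobile links discussed earlier in the deck.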
  24. Compared to NoSQL docs?

    • CouchBase – NoSql, Documents, Schemaless, KV
      – Integrity – MVCC based, ACID
      – Memcache (caching)
    • Asynchronous persistence
    • Working Set > Cache? Yes
    • Tx? Not in the Encina XA sense.
    • Clustering? Based on TAP protocol
    • TAP can be (ab)used to stream CUD events
  25. Example: Forex

    FX External Feed Providers Raw Agg (BBO) x30 x30 x1 Tier xN App Dist www mobi int Portal Dist Internal Dist AJAX HTTP iOS/Android Native Java C++ .NET Native
  26. Example: Forex

    FX External Feed Providers Raw Agg (BBO) x30 x30 x1 Tier xN App Dist www mobi int Portal Dist Internal Dist AJAX HTTP iOS/Android Native Java C++ .NET Native
    Store & Forward MQ
    FX Provider EURUSD FX BBO EURUSD FX Tiers Tier 1 hub hub Dist FX EURUSD hub MQ store store store fwd fwd fwd
  27. Example: Forex

    FX External Feed Providers Raw Agg (BBO) x30 x30 x1 Tier xN App Dist www mobi int Portal Dist Internal Dist AJAX HTTP iOS/Android Native Java C++ .NET Native
    FX Provider EURUSD FX BBO EURUSD FX Tiers Tier 1 hop hop Dist FX EURUSD hop
    KV Store / Data Grid + Continuous Query. Flat ‘namespace’
  28. Example: Forex

    FX External Feed Providers Raw Agg (BBO) x30 x30 x1 Tier xN App Dist www mobi int Portal Dist Internal Dist AJAX HTTP iOS/Android Native Java C++ .NET Native
    Push. Brokerless, Data Listeners + Aliasing ~ CEP Operator
    FX Raw Agg (BBO) x30 x30 Tier xN App Dist www mobi int int
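The Forex slides show raw quotes from many feed providers being aggregated into a single BBO (best bid and offer) stream. The slides do not show how that aggregation is implemented, so the sketch below is only one plausible reading: keep the latest quote per provider and recompute the highest bid and lowest ask on every update. The class name, method names and provider labels are hypothetical.

```python
class BboAggregator:
    """Tracks the latest quote per provider and derives the best bid/offer."""

    def __init__(self):
        self._quotes = {}  # provider -> (bid, ask)

    def on_quote(self, provider, bid, ask):
        """Record a provider's latest quote and return the new BBO."""
        self._quotes[provider] = (bid, ask)
        return self.best()

    def best(self):
        """Return (highest bid, lowest ask), or None before any quote."""
        if not self._quotes:
            return None
        best_bid = max(bid for bid, _ in self._quotes.values())
        best_ask = min(ask for _, ask in self._quotes.values())
        return best_bid, best_ask


agg = BboAggregator()
first = agg.on_quote("provider-a", 1.3100, 1.3104)
second = agg.on_quote("provider-b", 1.3101, 1.3105)
third = agg.on_quote("provider-a", 1.3099, 1.3102)
```

Publishing only when the returned pair changes would collapse thirty raw streams into one much quieter stream, which is the fan-in the "x30 x30 x1" labels appear to describe.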
  29. Client: Bet365 [Gaming]

    In Play Δ Snapshot + Delta In Play In-Play Markets bet365 ~60 boxes Market Engines WWW Event Summary ∏ Summary At Close Final Score Markets ~300 boxes Market Engines WWW IIS Enable Real-Time Increase # Markets Collapse $ - N/W + Tin Increase # Conc. Clients + + + = Benefits
  30. Corp. Streams Big Data Share Event Streams WWW

    Mobile Internet of Things Event Streams Event Streams Flume TAP ??? ??? REST AJAX REST SOAP HTTP Get Put Pub Sub
  31. SUMMARY

    Shard it to scale
    Collocate for speed
    Data smarts is all nuance/tradeoffs
    Use RESTful for Resources
    Stop RPCing Streams
    Observe (Monitor), Orient (Measure), Decide (Just Do It), Act (+Telemetry)
    Smarts is about considered (nuance, tradeoff, advantage, disadvantage) of Bigness, Structure, Mobility …
  32. Q&A?

    Mobile Eye for the Big Data guy
    Twitter: @Push_Technology @darachennis