place to work in the world
• Work at the intersection of market capitalization and technology
  – Growth companies: FileNet and Siebel
  – Value plays: Avaya and Openwave
• Constantly figuring out what building blocks to use, how and where to get teams to use them
is finally bringing its server chips up to speed by introducing the Sandy Bridge-based E5-2600 family of CPUs. The company claims its latest processors outperform the previous generation of Xeons by up to 80 percent in raw speed, while improving per-watt performance by 50 percent. A grand total of 17 different Xeons will be available, ranging in price from $198 to $2,050. The eight-core chips support up to 768GB of RAM, PCI Express 3.0, Hyper-Threading, Turbo Boost, Intel Virtualization -- basically the whole Chipzilla portfolio of tricks.
We have plenty of CPU power and addressable memory
1 storage shelf)
IOPS (TYPICAL 4K RANDOM)       300,000                 200,000
SUSTAINED WRITE IOPS           180,000                 140,000
BANDWIDTH                      3 GB/sec                2 GB/sec
SUSTAINED WRITE BANDWIDTH      1 GB/sec                500 MB/sec
LATENCY                        < 1 ms average latency  < 1 ms average latency
EFFECTIVE CAPACITY*
- AT 5-TO-1 DATA REDUCTION     Up to 100 TB            Up to 50 TB
- AT 10-TO-1 DATA REDUCTION    Up to 200 TB            Up to 100 TB
To produce serious throughput
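The effective-capacity figures above can be sanity-checked with a little arithmetic. The sketch below assumes the figures pair up as 100 TB / 200 TB for one configuration and 50 TB / 100 TB for the other, and that effective capacity is simply raw capacity multiplied by the data-reduction ratio. The function name `implied_raw_tb` is hypothetical, and the raw capacities it derives are implied values, not vendor specifications.

```python
def implied_raw_tb(effective_tb: float, reduction_ratio: float) -> float:
    """Raw capacity implied by an effective capacity at a given reduction ratio.

    Assumes effective = raw * ratio, which is a simplification.
    """
    return effective_tb / reduction_ratio


# (effective TB, reduction ratio) pairs as read from the slide (assumed pairing)
larger_config = [(100, 5), (200, 10)]
smaller_config = [(50, 5), (100, 10)]

for name, rows in (("larger", larger_config), ("smaller", smaller_config)):
    raws = [implied_raw_tb(eff, ratio) for eff, ratio in rows]
    print(name, "configuration implies raw TB:", raws)
```

Under these assumptions both rows of each column imply the same raw capacity, which suggests the two capacity lines describe one device at two different reduction ratios.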
limited to tables, columns
• Table and column names are in metadata
Effect: Small namespaces
• Refer to metadata to extract working data (join, select)
• Working data limited in size
Releases Oracle TimesTen In-Memory Database 11g Release 2.
• SAP: HANA: The Next Wave of In-Memory Computing Technology New Patches
• Memcached
  Memcached is an in-memory key-value store for small chunks of arbitrary data (strings, objects) from results of database calls, API calls, or page rendering.
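The way an in-memory key-value store sits in front of database calls can be sketched as a cache-aside lookup. This is a minimal illustration only: a plain Python dict stands in for the store (no real Memcached client API is used), and `slow_db_lookup`, `get_user`, and `stats` are hypothetical names.

```python
cache = {}               # stand-in for an in-memory key-value store
stats = {"db_calls": 0}  # counts how often the "database" is actually hit


def slow_db_lookup(user_id: int) -> dict:
    """Hypothetical database call; here it just fabricates a row."""
    stats["db_calls"] += 1
    return {"id": user_id, "name": f"user-{user_id}"}


def get_user(user_id: int) -> dict:
    """Cache-aside: check the cache first, fall back to the database on a miss."""
    key = f"user:{user_id}"
    if key in cache:
        return cache[key]            # cache hit, no database call
    value = slow_db_lookup(user_id)  # cache miss, go to the database
    cache[key] = value               # store the result for later requests
    return value


first = get_user(7)
second = get_user(7)
print(stats["db_calls"])  # the second call is served from the cache
```

A real deployment would also need expiry and invalidation, which this sketch leaves out.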
• Often no fixed table or column structure
• Usually weaker consistency
• Often optimized for append and retrieve
• Exploits larger memory, distributed storage
• Takes us past historic limits
• But there is no uniform approach
high-throughput access
• HBase™: A scalable, distributed database that supports structured data
• Cassandra™: A scalable database with no single points of failure.
• MongoDB™: A NoSQL database that optimizes writes
• Hive™: A data warehouse infrastructure that provides data summarization and ad hoc querying.
controversy – http://www.youtube.com/watch?v=b2F-DItXtZs
Note: A lot of big data concepts come from Amazon, Google, and Facebook. MySQL is a free relational database.
for short-term market traction
• Emulate an old interface, and your startup gets volume
  – But they are not optimal at a system level
• Old interfaces were designed for old constraints
  – Will ultimately yield to completely new software architectures
processor
  – IO times drop from 2 milliseconds to 20 microseconds
• 2000 microseconds down to 20
  – 100x improvement
• There is processing power close to the device
• Several “startups” fit flash to historic interfaces
  – XIO
  – Fusion IO
  – Pure Storage
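The latency figures on this slide work out as follows. This is a back-of-the-envelope sketch that takes the slide's 2 ms and 20 microsecond numbers at face value. The "serial I/Os per second" figure assumes one outstanding request at a time, which is an illustrative simplification and not a measured result.

```python
DISK_LATENCY_US = 2_000  # 2 milliseconds, from the slide
FLASH_LATENCY_US = 20    # 20 microseconds, from the slide


def speedup(old_us: float, new_us: float) -> float:
    """How many times faster the new latency is than the old one."""
    return old_us / new_us


def serial_ios_per_second(latency_us: float) -> float:
    """I/Os per second if requests are issued strictly one after another."""
    return 1_000_000 / latency_us


print(speedup(DISK_LATENCY_US, FLASH_LATENCY_US))  # 100.0
print(serial_ios_per_second(DISK_LATENCY_US))      # 500.0
print(serial_ios_per_second(FLASH_LATENCY_US))     # 50000.0
```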
stores (typical in NoSQL databases) go if their indices could be updated in non-volatile memory and not block while waiting on kernel I/O?
Fusion IO’s Brent Compton, Blog, Jan 18 2012
  – It’s deployed in specialized controllers
  – It’s available for general processing
• Storage is a problem
  – But Flash brings it closer to the CPU
• But we are limited by legacy application structure and SQL
  – Block device architecture limits access to logic closer to storage
  – NoSQL offers paths
  – Middleware has yet to emerge
• Major players have not kept up
  – Oracle, SAP, Microsoft will have to adapt or acquire
• The world is full of opportunity
XIO
  – Pure Storage
• NoSQL
  – Couchbase
  – RethinkDB
• Collaboration
  – Jive
  – Moxie
• Communications
  – Aeris
• SaaS IT
  – Mobile Iron
  – Zenprise
• SaaS VOIP
  – Ring Central
• Big Data Apps
  – C3