‘Big Data’ approach
“Using scale-out techniques on commodity hardware in a schema-on-read fashion along with community-defined interfaces”
• Volume: store all incoming sensor data for historical reference
• Variety: dozens of data formats in use in the IoT world, none of them relational
• Velocity: many devices generate data at a high rate, usually as data streams
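To make "schema-on-read" concrete, here is a minimal sketch: raw payloads are stored untouched in whatever format they arrived, and a structure is applied only when the data is read. The record layout, the field names (`device`, `temp_c`) and the `read_temperature` helper are illustrative assumptions, not part of any particular tool.

```python
import csv
import io
import json

# Raw records are kept exactly as they arrived; no schema is enforced on write.
# The formats and field names below are made up for illustration.
RAW_STORE = [
    ("json", '{"device": "t-1", "temp_c": 21.5}'),
    ("csv", "t-2,22.1"),
    ("json", '{"device": "t-3", "temp_c": 19.8, "battery": 0.93}'),
]


def read_temperature(fmt, payload):
    """Apply a schema at read time and return (device, temp_c)."""
    if fmt == "json":
        record = json.loads(payload)
        # Extra fields such as "battery" are simply ignored by this reader.
        return record["device"], float(record["temp_c"])
    if fmt == "csv":
        device, temp = next(csv.reader(io.StringIO(payload)))
        return device, float(temp)
    raise ValueError(f"unknown format: {fmt}")


readings = [read_temperature(fmt, payload) for fmt, payload in RAW_STORE]
```

A different reader could later extract `battery` from the same stored payloads without any migration, which is the main appeal of deferring the schema.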
Apache Kafka: a high-throughput, distributed, persistent publish-subscribe messaging system
• Originates from LinkedIn
• Typically used as a buffer and routing layer in online stream processing
http://kafka.apache.org/
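The buffering role can be illustrated with a toy model: messages are appended to a per-topic log, and each consumer group keeps its own read offset, so several downstream systems can consume the same stream at their own pace. This is an in-memory sketch only (neither persistent nor distributed), and `TopicLog`, `publish` and `poll` are hypothetical names, not the Kafka client API.

```python
from collections import defaultdict


class TopicLog:
    """Append-only log per topic; each consumer group tracks its own offset."""

    def __init__(self):
        self._logs = defaultdict(list)    # topic -> list of messages
        self._offsets = defaultdict(int)  # (topic, group) -> next offset to read

    def publish(self, topic, message):
        """Append a message and return its offset within the topic."""
        self._logs[topic].append(message)
        return len(self._logs[topic]) - 1

    def poll(self, topic, group, max_messages=10):
        """Return the next unread messages for this group and advance its offset."""
        start = self._offsets[(topic, group)]
        batch = self._logs[topic][start:start + max_messages]
        self._offsets[(topic, group)] = start + len(batch)
        return batch


log = TopicLog()
for reading in ("21.5", "21.7", "22.0"):
    log.publish("sensor.temperature", reading)

# Two independent consumer groups read the same stream at different speeds.
storage_batch = log.poll("sensor.temperature", group="storage")
alerting_batch = log.poll("sensor.temperature", group="alerting", max_messages=2)
```

Because messages stay in the log after being read, a slow or newly added consumer does not affect the others.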
Apache HBase: a column-oriented NoSQL database built on top of HDFS
• Based on Google’s BigTable technology
• Scales to 1,000s of commodity servers, billions of rows / PB of data
• Low-latency get/put operations
http://hbase.apache.org/
Apache Drill: interactive analysis at scale, with and without schema
• Easy to support evolving structures of NoSQL data
• Use with and without Hadoop
https://www.mapr.com/blog/how-use-sql-hadoop-drill-rest-json-nosql-and-hbase-simple-rest-client
What about development and testing?
• Synthetic sources
• https://github.com/tdunning/log-synth
• https://github.com/mapr-demos/gess
• https://github.com/mapr-demos/direhose
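The idea behind a synthetic source can be sketched in a few lines: a seeded generator emits reproducible, plausible-looking sensor readings as JSON lines. This is not how the tools listed above work or what their output looks like; the field names, device IDs and value distribution are assumptions chosen for illustration.

```python
import json
import random


def synthetic_readings(n, seed=42, devices=("dev-1", "dev-2", "dev-3")):
    """Yield n fake sensor readings; the same seed gives the same sequence."""
    rng = random.Random(seed)
    ts = 1_400_000_000  # arbitrary starting Unix timestamp
    for _ in range(n):
        ts += rng.randint(1, 5)  # irregular arrival, 1 to 5 seconds apart
        yield {
            "ts": ts,
            "device": rng.choice(devices),
            "value": round(rng.gauss(20.0, 2.0), 2),
        }


# One JSON document per line, ready to be fed into a pipeline under test.
lines = [json.dumps(reading) for reading in synthetic_readings(5)]
```

Fixing the seed keeps test runs repeatable, while changing it produces a fresh data set with the same statistical shape.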
OpenTSDB: a distributed time series database on top of HBase, enabling you …
• to store & index, as well as
• to query & plot
… metrics at scale.
http://opentsdb.net/
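Storing a metric comes down to sending a data point made of a metric name, a timestamp, a value and one or more tags. The sketch below formats a data point in OpenTSDB's line-based `put` command syntax; the helper name, the metric name and the tags are made up, and actually sending the line to a server is left out.

```python
def opentsdb_put_line(metric, timestamp, value, tags):
    """Format one data point as 'put <metric> <timestamp> <value> <tagk=tagv ...>'."""
    if not tags:
        # Every data point needs at least one tag.
        raise ValueError("at least one tag is required per data point")
    # Sort tags so the output is deterministic.
    tag_part = " ".join(f"{key}={val}" for key, val in sorted(tags.items()))
    return f"put {metric} {int(timestamp)} {value} {tag_part}"


line = opentsdb_put_line(
    "sensor.temperature",                 # hypothetical metric name
    1400000000,                           # Unix timestamp in seconds
    21.5,
    {"device": "dev-1", "site": "lab"},   # hypothetical tags
)
```

Tags are what make the later "query & plot" step useful, since they allow filtering and grouping, for example per device or per site.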
InfluxDB: for smaller scales
• Written in Go, no dependencies
• Lots of client libs
• Support for cluster operation via Raft
• Powerful, SQL-like query language:
    select mean(value), percentile(90, value) as percentile_90 from /^stats.*/ group by time(10m) into 10m.:series_name
http://influxdb.com
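What the query above asks for, a mean and a 90th percentile per 10-minute window, can be spelled out in plain Python. This is only a sketch of the computation, not InfluxDB's implementation: the nearest-rank percentile used here is an assumption and may differ from the database's own definition, and the sample points are invented.

```python
import math
from collections import defaultdict

BUCKET_SECONDS = 600  # 10 minutes, mirroring "group by time(10m)"


def nearest_rank_percentile(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of the data at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def downsample(points):
    """Map (unix_ts, value) pairs to {bucket_start: (mean, 90th percentile)}."""
    buckets = defaultdict(list)
    for ts, value in points:
        buckets[ts - ts % BUCKET_SECONDS].append(value)
    return {
        start: (sum(vals) / len(vals), nearest_rank_percentile(vals, 90))
        for start, vals in sorted(buckets.items())
    }


points = [(0, 1.0), (60, 2.0), (120, 3.0), (600, 10.0), (900, 20.0)]
result = downsample(points)
```

The `into` clause in the query additionally writes these aggregates to a new series, so dashboards can read the downsampled data instead of recomputing it from raw points.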
Data Platform
• Deal with raw data natively
• Support a range of workloads; streaming as first-class citizen
• Ensure business continuity
• Provide secure and privacy-aware operation
https://www.mapr.com/blog/key-requirements-iot-data-platform
• Data volume, variety & velocity → not a good fit for RDBMS
• Many open source tools available, iterate and scale as you go
• If you need help re tooling → I’m around!
• And last but not least …