eating the world
• Atwood’s Law: any application that can be written in JavaScript, will eventually be written in JavaScript.
• REST APIs produce/consume data in JSON format (IoT, Mobile)
• Application logs (feeding into Logstash or Elasticsearch)
• Have you ever tried to work with (or read) XML?
• A self-describing text format makes data very portable
• Open Datasets (data.gov, datasf.org, data.cityofnewyork.us, Yelp)
• Performing analytics on text files or string fields is slow
• Variable schema data does not fit well with many formats
• Need ANSI SQL support and useful extensions & functions for JSON
XML
• No schema required: discovered on ingest
• VARIANT data type
• Optimized storage for both relational and complex types
• Optimizations used across both formats
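
To make "schema discovered on ingest" concrete, here is a minimal Python sketch, not the implementation of any particular product. It walks each incoming JSON record and notes which dotted paths appear and which value types they carry. The helper names `flatten` and `discover_schema` and the sample records are hypothetical and chosen only for illustration.

```python
import json
from collections import defaultdict


def flatten(value, prefix=""):
    """Yield (dotted_path, leaf_value) pairs for a parsed JSON value."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from flatten(child, f"{prefix}.{key}" if prefix else key)
    elif isinstance(value, list):
        # All array elements share one path, marked with "[]".
        for child in value:
            yield from flatten(child, f"{prefix}[]")
    else:
        yield prefix, value


def discover_schema(json_lines):
    """Map every path seen in the input to the sorted type names found there."""
    schema = defaultdict(set)
    for line in json_lines:
        for path, leaf in flatten(json.loads(line)):
            schema[path].add(type(leaf).__name__)
    return {path: sorted(types) for path, types in schema.items()}


# Hypothetical sample records with a variable schema.
records = [
    '{"id": 1, "user": {"name": "ann"}, "tags": ["a", "b"]}',
    '{"id": "2", "user": {"name": "bob", "age": 31}}',
]

schema = discover_schema(records)
for path, types in sorted(schema.items()):
    print(path, types)
```

The second record adds `user.age` and changes the type of `id`, and neither needs a schema declared up front. A real engine would presumably also record statistics and choose a storage layout per path, which this sketch does not attempt.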
• We can have data pipelines with both self-defined and pre-defined schemas
• We can have SQL on self-defining data and complex types (JSON, etc.)
• We can be agile without building fragile systems, taking the best of both worlds
• Have your JSON and SQL too!
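
As a rough illustration of "SQL on self-defining data", the sketch below uses Python's standard `sqlite3` module with a small user-defined SQL function. It is a stand-in for the idea, not the system these slides describe. The function `json_get`, the `events` table, and the sample payloads are all hypothetical.

```python
import json
import sqlite3


def json_get(doc, path):
    """Return the value at a dotted path inside a JSON string, or None."""
    node = json.loads(doc)
    for key in path.split("."):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    # Scalars pass through; nested objects/arrays are returned as JSON text.
    if node is None or isinstance(node, (str, int, float)):
        return node
    return json.dumps(node)


conn = sqlite3.connect(":memory:")
# Expose the Python helper to SQL as a two-argument function.
conn.create_function("json_get", 2, json_get)
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")

# Hypothetical payloads: no fixed schema, fields vary per record.
payloads = [
    '{"device": {"type": "mobile"}, "ms": 120}',
    '{"device": {"type": "iot"}, "ms": 40, "fw": "1.2"}',
    '{"device": {"type": "mobile"}, "ms": 95}',
    '{"ms": 10}',
]
conn.executemany("INSERT INTO events (payload) VALUES (?)", [(p,) for p in payloads])

rows = conn.execute(
    """
    SELECT json_get(payload, 'device.type') AS device_type, COUNT(*) AS n
    FROM events
    GROUP BY device_type
    """
).fetchall()
print(rows)
```

Because `json_get` re-parses the string payload on every row, this also shows why analytics over plain string fields is slow, and why a dedicated data type with optimized storage is attractive.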