Speaker Deck

NoSQL no more: SQL on Druid with Apache Calcite (Strata SJ 2018)

by Imply

Published March 7, 2018 in Technology

Druid is an analytics-focused, distributed, scale-out data store. Existing Druid clusters have scaled to petabytes of data and trillions of events, ingesting millions of events every second. Before version 0.10, Druid could be queried only in a JSON-based language that many users found unfamiliar.

Enter Apache Calcite. It includes an industry-standard SQL parser, validator, and JDBC driver, as well as a cost-based relational optimizer. Calcite bills itself as “the foundation for your next high-performance database” and is used by Hive, Drill, and a variety of other projects. Druid uses Calcite to power Druid SQL, a standards-based query API that vaults Druid out of the NoSQL world and into the SQL world.
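
As a rough illustration of what a standards-based query API means in practice, the sketch below builds an HTTP request that posts a SQL statement to a Druid broker's SQL endpoint (`/druid/v2/sql/`). The broker address, the `wikipedia` datasource, and the `page` column are assumptions made for the example, not details from the talk; the request is only constructed here, and `run_query` would need a reachable cluster.

```python
import json
import urllib.request

# Assumption: a Druid broker at this address; host and port are placeholders.
BROKER_URL = "http://localhost:8082/druid/v2/sql/"

# Hypothetical datasource and column names, used only for illustration.
SQL = """
SELECT page, COUNT(*) AS edits
FROM wikipedia
WHERE __time >= CURRENT_TIMESTAMP - INTERVAL '1' DAY
GROUP BY page
ORDER BY edits DESC
LIMIT 5
"""


def build_sql_request(sql, url=BROKER_URL):
    """Build (but do not send) an HTTP POST carrying a Druid SQL query."""
    body = json.dumps({"query": sql.strip()}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(sql, url=BROKER_URL):
    """Send the query and return the decoded JSON rows; needs a live broker."""
    with urllib.request.urlopen(build_sql_request(sql, url)) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Only inspect the request; nothing is sent over the network here.
    req = build_sql_request(SQL)
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

The same statement could instead be issued through the JDBC driver mentioned above; the HTTP form is used here only because it needs nothing beyond the standard library.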