Slide 1

Slide 1 text

CRATE.IO
Using Grafana and CrateDB
Processing Your Machine Data
Paul Adams | Engineering Manager

Slide 2

Slide 2 text

What is CrateDB?
CrateDB is an open source, distributed SQL database with integrated search that makes it simple to store and analyze massive amounts of structured and unstructured data in real time.
We love data:
• Time series
• Geospatial
• BLOBs
• (Un)structured

Slide 3

Slide 3 text

How did we get here?

Slide 4

Slide 4 text

2014 Founding Mission: A SQL DB for the Machine Data Era
Market need:
• A machine data analytics database for the SQL mainstream
Key design elements:
• Millions of data points/second
• Real-time SQL query performance
• Time series, geo, predictive, search…
• Ad-hoc & aggregations!
• Any structured or unstructured data
• Simple scaling
• SQL access
[Chart: data volume in exabytes, axis 0 to 5,200; segment labels 20%, 15%, 20%, 35%, 10%. Source (data in 2020 only): EMC/IDC report 2012]

Slide 5

Slide 5 text

Built on a decade of recent innovations…
2003-2006: ANALYTIC DB IS BORN (1st SQL Disruption)
• > 10x faster & cheaper
• Distributed DB
• Columnar DB
2007-2014: MORE TECH INNOVATION, ANALYTIC DATABASES STAGNATE
• Smart devices/mobile apps
• Cloud computing
• NoSQL
• DevOps
• Microservices/containers
2014…: SQL FOR MACHINE DATA & REAL-TIME ANALYTICS (2nd SQL Disruption)
• Scale-out & unstructured for mainstream
• Microservices architecture
• Multi-model
Last decade of innovations
Bigtable

Slide 6

Slide 6 text

CrateDB Combines Best of All Worlds
Perfect for workloads with:
• Multi-structured data
• High velocity INSERT
• Fast queries
• Ad hoc
• Time series
• Geospatial
• Aggregates
• Scale-out architecture
• Fault-tolerant
• BI integration
All easy enough for SQL mainstream to operate
Enabled by distributed query and microservices architecture innovations
[Diagram labels: SQL, SEARCH, OPEN SOURCE]

Slide 7

Slide 7 text

Open Machine Data Stack
• Integrates easily
• Low learning curve
• Greatest flexibility
• No lock-in
[Diagram labels: Apps, DB, Input, Metabase, CRATE]

Slide 8

Slide 8 text

CrateDB Features & Architecture

Slide 9

Slide 9 text

CrateDB Key Ideas
• Distributed SQL for scale out
• Simple scalability
• Masterless, shared-nothing microservices architecture
• Auto-sharding & partitioning
• Realtime search & aggregations
• Columnar, multi-model
• Dynamic schema
• Time series, geospatial support
• In-memory speed
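
To make the auto-sharding idea above more concrete, here is a minimal Python sketch of hash-based routing: each row is assigned to one of a fixed number of shards by hashing a routing value, so writes spread across nodes without manual placement. The function name `shard_for`, the use of CRC32, and the shard count of 6 are illustrative assumptions, not CrateDB's actual hash function or defaults.

```python
import zlib
from collections import Counter

NUM_SHARDS = 6  # assumed shard count, chosen only for illustration


def shard_for(routing_value: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a routing value (e.g. a primary key) to a shard number.

    CRC32 is a stand-in hash; a real database uses its own hash function.
    """
    return zlib.crc32(routing_value.encode("utf-8")) % num_shards


if __name__ == "__main__":
    # Route 10,000 hypothetical sensor IDs and count rows per shard.
    counts = Counter(shard_for(f"sensor-{i}") for i in range(10_000))
    for shard, rows in sorted(counts.items()):
        print(f"shard {shard}: {rows} rows")
```

Because the mapping depends only on the routing value and the shard count, any node can compute where a row lives, which is one way a masterless design can avoid a central lookup.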

Slide 10

Slide 10 text

SQL features:
• ANSI SQL (subset with extensions)
• JOINs (fully distributed)
• Arrays and nested objects
• Different types
• Information Schema
• Cluster/Node state exposed via tables
• Partitioned tables
• Geospatial support
• Full text search
• Powerful text processing
• Subselects (en route)
• Streaming support
• Common relational operators:
  • Projection
  • Grouping (incl. HAVING)
  • Aggregations
  • Sorting
  • Limit/Offset
  • WHERE clause
• Import/Export
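
The common relational operators listed above can be seen together in a single query. The sketch below uses Python's built-in `sqlite3` module purely as a stand-in engine so that it runs without a cluster; it is not CrateDB, and the `readings` table and its rows are invented for illustration.

```python
import sqlite3

# sqlite3 is only a stand-in engine so the example runs locally;
# the table and rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (sensor TEXT, ts INTEGER, value REAL)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?, ?)",
    [
        ("a", 1, 10.0), ("a", 2, 14.0), ("a", 3, 12.0),
        ("b", 1, 30.0), ("b", 2, 34.0),
        ("c", 1, 5.0),
    ],
)

QUERY = """
SELECT sensor, COUNT(*) AS n, AVG(value) AS avg_value  -- projection + aggregation
FROM readings
WHERE ts >= 1                                          -- WHERE clause
GROUP BY sensor                                        -- grouping
HAVING COUNT(*) >= 2                                   -- HAVING filters groups
ORDER BY avg_value DESC                                -- sorting
LIMIT 10 OFFSET 0                                      -- limit/offset
"""

rows = conn.execute(QUERY).fetchall()
for sensor, n, avg_value in rows:
    print(sensor, n, avg_value)
```

Sensor "c" has a single row, so the HAVING clause drops it; the remaining groups come back ordered by their average value.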

Slide 11

Slide 11 text

Graphical Admin UI + Shell

Slide 12

Slide 12 text

NYC 311 + Cabs

Slide 13

Slide 13 text

What is 311?
• Source of government information and non-emergency services:
  • Heating
  • Sanitation
  • Potholes
• ~24 million incidents reported since 2009
• 100+ million calls
• Data published monthly!

Slide 14

Slide 14 text

shall we explore?

Slide 15

Slide 15 text

CrateDB + Grafana

Slide 16

Slide 16 text

CrateDB data source
• Version 0.2.0 now available!
• Implementation by raintank / Grafana Labs (thanks!)
• Feedback appreciated
• Support for all of CrateDB:
  • Data types
  • Scalar functions: avg, min, max, abs, etc.
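
A time series panel usually plots one value per time interval instead of every raw row, which is where functions such as avg, min, and max come in. The pure-Python sketch below shows that bucketing step conceptually. It is not the data source plugin's implementation, and the function name `bucket_avg` and the sample readings are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def bucket_avg(points, interval_s):
    """Average (timestamp_s, value) points per fixed-width time bucket.

    Returns a sorted list of (bucket_start_s, average_value) pairs.
    """
    buckets = defaultdict(list)
    for ts, value in points:
        bucket_start = ts - (ts % interval_s)  # floor to the interval boundary
        buckets[bucket_start].append(value)
    return [(start, mean(values)) for start, values in sorted(buckets.items())]


if __name__ == "__main__":
    # Hypothetical readings: (seconds since start, value)
    sample = [(0, 1.0), (20, 3.0), (60, 10.0), (90, 20.0), (130, 7.0)]
    for start, avg in bucket_avg(sample, 60):
        print(start, avg)
```

In a database-backed setup the same grouping would normally be pushed down into the SQL query so that only the aggregated points travel to the dashboard.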

Slide 17

Slide 17 text

time series machine data

Slide 18

Slide 18 text

Gantner q.brixx + station
• Data Acquisition Device
• Readings to 100 kHz
• Output direct to CrateDB
• Designed for industrial and test engineering environments
• Voltage, Current, Temperature, Acceleration…
• The kind of thing the TSA loves to find in your hand luggage
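
For a sense of scale, the short calculation below estimates what a 100 kHz reading rate means per day. It assumes a single channel sampled continuously at the full rate and 16 bytes per raw reading (an 8-byte timestamp plus an 8-byte value); both are assumptions made for this back-of-the-envelope sketch, not figures from the device's specification.

```python
SAMPLE_RATE_HZ = 100_000        # "readings to 100 kHz" from the slide
SECONDS_PER_DAY = 24 * 60 * 60
BYTES_PER_READING = 16          # assumption: 8-byte timestamp + 8-byte value


def readings_per_day(rate_hz: int = SAMPLE_RATE_HZ) -> int:
    """Number of readings one channel produces in 24 hours."""
    return rate_hz * SECONDS_PER_DAY


def raw_bytes_per_day(rate_hz: int = SAMPLE_RATE_HZ,
                      bytes_per_reading: int = BYTES_PER_READING) -> int:
    """Uncompressed payload size per channel per day, before any indexing."""
    return readings_per_day(rate_hz) * bytes_per_reading


if __name__ == "__main__":
    print(f"{readings_per_day():,} readings/day")
    print(f"{raw_bytes_per_day() / 1e9:.2f} GB/day (raw, one channel)")
```

Under these assumptions a single channel produces 8.64 billion readings, or roughly 138 GB of raw data, per day.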

Slide 19

Slide 19 text

CRATE.IO
Thank you
[email protected]
@therealpadams