& Subject to Change

The following is intended to outline our general product direction. We constantly innovate with our customers, and the specifics may change. Do not share this information without explicit permission from Databricks.

"This information is provided to outline Databricks’ general product direction and is for informational purposes only. Customers who purchase Databricks services should make their purchase decisions relying solely upon services, features, and functions that are currently available. Unreleased features or functionality described in forward-looking statements are subject to change at Databricks’ discretion and may not be delivered as planned or at all."
(vCPU) Clusters
• Abstracting away instance type selection
• Simplifies cluster creation: users describe resource needs rather than pick instances
• Optimized selection of spot instances to reduce interruptions
• Instance-capacity aware for improved availability

Status
AWS: Private Preview (current status); Public Preview by end of Q4
Azure: TBD
GCP: TBD
Tune for UC Managed Tables
Complete table management for best performance out of the box

Problem
For optimal performance, tables require periodic tuning operations.

Solution
Unity Catalog will auto tune tables.
• For tables under 1 TB, no tuning required (4x faster)
• Auto Tune takes care of the data layout

HOW?
• Optimized Writes for Unpartitioned Tables
• Asynchronous Auto Compaction
• Ingestion Time Clustering

Status
AWS: GA in Q4
GCP: GA in Q4

[Figure: data layout before Auto Tune vs. after Auto Tune]
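To make the compaction idea concrete, here is a minimal Python sketch of grouping small files into larger rewrite batches. It is an illustration only, not Databricks' implementation: the function name `plan_compaction`, the greedy grouping strategy, and the 128 MB target size are all assumptions introduced for this example.

```python
def plan_compaction(file_sizes_mb, target_mb=128):
    """Group small files into batches whose combined size stays within target_mb.

    Hypothetical helper: the 128 MB default is an assumed target, not a
    documented Databricks setting.
    """
    batches = []
    current, current_size = [], 0
    for size in sorted(file_sizes_mb):
        if size >= target_mb:
            continue  # already large enough; leave the file untouched
        if current and current_size + size > target_mb:
            # Close the current batch; rewriting a single file gains nothing.
            if len(current) > 1:
                batches.append(current)
            current, current_size = [], 0
        current.append(size)
        current_size += size
    if len(current) > 1:
        batches.append(current)
    return batches


# Three small files are merged into one batch; the 200 MB file is left alone.
print(plan_compaction([10, 20, 30, 200]))
```

Each returned batch stands for a set of small files that a background job could rewrite as one larger file, which is the general effect compaction aims for.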
Updates with Photon + Deletion Vectors
Up to 10x speed-up in MERGE, UPDATE & DELETE queries with Photon

Problem
• Updates to tables are expensive because of file rewrites

Solution
• Using Deletion Vectors, Photon avoids the need to rewrite files during MERGE, UPDATE & DELETE, speeding up updates by up to 10x.

Status
AWS: Q1 - Gated Public Preview
GCP: Q1 - Gated Public Preview
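The general idea behind a deletion vector can be sketched in a few lines of Python: instead of rewriting a data file when rows are removed, the writer records the positions of the removed rows, and readers skip those positions. The `DataFile` class and its methods below are hypothetical names for illustration; this is a conceptual toy, not the Delta Lake or Photon implementation.

```python
class DataFile:
    """Toy model of an immutable data file plus a deletion vector."""

    def __init__(self, rows):
        self.rows = tuple(rows)  # file contents, never rewritten
        self.deleted = set()     # deletion vector: positions of removed rows

    def delete_where(self, predicate):
        """Mark matching rows as deleted; return how many were newly marked."""
        newly_marked = 0
        for pos, row in enumerate(self.rows):
            if pos not in self.deleted and predicate(row):
                self.deleted.add(pos)
                newly_marked += 1
        return newly_marked

    def scan(self):
        """Return the live rows, skipping positions in the deletion vector."""
        return [row for pos, row in enumerate(self.rows)
                if pos not in self.deleted]


f = DataFile([1, 2, 3, 4])
f.delete_where(lambda r: r % 2 == 0)
print(f.scan())  # live rows only
print(f.rows)    # underlying file contents are unchanged
```

The delete touches only the small set of positions, while the stored rows stay as they were; that is the property that lets an engine avoid rewriting whole files on each update.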
when SLA is not important
Good starting point for BI style workloads
Step up as you need lower latency or intensive workloads

AWS
Cluster size  Worker count  Driver size   vCPU  Memory (GiB)  Instance storage (GB)  Networking bandwidth (Gbps)
2X-Small      1             i3.2xlarge    8     61            1 x 1900 NVMe          Up to 10
X-Small       2             i3.2xlarge    8     61            1 x 1900 NVMe          Up to 10
Small         4             i3.4xlarge    16    122           2 x 1900 NVMe          Up to 10
Medium        8             i3.8xlarge    32    244           4 x 1900 NVMe          10
Large         16            i3.8xlarge    32    244           4 x 1900 NVMe          10
X-Large       32            i3.16xlarge   64    488           8 x 1900 NVMe          25
2X-Large      64            i3.16xlarge   64    488           8 x 1900 NVMe          25
3X-Large      128           i3.16xlarge   64    488           8 x 1900 NVMe          25
4X-Large      256           i3.16xlarge   64    488           8 x 1900 NVMe          25

Azure
Cluster size  Worker count  Driver size  vCPU  Memory (GiB)  Max NICs  Networking bandwidth (Gbps)
2X-Small      1             E8ds_v4      8     64            4         4
X-Small       2             E8ds_v4      8     64            4         4
Small         4             E16ds_v4     16    128           8         8
Medium        8             E32ds_v4     32    256           8         16
Large         16            E32ds_v4     32    256           8         16
X-Large       32            E64ds_v4     64    504           8         30
2X-Large      64            E64ds_v4     64    504           8         30
3X-Large      128           E64ds_v4     64    504           8         30
4X-Large      256           E64ds_v4     64    504           8         30

The vCPU, memory, storage, and networking columns describe the driver instance.
The instance size of all workers is i3.2xlarge on AWS.
The instance size of all workers is Standard_E8ds_v4 on Azure.
On Azure, each driver and worker has two 128 GB Standard LRS managed disks attached. Attached disks are charged hourly.
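The sizing figures follow a simple pattern: the worker count doubles with each step up in cluster size, and every worker has 8 vCPUs (i3.2xlarge on AWS, Standard_E8ds_v4 on Azure). A short Python sketch of that arithmetic follows; the function names are hypothetical, and the numbers are taken directly from the figures on this slide.

```python
SIZES = ["2X-Small", "X-Small", "Small", "Medium", "Large",
         "X-Large", "2X-Large", "3X-Large", "4X-Large"]

# Driver vCPUs per cluster size (the same on AWS and Azure in this table).
DRIVER_VCPU = [8, 8, 16, 32, 32, 64, 64, 64, 64]

# Every worker is an 8-vCPU instance on both clouds.
WORKER_VCPU = 8


def worker_count(size):
    """Worker count doubles at each size step: 1, 2, 4, ..., 256."""
    return 2 ** SIZES.index(size)


def total_vcpu(size):
    """Driver vCPUs plus the vCPUs of all workers for a cluster size."""
    i = SIZES.index(size)
    return DRIVER_VCPU[i] + worker_count(size) * WORKER_VCPU


for size in SIZES:
    print(f"{size:>9}: {worker_count(size):>3} workers, {total_vcpu(size):>4} vCPUs")
```

For example, a Medium cluster has 8 workers, so it totals 32 driver vCPUs plus 8 x 8 worker vCPUs, or 96 vCPUs.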