Slide 1

Slide 1 text

Unlocking the Power of Databricks SDK
Serge Smertin, Practice Lead, Engineering, Databricks

Slide 2

Slide 2 text

Product safe harbor statement
This information is provided to outline Databricks’ general product direction and is for informational purposes only. Customers who purchase Databricks services should make their purchase decisions relying solely upon services, features, and functions that are currently available. Unreleased features or functionality described in forward-looking statements are subject to change at Databricks’ discretion and may not be delivered as planned or at all.

Slide 3

Slide 3 text

©2022 Databricks Inc. All rights reserved
About Serge
▪ Using Apache Spark since ~2015
▪ At Databricks since 2019
▪ Created Databricks Terraform Provider
▪ Author of Databricks SDKs
▪ Years in cybersecurity and payments before that

Slide 4

Slide 4 text

Databricks REST APIs

Slide 5

Slide 5 text

▪ 540 methods across 80 services
▪ 10 auth types across 3 clouds
▪ 300 URLs across 2 endpoints (Account APIs and Workspace APIs)
▪ 73 iterators across 7 different pagination types

Slide 6

Slide 6 text


Slide 7

Slide 7 text

Databricks Software Development Kits

Slide 8

Slide 8 text

2. Complete
3. Consistent
4. Unified Authentication
5. Minimal dependencies

Slide 9

Slide 9 text

▪ 10 OSS projects
▪ 5 Databricks products so far
▪ 1.5M downloads per week
▪ 3 core SDKs: Python, Go, Java
▪ C#, R, Rust: coming in Databricks Labs

Slide 10

Slide 10 text

▪ Integrate with the established ecosystem: Data Engineering doesn’t end within the workspace; it has to be integrated into existing web applications, which are often written in Java or Python.
▪ Advanced workflow orchestration straight from notebooks: the Python SDK is well suited to building advanced workflows.
▪ Build more tools, small and large: Administrators and Site Reliability Engineers often need to automate their work, either through shell scripts or Kubernetes extensions.

Slide 11

Slide 11 text

Terraform
▪ Create workspaces, catalogs, metastores
▪ Enforce permissions for data and workspace objects
▪ Guarantee environment portability across dev/staging/prod
▪ One of the components for disaster recovery

SDK
▪ Integrate with the existing deployment systems
▪ Create custom workflows around Databricks using your favorite programming language
▪ Create new web services that call Databricks APIs (it’s now easier with OAuth)
▪ Use in the OSS projects

CLI
▪ Ad-hoc experimentation
▪ Shell one-liners
▪ Managing local authentication profiles
▪ Deploying projects with Application Bundles
▪ Synchronising notebooks from IDE to Databricks Workspace

Slide 12

Slide 12 text

Designed to work with IDEs: Autocompletion, Service discovery, Documentation

Slide 13

Slide 13 text


Slide 14

Slide 14 text

Diagram (labels only):
▪ Consumers: Databricks Connect V2, Terraform Provider, VSCode Extension, Open-Source Ecosystem, Other enterprise apps
▪ Outputs: Public Docs, Python SDK, Java SDK, JavaScript SDK, C#, R, Rust …, CLI, Go SDK
▪ Core: Databricks SDK Generator, API SPECIFICATION
▪ Testing and example verification infrastructure

Slide 15

Slide 15 text

Python SDK: the same code just works in a notebook, an IDE, and a CI/CD setup!

Slide 16

Slide 16 text

Development
Authenticate through Databricks CLI, Azure CLI, Visual Studio Code, or in Databricks Notebooks.
    $ az login
    $ databricks auth login

Production or CI
Authenticate through environment variables. Leverage Kubernetes secrets and/or CI runner secret redaction.
    $ export DATABRICKS_HOST=...
    $ export ARM_CLIENT_ID=...
    $ export ARM_TENANT_ID=...
    $ export ARM_CLIENT_SECRET=...
    $ python3 run-app.py

    from databricks.sdk import WorkspaceClient
    w = WorkspaceClient()
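
For a production or CI run, it can help to fail fast when one of the environment variables from the slide is missing. The following is a minimal sketch of such a pre-flight check; `REQUIRED_VARS` and `missing_auth_vars` are hypothetical names, not part of the Databricks SDK, and the host value in the example is a placeholder.

```python
import os
from typing import List, Mapping

# Variable names taken from the slide; the helper itself is hypothetical
# and is not part of the Databricks SDK.
REQUIRED_VARS = [
    "DATABRICKS_HOST",
    "ARM_CLIENT_ID",
    "ARM_TENANT_ID",
    "ARM_CLIENT_SECRET",
]


def missing_auth_vars(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variables that are unset or empty in `env`."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


# Example with an explicit mapping instead of the real environment.
example_missing = missing_auth_vars({"DATABRICKS_HOST": "https://example.invalid"})
```

A script like `run-app.py` could call `missing_auth_vars()` before constructing `WorkspaceClient()` and exit with a readable message if the list is non-empty.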

Slide 17

Slide 17 text

Want to use SSO with webapps? Easy! You just need a way to store credentials in a session. The SDK will help with the rest.

Slide 18

Slide 18 text

Consistent iteration interfaces

Slide 19

Slide 19 text

Iterators
Focus on iterating 73 different entities, not 7 different pagination types

Slide 20

Slide 20 text

Long Running Operations

Slide 21

Slide 21 text

Long running operations
Repetitive code pre-generated for you
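
The repetitive code in question is typically a polling loop. Below is a minimal sketch of such a loop, written from scratch for illustration; `wait_for_state` and the state names are hypothetical and are not the SDK's actual implementation.

```python
import time
from typing import Callable, Iterable


def wait_for_state(get_state: Callable[[], str],
                   success: Iterable[str],
                   failure: Iterable[str],
                   timeout_s: float = 60.0,
                   poll_s: float = 1.0) -> str:
    """Poll get_state() until it reports a success or failure state."""
    success, failure = set(success), set(failure)
    deadline = time.monotonic() + timeout_s
    while True:
        state = get_state()
        if state in success:
            return state
        if state in failure:
            raise RuntimeError(f"operation failed with state {state}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"timed out in state {state}")
        time.sleep(poll_s)


# Usage with a fake resource that becomes RUNNING on the third poll.
_states = iter(["PENDING", "PENDING", "RUNNING"])
final_state = wait_for_state(lambda: next(_states),
                             success={"RUNNING"},
                             failure={"ERROR"},
                             poll_s=0.01)
```

Every long-running API needs some variant of this loop with its own state names and timeouts, which is why generating it once per operation saves callers from rewriting it.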

Slide 22

Slide 22 text

Logging
Safely and conveniently debug your API requests when you troubleshoot, with secret values redacted. Multi-threading friendly.
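
As a sketch of how debug logging might be switched on in Python, the snippet below uses only the standard `logging` module. It assumes the Python SDK emits its request logs under the `databricks.sdk` logger name; the format string is an arbitrary choice.

```python
import logging
import sys

# Send log records to stderr; keep third-party libraries at INFO.
logging.basicConfig(
    stream=sys.stderr,
    level=logging.INFO,
    format="%(asctime)s [%(name)s][%(levelname)s] %(message)s",
)

# Assumption: the SDK logs under the "databricks.sdk" logger name.
sdk_logger = logging.getLogger("databricks.sdk")
sdk_logger.setLevel(logging.DEBUG)
```

Raising the level for a single named logger keeps the rest of the application quiet while troubleshooting API calls.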

Slide 23

Slide 23 text

dbutils
Call Databricks Utilities natively outside of notebooks

Slide 24

Slide 24 text

# Go
$ go get github.com/databricks/databricks-sdk-go@latest

# Python
$ pip install databricks-sdk

# Java
<dependency>
  <groupId>com.databricks</groupId>
  <artifactId>databricks-sdk-java</artifactId>
  <version>0.1.0</version>
</dependency>

Slide 25

Slide 25 text

No content

Slide 26

Slide 26 text

Learn more at the summit!

Tell us what you think
• We kindly request your valuable feedback on this session.
• Please take a moment to rate and share your thoughts about it.
• You can conveniently provide your feedback and rating through the Mobile App.

Get trained and certified
• Visit the Learning Hub at the Databricks Zone!
• Take complimentary certification at the event; come by the Certified Lounge
• Visit our Databricks Learning website for more training, courses and workshops! databricks.com/learn

What to do next?
• Discover more related sessions in the mobile app!
• Visit the Demo Booth: Experience innovation firsthand!
• More Activities: Engage and connect further at the Databricks Zone!

Databricks Events App

Slide 27

Slide 27 text

Thank you.