Slide 1

Slide 1 text

BUILDING DATABRICKS INTEGRATIONS WITH GOLANG
Serge Smertin
©2024 Databricks Inc. — All rights reserved

Slide 2

Slide 2 text

CAN BE EVERYTHING: FROM INFRASTRUCTURE TO APPLIED LLM USE

Slide 3

Slide 3 text

About Serge
▪ At Databricks since 2019
▪ Author of Databricks Terraform Provider (written in Go)
▪ Author of Databricks SDK for Go (and Python, Java, …)
▪ Initiated replatforming of Databricks CLI into Go (from Python)
▪ Driving Databricks Labs
▪ With love-hate attitude towards GoLang since 2020

Slide 4

Slide 4 text

- RELEASE NOTES WITH LLMS
- CLEANUP OF DATABRICKS WORKSPACES

Slide 5

Slide 5 text

- ~50 REPOSITORIES

Slide 6

Slide 6 text


Slide 7

Slide 7 text


Slide 8

Slide 8 text

1. UNRELEASED COMMITS
2. GIT DIFF PER COMMIT
3. SPLIT DIFF PER FILE
4. SUMMARIZE FILE CHANGE
5. SUMMARIZE ALL CHANGES
6. WRITE CHANGELOG.MD
7. WRITE ANNOUNCEMENT

Slide 9

Slide 9 text

var fileDiffTemplate = MessageTemplate(`
Here is the commit message terminated by --- for the context:
{{.Message}}
---
Do not hallucinate.
You are Staff Software Engineer, and you are reviewing one file at a time in a unified diff format.
Do not use phrases like "In this diff", "In this pull request", or "In this file".
Do not mention file names, because they are not relevant for the feature description.
If new methods are added, explain what these methods are doing.
If existing functionality is changed, explain the scope of these changes.
Please summarize the input as a single paragraph of text written in American English.
Your target audience is software engineers, who adopt your project.
If the prompt contains ordered or unordered lists, rewrite the entire response as a paragraph of text.
`)

func (lln *llNotes) Commit(ctx context.Context, commit *github.RepositoryCommit) (History, error) {
	…
	// download the raw diff of the commit from GitHub
	err := lln.http.Do(ctx, "GET",
		fmt.Sprintf("https://github.com/%s/%s/commit/%s.diff", lln.org, lln.repo, commit.SHA),
		httpclient.WithResponseUnmarshal(&buf))
	var httpErr *httpclient.HttpError
	if errors.As(err, &httpErr) && httpErr.StatusCode == 404 {
		return History{
			AssistantMessage(fmt.Sprintf("Commit %s was not found", commit.SHA)),
		}, nil
	}
	// cap the commit message at 15,000 space-separated words
	tokens := strings.Split(commit.Commit.Message, " ")
	if len(tokens) > 15_000 {
		commit.Commit.Message = strings.Join(tokens[:15_000], " ")
	}
	return lln.explainDiff(ctx, History{
		fileDiffTemplate.AsSystem(commit.Commit),
	}, &buf)
}

Slide 10

Slide 10 text


Slide 11

Slide 11 text

func (lln *llNotes) Talk(ctx context.Context, h History) (History, error) {
	logger.Debugf(ctx, "Talking with AI:\n%s", h.Excerpt(80))
	response, err := lln.w.ServingEndpoints.Query(ctx, serving.QueryEndpointInput{
		Name:      lln.model,
		Messages:  h.Messages(),
		MaxTokens: lln.cfg.MaxTokens,
	})
	if err != nil {
		return nil, fmt.Errorf("llm: %w", err)
	}
	for _, v := range response.Choices {
		h = h.With(AssistantMessage(v.Message.Content))
	}
	return h, nil
}
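
`Talk` relies on a `History` type with `With` and `Excerpt` methods that the slide does not show. A minimal sketch of how such a type might behave follows, assuming a history is a slice of role-tagged messages and that `With` returns a new value instead of mutating the receiver. All names here (`message`, `history`, `with`, `excerpt`) are hypothetical lowercase stand-ins, not the author's implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// message is a hypothetical chat message with a role and text content.
type message struct {
	Role    string
	Content string
}

func systemMessage(s string) message    { return message{Role: "system", Content: s} }
func userMessage(s string) message      { return message{Role: "user", Content: s} }
func assistantMessage(s string) message { return message{Role: "assistant", Content: s} }

// history is an ordered conversation.
type history []message

// with returns a new history with m appended and leaves the receiver
// untouched, so earlier snapshots of the conversation stay valid.
func (h history) with(m message) history {
	out := make(history, 0, len(h)+1)
	out = append(out, h...)
	return append(out, m)
}

// excerpt renders one line per message, cutting each content to at most
// n runes. It is meant for debug logs, not for sending to a model.
func (h history) excerpt(n int) string {
	if n < 0 {
		n = 0
	}
	lines := make([]string, 0, len(h))
	for _, m := range h {
		r := []rune(m.Content)
		if len(r) > n {
			r = r[:n]
		}
		lines = append(lines, m.Role+": "+string(r))
	}
	return strings.Join(lines, "\n")
}

func main() {
	h := history{systemMessage("You summarize diffs."), userMessage("+ added retries")}
	h = h.with(assistantMessage("Retries were added."))
	fmt.Println(h.excerpt(80))
}
```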

Slide 12

Slide 12 text


Slide 13

Slide 13 text

NO LANGCHAIN. PURE DATABRICKS REST API. ONE BINARY AS RESULT.

Slide 14

Slide 14 text

… after two weeks of part-time effort in between other work and meetings, I was able to reduce release time:
~1 hour (explaining the release and writing announcements) ~~> ~7 minutes (removing LLM hallucinations and other minor edits)

Slide 15

Slide 15 text


Slide 16

Slide 16 text

HOW TO WIPE THOUSANDS OF TEST JOBS / USERS … AT SCALE

Slide 17

Slide 17 text


Slide 18

Slide 18 text

“static fixtures” are created by Terraform
“dynamic fixtures” that require a cleanup

Slide 19

Slide 19 text

Databricks Labs Watchdog

Slide 20

Slide 20 text

KEEP TEST ENVIRONMENTS CLEAN AND SECURE

Slide 21

Slide 21 text

1. LIST ALL WORKSPACES
2. LIST 30 OBJECT TYPES
3. VERIFY EXPECTED CONFIGS
4. WAIT FOR ANY CI JOBS TO FINISH
5. RANDOMIZED PARALLEL DELETE

Slide 22

Slide 22 text


Slide 23

Slide 23 text
