enechain's Challenge: Migrating from Scheduled Query to dbt Together with AI Agents

Coalesce on the Road Tokyo 2025

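About me
周晟聡 (akizhou), Data Engineer, Data Platform Desk
• Joined enechain Corporation in May 2021
• Worked on an integration and visualization service for energy supply-demand and market price data, then on a company-wide common data pipeline for applications
• Currently responsible for promoting data utilization and improving the data platform, centered on dbt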

Agenda
1. Challenge
2. Context Engineering
3. Workflow as Structure
4. Additional Enhancements
5. Next Steps

01 Challenge

Legacy System
SQL files defining transformation logic
• 100+ models
• Some models over 1,000 lines
• Many models not managed in code
Cron-based scheduling for execution dependencies
• Execution order is managed by fixed time offsets

Problems of Legacy System
• No lineage visibility
• No documentation
• Execution has no awareness of model dependencies
> The Data Platform Team took on the migration and began evaluating how to do it
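To make the "no model dependency awareness" problem concrete, here is a minimal Python sketch of dependency-aware ordering, the alternative to fixed cron offsets. It is not enechain's code. The dependency map is hypothetical, with model names borrowed from the migration plan example later in the deck (stg_products is invented for illustration).

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: model -> set of upstream models it reads from.
dependencies = {
    "stg_sales": set(),
    "stg_products": set(),
    "int_sales_enriched": {"stg_sales", "stg_products"},
    "fct_daily_sales": {"int_sales_enriched"},
}


def execution_order(deps):
    """Return the models in an order where every upstream model runs first."""
    return list(TopologicalSorter(deps).static_order())


order = execution_order(dependencies)
print(order)
```

With fixed time offsets, a slow upstream job silently breaks everything downstream. An order derived from declared dependencies avoids that, and declaring dependencies is what ref() and source() do in dbt.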

Migration Challenge
Ideal → Reality
• Well-documented models → Limited documentation
• Well-shared business logic → Tribal knowledge
• Schema and columns described → No metadata
The existing system was built by a few data analysts responding to a wide range of requirements, so it contains many complex, person-dependent models.

Why dbt?
1. Industry standard: dbt Core was already partly in use internally
2. Lineage visibility: understand data flow
3. Modularity: ref(), source(), macros
4. Wide compatibility: works with various DWHs and BI tools

Why dbt Platform?
1. Small team: no resources for infrastructure setup and maintenance
2. dbt Catalog: a rich documentation interface out of the box
3. Semantic Layer: centralized metrics
4. dbt Mesh: cross-project collaboration

Migration Overview

Our Approach
No dbt expertise, no documentation
> Let AI do the heavy lifting
> Allocate resources to review and learning

AI tools used at enechain

AI tools we chose

02 Context Engineering

Naive Approach: Our First Attempt

What went wrong?
• Context was uneven, too large in some areas and sparse in others → context overload
• No guardrails were provided → AI just guessed
• "Just figure it out" approach → results were inconsistent

dbt Documentation as Initial Context

How we created the Docs
1. AI collected dbt best practices via web search (the WebSearch tool)
2. AI filtered out information irrelevant to our infrastructure (e.g. BigQuery-only)
3. Humans reviewed and curated the output
> The review process became our dbt education

Connecting Docs to AI Agents

3 Purposes of the Docs
1. AI Context
 ◦ Unified context, soft guardrails
2. Architecture Decision Records
 ◦ Design decisions documented
3. Developer Handbook
 ◦ Team learning resource
> AI becomes the coach, not just a code generator

But this was not enough
• AI started ignoring parts of the context
• "Read all" = AI interprets implicitly → inconsistent output
> Needed structure to guide AI to the right context for each task

03 Workflow as Structure

Human-in-the-Loop Design
1. Quality assurance: human guardrails
2. Reveals hidden knowledge: implicit context gets drawn out
3. Prevents AI from going off track: reduces the cost of mistakes and rework
> Increase overall output predictability with human checkpoints

Migration as Workflow
[Diagram: analyze-bq-query → plan-migration → generate-dbt-model → review-migration, with human review of the outputs]
> Each command only reads the docs relevant to its task
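A minimal Python sketch of "each command only reads relevant docs", assuming the mapping can be expressed as a lookup table. The paths come from the command slides in this deck; the function name and the dictionary itself are hypothetical, not enechain's implementation. review-migration is left out because each of its sub-agents reads its own docs.

```python
# Hypothetical lookup: workflow command -> mandatory context docs.
# Paths are copied from the command slides; "02_modeling/" covers layer
# responsibilities, naming, and materialization.
COMMAND_CONTEXT = {
    "analyze-bq-query": ["05_schema_and_evolution/migration_patterns.md"],
    "plan-migration": ["02_modeling/"],
    "generate-dbt-model": ["03_sql_and_quality/", "04_testing_and_documentation/"],
}


def context_for(command):
    """Return only the docs a command must read, never the whole doc set."""
    try:
        return COMMAND_CONTEXT[command]
    except KeyError:
        raise ValueError(f"unknown command: {command}") from None


print(context_for("generate-dbt-model"))
```

Failing loudly on an unknown command, instead of falling back to "read all", keeps the agent from quietly reverting to the overloaded context that caused inconsistent output.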

Intermediate Docs

Command 1: analyze-bq-query
Steps:
1. Parse SQL
2. Identify sources
3. Extract logic
4. Map dependencies
Mandatory Context: 05_schema_and_evolution/migration_patterns.md
Output: migration_analysis/{query}_analysis.md
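In the talk the AI agent performs these steps. As a rough illustration of step 2 (identify sources) only, the Python sketch below uses a naive regular expression, not a real SQL parser, to pull backtick-quoted table references out of a query. The query is hypothetical, modeled on the daily_sales_summary example in this deck, and the column names product_id and id are invented.

```python
import re

# Matches backtick-quoted BigQuery references such as `project.dataset.table`.
TABLE_REF = re.compile(r"`([\w-]+)\.([\w-]+)\.([\w-]+)`")


def identify_sources(sql):
    """Return unique fully qualified table names in order of first appearance."""
    seen = []
    for match in TABLE_REF.finditer(sql):
        name = ".".join(match.groups())
        if name not in seen:
            seen.append(name)
    return seen


# Hypothetical legacy scheduled query.
legacy_sql = """
SELECT s.date, SUM(s.amount) AS total_amount
FROM `project.dataset.sales` AS s
JOIN `project.dataset.products` AS p ON s.product_id = p.id
WHERE s.status = 'completed'
GROUP BY s.date
"""

sources = identify_sources(legacy_sql)
print(sources)
```

A regex like this misses unquoted references and would match backticked names inside comments, which is one reason the analysis output is reviewed by a human before planning starts.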

Analysis output format
# Analysis: daily_sales_summary
## Sources
- `project.dataset.sales` (primary)
- `project.dataset.products` (joined)
## Transformations
- Aggregation: SUM(amount) GROUP BY date
- Filter: WHERE status = 'completed'
## Dependencies
- Upstream: sales, products
- Schedule: Daily 18:00 JST

Command 2: plan-migration
Steps:
1. Map to dbt layers
2. Recommend materialization
3. Assess risks
Mandatory Context: 02_modeling/ (layer responsibilities, naming, materialization)
Output: migration_plans/{query}_plan.md

Migration plan output format
# Plan: daily_sales_summary → fct_daily_sales
## Layer: marts (fact table)
## Materialization: incremental
## Model Structure
- stg_sales (staging)
- int_sales_enriched (intermediate)
- fct_daily_sales (mart)
## Risks
- Large table: consider partitioning

Command 3: generate-dbt-model
Steps:
1. Convert syntax
2. Add ref(), source()
3. Generate YAML + tests
Mandatory Context: 03_sql_and_quality/, 04_testing_and_documentation/
Output: SQL + YAML + test files
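Step 2 (add ref(), source()) can be pictured with the Python sketch below. It is an illustration under stated assumptions, not the agent's actual logic: it assumes the BigQuery dataset name doubles as the dbt source name, and MIGRATED_MODELS is a hypothetical lookup of tables that already have a dbt model.

```python
import re

TABLE_REF = re.compile(r"`([\w-]+)\.([\w-]+)\.([\w-]+)`")

# Hypothetical lookup: fully qualified table -> dbt model that replaced it.
MIGRATED_MODELS = {"project.dataset.products": "stg_products"}


def add_refs(sql, migrated=MIGRATED_MODELS):
    """Swap hard-coded table names for dbt ref() / source() calls."""

    def replace(match):
        project, dataset, table = match.groups()
        full_name = f"{project}.{dataset}.{table}"
        if full_name in migrated:
            # Already a dbt model: depend on it through ref().
            return "{{ ref('" + migrated[full_name] + "') }}"
        # Assumption: the dataset name is used as the dbt source name.
        return "{{ source('" + dataset + "', '" + table + "') }}"

    return TABLE_REF.sub(replace, sql)


converted = add_refs(
    "SELECT * FROM `project.dataset.sales` "
    "JOIN `project.dataset.products` USING (product_id)"
)
print(converted)
```

Once every table reference goes through ref() or source(), dbt can derive lineage and execution order from the SQL itself, which is what the legacy cron setup lacked.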

Command 4: review-migration
Steps:
1. Run review sub-agents
2. Aggregate findings
Mandatory Context: Each sub-agent reads only the docs relevant to its own area
Output: reviews/{model}_review.md

Review output format
# Review: fct_daily_sales
## Code Quality: Pass
- SQL style follows guidelines
## dbt Best Practices: Warning
- Consider adding partition_by config
## Test Coverage: Pass
- PK tests present
## Business Logic: Pass
- Matches original query output
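The "aggregate findings" step can be sketched in Python as below, assuming each sub-agent returns one finding per aspect. The Finding class, the severity ranking, and the "Fail" status are assumptions for illustration; the deck only shows Pass and Warning.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    aspect: str  # e.g. "Code Quality"
    status: str  # "Pass" or "Warning"; "Fail" is an assumed extra level
    note: str


SEVERITY = {"Pass": 0, "Warning": 1, "Fail": 2}


def aggregate(model, findings):
    """Merge per-aspect findings into one report plus the worst status seen."""
    lines = [f"# Review: {model}"]
    for finding in findings:
        lines.append(f"## {finding.aspect}: {finding.status}")
        lines.append(f"- {finding.note}")
    worst = max(
        (finding.status for finding in findings),
        key=SEVERITY.__getitem__,
        default="Pass",
    )
    return "\n".join(lines), worst


report, worst = aggregate(
    "fct_daily_sales",
    [
        Finding("Code Quality", "Pass", "SQL style follows guidelines"),
        Finding("dbt Best Practices", "Warning", "Consider adding partition_by config"),
        Finding("Test Coverage", "Pass", "PK tests present"),
        Finding("Business Logic", "Pass", "Matches original query output"),
    ],
)
print(report)
```

Surfacing the worst status next to the full report gives the human reviewer a quick signal about where to look first.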

04 Additional Enhancements

Bottleneck Shifted to Reviews
Automating the implementation made the review process the new bottleneck

Bottleneck Shifted to Reviews
During code generation
• Split context via separate commands for each workflow step
• Human reviews intermediate outputs
Final output needs holistic review
• Multiple aspects to check → context overload again
• Introduced sub-agents, each focused on one aspect

Review Subagents: Minimal

Review Subagents: Balanced

Review Subagents: Comprehensive

Utilize Wait Time
While Claude Code is implementing, use Cursor Ask mode to
• Read through generated documents
• Understand generated models
• Deepen knowledge of dbt
> Don't just wait; go through the documents while AI works, and turn wait time into learning and review

dbt-mcp + dbt Fusion
dbt Fusion's ahead-of-time analysis:
• Validates SQL syntax, references, and types locally
• dbt Core parses Jinja/YAML only
• dbt Fusion = full static SQL analysis without a warehouse
> AI agents get structured feedback

dbt-mcp + dbt Fusion
dbt Fusion catches these without expensive warehouse calls:
• Syntax errors (missing commas, typos)
• Invalid ref() or config() usage
• Schema issues and column mismatches
• Project structure errors (e.g. missing sources)

dbt-mcp + dbt Fusion = Hard Guardrail
AI gets systematic feedback from the dbt engine:
• Errors are caught before a human even sees the output
• No need to run models to find out they don't work
[Diagram: BEFORE and AFTER views of the iterative feedback loop. Both show Prompt, Context, Reference, Generated model, Iterative Feedback Loop, Human review, and Validated model; the AFTER view adds dbt-mcp.]
> AI gets systematic, structured feedback from dbt
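The iterative feedback loop can be sketched in Python as follows. This is a sketch of the control flow only and does not call dbt-mcp or dbt Fusion. The generate and validate callables, the retry limit, and the fake implementations below are all hypothetical stand-ins for the AI agent and for the validator's structured error output.

```python
from typing import Callable, List, Tuple


def feedback_loop(
    generate: Callable[[List[str]], str],
    validate: Callable[[str], List[str]],
    max_attempts: int = 3,
) -> Tuple[str, int]:
    """Regenerate a model until validation returns no errors.

    Returns the validated SQL and the number of attempts used.
    """
    errors: List[str] = []
    for attempt in range(1, max_attempts + 1):
        model_sql = generate(errors)  # errors from the previous round, if any
        errors = validate(model_sql)
        if not errors:
            return model_sql, attempt
    raise RuntimeError(f"still failing after {max_attempts} attempts: {errors}")


# Stand-in for the AI agent: fixes the ref() once it is told about the error.
def fake_generate(errors: List[str]) -> str:
    model = "stg_sales" if errors else "stg_sale"
    return "select * from {{ ref('" + model + "') }}"


# Stand-in for static validation: flags a ref() to a model that does not exist.
def fake_validate(sql: str) -> List[str]:
    return [] if "ref('stg_sales')" in sql else ["invalid ref(): model not found"]


validated_sql, attempts = feedback_loop(fake_generate, fake_validate)
print(attempts, validated_sql)
```

The human reviewer only sees what leaves this loop, so errors of the kind listed on the previous slide are dealt with before review begins.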

Why not dbt_project_evaluator?
• Running the evaluator can be costly
• For now, dbt Fusion is sufficient as a quick guardrail
• We will introduce it in the refactoring phase

05 Reflection and Next Steps

Reflection
• Insufficient context for AI → Documentation as context
• Inconsistent AI output quality → Structured workflow + commands
• Review bottleneck → Subagents for multi-aspect review
• Need quick guardrails → dbt-mcp + dbt Fusion

Next steps
Post-Migration Focus → Refactoring and Optimization
• Refactor Models: Ensure strict adherence to dbt best practices
• Optimize Dependencies: Improve cross-model dependency efficiency
• Enforce Standards: Use the dbt project evaluator to check best-practice compliance mechanically

Key Takeaways
1. Invest in structure upfront: set up context, workflows, and the like first
2. Design for predictability over autonomy: humans step in at key points to correct course
3. Learn by reviewing
