[Elixir] Solving the GraphQL N+1 Problem with Dataloader

Solving the GraphQL N+1 Problem with Dataloader
2022/11/14 ElixirImp#26: A big LT meetup to close out "Elixir's 10th anniversary" (work-related talks are OK too)

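About Me
@koga1020_ @koga1020 koga1020.com

👨‍💻 Self-introduction
- Shozo Koga (koga1020)
- Software engineer living in Fukuoka
- Organizer of fukuoka.ex

💡 Recent interests
- Web application development with Elixir and Phoenix
- Building microservices; studying implementation patterns
- Good food and good sake 🍶
- Living comfortably in my own home 🏠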

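PR: WEB+DB PRESS Vol.131 carries the "はじめてのElixir" (a first introduction to Elixir) feature 🎉
- Published in WEB+DB PRESS Vol.131, on sale 2022/10/22
- An introductory Elixir feature, following the Phoenix feature in Vol.127
- Co-written by six volunteers from the Elixir community, in six chapters
  - Introduction to Elixir: relationship with Erlang, basic operations, data types
  - Functions: anonymous functions, named functions, modules
  - Pattern matching: matching variables, lists, and tuples
  - Control structures: conditionals, error handling, macros
  - Collection operations: the Enum module, comprehensions, streams
  - Project development: Mix projects, ExUnit, ExDoc
- Recommended for getting to the "I totally understand Elixir" stage! If you haven't looked at it yet, please do
https://gihyo.jp/magazine/wdpress/archive/2022/vol131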

Today's theme
This LT summarizes the following article:
https://zenn.dev/koga1020/articles/14a49472394b22

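Agenda
- GraphQL and the N+1 problem
- What Dataloader is
- How it is used with Absinthe

Because of time constraints I won't be able to cover the following, sorry 🙇
- A detailed explanation of the GraphQL specification itself
- Basic usage of Absinthe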

GraphQL and the N+1 problem

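N+1 happens easily in GraphQL
- Example: fetch a list of posts together with their associated comments
- With naive query execution that does not use DataLoader or a similar tool, resolvers run sequentially for each field
  - Query to fetch the posts: 1 time
  - Query to fetch the comments of each post: once per post (N times)
- Because of how GraphQL is structured, you run into the N+1 problem quickly unless you take countermeasures

    query {
      posts {
        title
        body
        comments {
          name
          body
        }
      }
    }

    SELECT * FROM posts
    SELECT * FROM comments WHERE post_id = 1
    SELECT * FROM comments WHERE post_id = 2

[Diagram: a tree with a root node, three Post nodes, and five Comment nodes]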
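The query counts on this slide can be reproduced with a small sketch in plain Elixir. Nothing here uses Ecto or Absinthe. The NPlusOneSketch module, its in-memory data, and the Agent-based counter are all hypothetical stand-ins for a database, where each function call represents one SQL round trip.

```elixir
defmodule NPlusOneSketch do
  # In-memory stand-ins for the posts and comments tables (hypothetical data).
  @posts [%{id: 1, title: "a"}, %{id: 2, title: "b"}, %{id: 3, title: "c"}]
  @comments [
    %{post_id: 1, body: "c1"},
    %{post_id: 1, body: "c2"},
    %{post_id: 2, body: "c3"}
  ]

  # Each "query" function below represents one round trip and bumps the counter.
  def all_posts(counter) do
    tick(counter)
    @posts
  end

  # Like: SELECT * FROM comments WHERE post_id = ?
  def comments_for(counter, post_id) do
    tick(counter)
    Enum.filter(@comments, &(&1.post_id == post_id))
  end

  # Like: SELECT * FROM comments WHERE post_id IN (...)
  def comments_for_many(counter, post_ids) do
    tick(counter)

    @comments
    |> Enum.filter(&(&1.post_id in post_ids))
    |> Enum.group_by(& &1.post_id)
  end

  # Naive resolution: 1 query for the posts plus 1 query per post.
  def naive(counter) do
    for post <- all_posts(counter) do
      Map.put(post, :comments, comments_for(counter, post.id))
    end
  end

  # Batched resolution: 1 query for the posts plus 1 query for all comments.
  def batched(counter) do
    posts = all_posts(counter)
    by_post = comments_for_many(counter, Enum.map(posts, & &1.id))
    for post <- posts, do: Map.put(post, :comments, Map.get(by_post, post.id, []))
  end

  defp tick(counter), do: Agent.update(counter, &(&1 + 1))
end

{:ok, naive_counter} = Agent.start_link(fn -> 0 end)
{:ok, batched_counter} = Agent.start_link(fn -> 0 end)

naive_result = NPlusOneSketch.naive(naive_counter)
batched_result = NPlusOneSketch.batched(batched_counter)

naive_queries = Agent.get(naive_counter, & &1)
batched_queries = Agent.get(batched_counter, & &1)

IO.inspect({naive_queries, batched_queries}, label: "queries (naive, batched)")
```

With three posts the naive path issues 1 + 3 queries and the batched path issues 2, while both return the same data.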

How can it be avoided?
From "Production Ready GraphQL" [1]
- One method is to load the data ahead of time (look ahead) and have resolvers refer to it
  → This does avoid N+1, but it is hard to do in GraphQL, where the client decides which fields are used
  → Today the approach called Dataloader is mainstream

Now that we see the problem, what can we do about this? There are multiple ways to look at the problem. The first one is to ask ourselves if we could not find a way to load data ahead of time, instead of waiting for child resolvers to load their small part of data. In this case, this could mean for the friends resolver to “look ahead” and see that the best friend will need to be loaded for each. It could then preload this data and each bestFriend resolver could simply use a part of this preloaded data.

This solution is not the most popular one, and that’s understandable. A GraphQL server will usually let clients query data in the representation they like. This means our loading system would need to adapt to every single scenario of data requirements that could appear very far into a query. It is definitely doable, but from what I’ve seen so far, most solutions out there are quite naive and will eventually break in very complex data loading scenarios.

Instead, the more popular approach at the moment is one that is commonly called “DataLoader”. This is because the first implementation of this pattern for GraphQL was released as a JavaScript library called DataLoader.

1. https://book.productionreadygraphql.com/

What is Dataloader?
- The JavaScript implementation is published as the reference implementation [1]

  DataLoader is a generic utility to be used as part of your application’s data fetching layer to provide a simplified and consistent API over various remote data sources such as databases or web services via batching and caching.

- It is a mechanism that provides "batching" and "caching" when fetching data from a data source
- Key point 💡: GraphQL is simply its main use case; it is not a mechanism built only for GraphQL
- Based on this reference implementation, OSS implementations adapted to each language are available for other languages
- For Elixir it is provided as absinthe-graphql/dataloader
- Its README also says it was inspired by facebook/dataloader and changed to better suit Elixir

  Dataloader provides an easy way efficiently load data in batches. It’s inspired by https://github.com/facebook/dataloader, although it makes some small API changes to better suit Elixir use cases.

1. https://github.com/graphql/dataloader

How batching works
[Figure: "A GraphQL query execution using lazy loaders" [2]]
- Resolvers do not access the data store directly; a loader sits in between
- The loader collects the identifiers (e.g. IDs) needed to fetch the data and loads them together later (batch)
- Once keys are queued, when the batch function actually runs depends on the language and the use case
  - Node.js: process.nextTick() [1]
  - Elixir: load/4 accumulates the queue in the source struct, and run/1 executes it
  - For GraphQL, the run/1 call is inserted as an Absinthe plugin

1. https://github.com/graphql/dataloader/issues/180
2. https://xuorig.medium.com/the-graphql-dataloader-pattern-visualized-3064a00f319f
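The collect-then-flush idea can be sketched with a tiny loader written in plain Elixir. This is not the Dataloader library's API. MiniLoader, its load/2, run/1 and get/2 functions, and the batch_calls field are hypothetical names made up for illustration, and the comment data is in memory. The sketch also keeps a cache, so a key that has already been loaded is never fetched again.

```elixir
defmodule MiniLoader do
  # Hypothetical, simplified loader: queue keys first, fetch them in one batch later.
  defstruct batch_fun: nil, queue: [], cache: %{}, batch_calls: 0

  # batch_fun takes a list of keys and returns a map of key => value.
  def new(batch_fun), do: %__MODULE__{batch_fun: batch_fun}

  # Queue a key unless it is already cached. No data store access happens here.
  def load(%__MODULE__{} = loader, key) do
    if Map.has_key?(loader.cache, key) do
      loader
    else
      %{loader | queue: [key | loader.queue]}
    end
  end

  # Flush the queue with a single batch call and cache the results.
  def run(%__MODULE__{queue: []} = loader), do: loader

  def run(%__MODULE__{} = loader) do
    keys = loader.queue |> Enum.reverse() |> Enum.uniq()
    results = loader.batch_fun.(keys)

    %{
      loader
      | cache: Map.merge(loader.cache, results),
        queue: [],
        batch_calls: loader.batch_calls + 1
    }
  end

  # Read a value that an earlier run/1 has loaded.
  def get(%__MODULE__{cache: cache}, key), do: Map.fetch!(cache, key)
end

# In-memory stand-in for the comments table (hypothetical data).
comments = [
  %{post_id: 1, body: "c1"},
  %{post_id: 1, body: "c2"},
  %{post_id: 2, body: "c3"}
]

# One call stands in for: SELECT * FROM comments WHERE post_id IN (...)
fetch_comments = fn post_ids ->
  grouped =
    comments
    |> Enum.filter(&(&1.post_id in post_ids))
    |> Enum.group_by(& &1.post_id)

  Map.new(post_ids, fn id -> {id, Map.get(grouped, id, [])} end)
end

loader =
  MiniLoader.new(fetch_comments)
  |> MiniLoader.load(1)
  |> MiniLoader.load(2)
  |> MiniLoader.load(3)
  |> MiniLoader.run()

# Key 1 is already cached, so nothing is queued and this run/1 does no work.
loader = loader |> MiniLoader.load(1) |> MiniLoader.run()

post1_bodies = loader |> MiniLoader.get(1) |> Enum.map(& &1.body)
IO.inspect({loader.batch_calls, post1_bodies}, label: "batch calls, post 1 comments")
```

In this sketch the caller triggers run/1 by hand. The point on the slide is that something else decides when the flush happens: process.nextTick() in Node.js, or an Absinthe plugin in the Elixir GraphQL case.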

Aside: implementing batching in Elixir without dataloader
- The batching documentation [1] also shows an implementation that uses Absinthe.Middleware.Batch, but it has pain points: it is hard to test when using Ecto, and it offers no abstraction that follows Ecto's DSL
- When implementing a GraphQL server with Absinthe, adopting Dataloader is generally the safe choice
- Reference: https://sevenseacat.net/posts/2021/querying-batches-with-absinthe/

1. https://hexdocs.pm/absinthe/batching.html

Sample walkthrough

Before introducing Dataloader

    defmodule DataloaderSampleWeb.Schema do
      # ... omitted

      object :post do
        field :id, non_null(:id)
        field :title, non_null(:string)
        field :body, non_null(:string)

        field :comments, non_null(list_of(:comment)) do
          resolve(fn post, _, _ ->
            # Comments are fetched separately for every post,
            # which causes N+1
            comments =
              Ecto.assoc(post, :comments)
              |> DataloaderSample.Repo.all()

            {:ok, comments}
          end)
        end
      end

      # ... (omitted)

      query do
        field :posts, non_null(list_of(non_null(:post))) do
          resolve(&BlogResolver.list_posts/3)
        end
      end
    end

After introducing Dataloader

    defmodule DataloaderSampleWeb.Schema do
      # Import the helpers so that dataloader/1 can be called
      import Absinthe.Resolution.Helpers

      object :post do
        # ... omitted

        field :comments, non_null(list_of(:comment)) do
          # Call dataloader/1 with the same value (source) that is passed
          # as the second argument to Dataloader.add_source/3
          resolve(dataloader(Blog))
        end
      end

      # ... omitted

      # Add context/1 (declared defoverridable by the Absinthe.Schema macro)
      def context(ctx) do
        loader = Dataloader.add_source(Dataloader.new(), Blog, Blog.data())
        Map.put(ctx, :loader, loader)
      end

      # Add plugins/0 (declared defoverridable by the Absinthe.Schema macro)
      def plugins() do
        [Absinthe.Middleware.Dataloader] ++ Absinthe.Plugin.defaults()
      end
    end

Implementing the data source passed to add_source/3

    defmodule DataloaderSample.Blog do
      # ... omitted

      def data() do
        # Dataloader.Ecto.new/2 builds a Dataloader.Ecto struct
        Dataloader.Ecto.new(DataloaderSample.Repo, query: &query/2)
      end

      # Adding more clauses to query/2 lets you branch on conditions
      # when the query is executed
      def query(queryable, _params) do
        queryable
      end

      # Example (these clauses would replace the catch-all clause above):
      # def query(Post, %{has_admin_rights: true}), do: Post
      # def query(Post, _), do: from p in Post, where: is_nil(p.deleted_at)
      # def query(queryable, _), do: queryable

      # ...
    end

Rough summary of the implementation
- Add the loader to the context so that it can be referenced while Absinthe.Middleware is processing
- Add data sources to the loader
  - In Phoenix, creating one source per Context works well [1]

    In a Phoenix application you’ll generally have one source per context, so that each context can control how its data is loaded.

- Add the middleware to plugins so that batching runs during resolution
- Implement the resolver with the dataloader helper function
- Create the source with Dataloader.Ecto.new
  - Change the query option to control the query that is executed
- Thinking of a source as roughly a queue may help it click [2]
  - In Phoenix, picture one queue per Context, flushed at resolution time (= the batch is executed)

1. https://hexdocs.pm/absinthe/dataloader.html
2. Reading the actual implementation makes this easier to picture

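Summary
- When implementing a GraphQL server, you often run into the N+1 problem
- As a countermeasure, Dataloader, a batching and caching mechanism, was devised, and it is available in many languages
- In Elixir, Absinthe supports it
  - It fits well with Ecto; for example, it resolves associations for you based on the association name
  - The query used when fetching in a batch can also be customized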

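Appendix
Both are in English, but the following two books were very instructive, so take a look if you are interested
- Craft GraphQL APIs in Elixir with Absinthe
  - Covers how to implement with Absinthe
- Production Ready GraphQL
  - Not specific to Elixir; a book on the key points of GraphQL itself
  - Dataloader is covered too, of course
  - Recommended because it let me study the topic comprehensively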
