$30 off During Our Annual Pro Sale. View Details »

Active Record Query Quiz

pocke
October 27, 2023

Active Record Query Quiz

pocke

October 27, 2023
Tweet

More Decks by pocke

Other Decks in Programming

Transcript

  1. Active Record Query Quiz
    Kaigi on Rails 2023
    27th, Oct. 2023

    View Slide

  2. pp self
    ● Masataka Pocke Kuwabara
    ● Work for Money Forward, Inc.
    ○ Rails engineer for クラウド会計Plus
    ○ A maintainer of RBS
    ● Live in Okayama
    Read it with the conference app→
    (app.kaigionrails.org)

    View Slide

  3. View Slide

  4. View Slide

  5. About this talk

    View Slide

  6. Goal
    ● You can choose the appropriate method for querying w/ AR
    BTW, this talk focuses on the Rails beginner. It may be boring for Rails experts.

    View Slide

  7. Quiz Rule
    ● I'll show you two kinds of code
    ○ using Active Record
    ○ doing the same things
    ● You choose the faster code
    ● Rails 7.1 / mysql2 adapter and MySQL 8.0 / ruby 3.3.0-dev

    View Slide

  8. Q1. Finding articles of blog_id = 42

    View Slide

  9. Q1. Finding articles of blog_id = 42
    # A
    Article
    .all
    .select { _1.blog_id == 42 }
    # B
    Article
    .all
    .where(blog_id: 42)
    .to_a

    View Slide

  10. Answer. Finding articles of blog_id = 42
    # A
    Article
    .all
    .select { _1.blog_id == 42 }
    # Article Load (2.4ms) SELECT
    `articles`.* FROM `articles`
    # B
    Article
    .all
    .where(blog_id: 42)
    .to_a
    # Article Load (1.1ms) SELECT
    `articles`.* FROM `articles` WHERE
    `articles`.`blog_id` = 42
    B is faster than A!
    💯
    󰢄

    View Slide

  11. Explanation: Finding articles of blog_id = 42
    ● Lazy evaluation
    ○ `all` and `where` do not execute SQL, but `to_a` does.
    ○ It is delayed until when the records are necessary
    ● Basically, RDBMS should do it if RDBMS can.

    View Slide

  12. Counterexample: Finding articles of blog_id = 42
    ● `select` can be faster if it's called several times
    ○ Because Active Record caches the result to `@records`
    # A
    articles = Article.all
    x = articles.select {
    _1.is_active
    }
    # Article Load (3.1ms) SELECT
    `articles`.* FROM `articles`
    y = articles.select {
    not _1.is_active
    }
    # nothing
    # B
    x = Article
    .where(is_active: true)
    .to_a
    # Article Load (1.5ms) SELECT `articles`.* FROM
    `articles` WHERE `articles`.`is_active` = TRUE
    y = Article
    .where(is_active: false)
    .to_a
    # Article Load (1.9ms) SELECT `articles`.* FROM
    `articles` WHERE `articles`.`is_active` = FALSE
    Note: Enumerable#partition is more appropriate.

    View Slide

  13. Pro tips
    To investigate what SQL is executed
    ● Use the debug log
    ● Use `AR::Relation#to_sql` method
    ○ irb(main):001> Article.all.where(blog_id: 42).to_sql
    => "SELECT `articles`.* FROM `articles` WHERE `articles`.`blog_id` =
    42"

    View Slide

  14. Q2. Checking existence

    View Slide

  15. Q2: Checking existence
    # A
    Article.all
    .exists?(blog_id: 42)
    # B
    Article.all
    .any? { _1.blog_id == 42 }

    View Slide

  16. Answer: Checking existence
    # A
    Article.all
    .exists?(blog_id: 42)
    # Article Exists? (0.9ms) SELECT 1
    AS one FROM `articles` WHERE
    `articles`.`blog_id` = 42 LIMIT 1
    # B
    Article.all
    .any? { _1.blog_id == 42 }
    # Article Load (1.7ms) SELECT
    `articles`.* FROM `articles`
    💯 󰢄
    A is faster than B!

    View Slide

  17. Explanation: Checking existence
    ● Lazy execution, same as Q1.
    ● Active Record has specific methods such as exists?.
    ○ Use them instead of Enumerable's method
    ● Enumerable methods are useful, but it is not optimized for AR.
    ○ You can easily distinguish by that the method takes a block yielding an AR object
    ○ But…

    View Slide

  18. You can make any? faster by a hack…
    ● `any? { … }` needs records to execute the block
    ● On the other hand, it does not need records if it does not execute the
    block
    ● We have `RubyVM::AbstractSyntaxTree.of` to solve this problem!

    View Slide

  19. View Slide

  20. View Slide

  21. Commercial for
    Money Forward

    View Slide

  22. Money Forward
    ● Nov. 9: RubyWorld Conference 2023. https://2023.rubyworld-conf.org/ja/
    ○ I will talk about RBS.
    ○ At Matsue (registration closed😢) and Online
    ● Nov. 16: 各社の技術広報が明かす「RubyKaigiスポンサーの裏話」運営ノウハウ
    やコミュニティへの想い https://moneyforward.connpass.com/event/298325/
    ○ Acha-san will talk about sponsoring RubyKaigi
    ○ Online event
    ● We have an X account @moneyforwardDev!

    View Slide

  23. Weekly event to watch RubyKaigi 2023 videos
    I organize an event to watch RubyKaigi 2023 videos.
    ● In a zoom meeting.
    ● At 12:10-12:40 on every Thursday
    ● Accounced in #rubykaigi channel of ruby-jp slack.
    ● We'll watch "RubyGems on the watch / Maciej Mensfeld" as the next
    session in 2nd Nov.
    ● Feel free to join us!

    View Slide

  24. Back to the quiz…

    View Slide

  25. Q3. Querying exists? in loop

    View Slide

  26. Q3. Querying exists? in loop
    # A
    current_user = User.find(42)
    can_update = current_user
    .permissions
    .exists?(action: :update)
    response = Article.all.map do
    {
    title: _1.title,
    can_update:,
    }
    end
    # B
    current_user = User.find(42)
    response = Article.all.map do
    can_update = current_user
    .permissions
    .exists?(action: :update)
    {
    title: _1.title,
    can_update:,
    }
    end

    View Slide

  27. Answer: Querying exists? in loop
    # A
    current_user = User.find(42)
    can_update = current_user
    .permissions
    .exists?(action: :update)
    # Permission Exists? (0.8ms) SELECT 1 AS
    one FROM `permissions` WHERE
    `permissions`.`user_id` = 42 AND
    `permissions`.`action` = 'update'
    response = Article.all.map do
    {
    title: _1.title,
    can_update:,
    }
    end
    # B
    current_user = User.find(42)
    response = Article.all.map do
    can_update = current_user
    .permissions
    .exists?(action: :update)
    # Permission Exists? (0.8ms) SELECT 1
    AS one FROM `permissions` WHERE
    `permissions`.`user_id` = 42 AND
    `permissions`.`action` = 'update'
    {
    title: _1.title,
    can_update:,
    }
    end
    💯
    󰢄
    A is faster than B!

    View Slide

  28. Counterexample: Reduce the difference
    Active Record query cache can reduce
    the difference.
    We can confirm it with
    `ActiveRecord.cache do -- end`
    It is enabled by default in controller
    actions, but disabled in other places,
    such as job, rails console.

    View Slide

  29. Counterexample: Reduce the difference

    View Slide

  30. Counterexample: Reduce the difference
    Calculating -------------------------------------
    outside loop - no cache 192.676 (±15.1%) i/s - 4.712k in 25.056327s
    inside loop - no cache 4.416 (±22.6%) i/s - 109.000 in 25.288121s
    outside loop - cache 160.870 (±13.1%) i/s - 3.948k in 25.012053s
    inside loop - cache 39.832 (±10.0%) i/s - 984.000 in 25.079316s
    Comparison:
    outside loop - no cache: 192.7 i/s
    outside loop - cache: 160.9 i/s - same-ish: difference falls within error
    inside loop - cache: 39.8 i/s - 4.84x slower
    inside loop - no cache: 4.4 i/s - 43.63x slower

    View Slide

  31. Q4. Querying SELECT in loop

    View Slide

  32. Q4. Querying SELECT in loop
    # A
    categories = Category.all.to_a
    response = Article.all.map do
    {
    title: _1.title,
    categories:,
    }
    end
    # B
    response = Article.all.map do
    categories = Category.all.to_a
    {
    title: _1.title,
    categories:,
    }
    end

    View Slide

  33. Answer. Querying SELECT in loop
    # A
    categories = Category.all.to_a
    response = Article.all.map do
    {
    title: _1.title,
    categories:,
    }
    end
    # B
    response = Article.all.map do
    categories = Category.all.to_a
    {
    title: _1.title,
    categories:,
    }
    end
    💯
    󰢄
    A is faster than B!

    View Slide

  34. AR cache makes it faster…?
    Calculating -------------------------------------
    outside loop - no cache 206.570 (±18.4%) i/s - 1.007k in 5.050579s
    inside loop - no cache 2.511 (± 0.0%) i/s - 13.000 in 5.220488s
    outside loop - cache 208.327 (±17.8%) i/s - 1.020k in 5.065721s
    inside loop - cache 19.188 (±10.4%) i/s - 96.000 in 5.040611s
    Comparison:
    outside loop - cache: 208.3 i/s
    outside loop - no cache: 206.6 i/s - same-ish: difference falls within error
    inside loop - cache: 19.2 i/s - 10.86x slower
    inside loop - no cache: 2.5 i/s - 82.98x slower

    View Slide

  35. AR cache makes it faster, but not faster enough
    ● AR cache needs to cache boolean value for `exists?`
    ● But it needs to cache AR model objects for query fetching objects
    ○ Initialization AR model is heavy

    View Slide

  36. Conclusion

    View Slide

  37. Conclusion
    ● Imagine and confirm your code executes what SQL
    ○ Use debug log and `#to_sql`
    ● You can find the benchmark code from pocke/kaigi-on-rails-2023 GitHub
    repository.
    Thanks for your listening!

    View Slide