Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Fix SQL N+1 queries with RuboCop #rubykaigi

Fix SQL N+1 queries with RuboCop #rubykaigi

sue445

May 12, 2023
Tweet

More Decks by sue445

Other Decks in Technology

Transcript

  1. Fix SQL N+1 queries with
    RuboCop
    RubyKaigi 2023 / #rubykaigi #rubykaigiC
    pixiv Inc.
    sue445
    2023.5.12

    View full-size slide

  2. 3
    My name is Go
    The Go gopher was designed by Renee French. (http://reneefrench.blogspot.com/) The design is licensed under the
    Creative Commons 4.0 Attributions license. Read this article for more details: https://blog.golang.org/gopher

    View full-size slide

  3. 4
    About sue445
    ● Go Sueyoshi (a.k.a. sue445, sue-san)
    ● https://twitter.com/sue445
    ● https://github.com/sue445
    sue445

    View full-size slide

  4. 5
    About sue445
    ● Ruby: 11 years
    ● Golang: 7 years
    ● Go: 41 years (age)
    sue445

    View full-size slide

  5. 6
    sue445 and RubyKaigi
    ● 2019: Fukuoka
    ● 2020: Matsumoto
    ○ My proposal has been accepted, but
    Kaigi was canceled…
    ● 2023: Matsumoto
    ○ again!
    sue445

    View full-size slide

  6. 7
    I’m happy!

    View full-size slide

  7. 8
    I’m happy! So, This is Happi(法被)!

    View full-size slide

  8. ● https://rubygems.org/profiles/sue445
    ○ e.g. rubicure (Ruby implementatation of Pretty Cure)
    ○ https://github.com/sue445/rubicure (since 2013, 10th Anniversary!)
    ● Maintainer of Itamae, rspec-parameterised and more more gems
    9
    Gems I Implemented

    View full-size slide

  9. 10
    About pixiv Inc.

    View full-size slide

  10. 11
    About pixiv Inc.
    https://www.pixiv.co.jp/service/

    View full-size slide

  11. ● We support followings!
    ● Ruby Association
    ○ https://www.ruby.or.jp/ja/sponsors/list/
    pixiv
    ● Rails Girls Japan
    ○ https://railsgirls.jp/sponsors/
    ● International Ruby Programming Contest
    ○ https://www.ruby-procon.net/corp-messa
    ge/
    12
    [*$pixiv].all? { |members| love? Ruby } #=> true

    View full-size slide

  12. https://conference.pixiv.co.jp/2023/rubymusicmixin
    13
    RubyMusicMixin 2023

    View full-size slide

  13. 14
    RubyMusicMixin 2023
    After Patry
    MusicPartyMixin 2023

    View full-size slide

  14. ● Infrastructure Section
    ● AWS and GCP
    ○ Organization Owner
    ○ Solution Architect
    ● Maintainer of In-house tools (GitLab, Sentry etc)
    ● Other many many chore works
    15
    Day job

    View full-size slide

  15. ● SQL N+1 query detection and auto-correction
    ● Sharing of findings from custom cop creation
    16
    Goal in this talk

    View full-size slide

  16. ● About ISUCON
    ● About rubocop-isucon
    ● The path to implementing Isucon/Mysql2/NPlusOneQuery
    ● Result
    ● Conclusion
    ● Appendix
    17
    Agenda

    View full-size slide

  17. ● By connecting to the database from cop, the possibilities for cop are
    limitless
    ● Detecting N+1 queries is easy
    ● Limited patterns, but auto-correct of N+1 queries is possible
    ● ISUCON is Mixed Martial Arts
    18
    Conclusion

    View full-size slide

  18. ● https://github.com/sue445/rubocop-isucon
    19
    What I will talk about in this talk

    View full-size slide

  19. ● Other gems I made for ISUCON
    ● Itamae recipe I made for ISUCON
    ● Monitor Sinatra app with Datadog
    ● Implement https://github.com/tkuchiki/alp with fluentd and Datadog
    dashboard
    ● Ruby 3.2-dev YJIT vs Ruby 3.1 YJIT
    See my articles (ja)
    ● https://sue445.hatenablog.com/entry/2022/07/04/084915
    ● https://sue445.hatenablog.com/entry/2022/07/24/190412
    20
    What I will NOT talk about in this talk

    View full-size slide

  20. 21
    About ISUCON

    View full-size slide

  21. ● ISUCON is the most famous performance tuning contest in Japan. (since
    2012)
    ○ https://isucon.net/
    ○ ISUCON is a trademark or registered trademark of LINE Corporation.
    ● Multiple reference implementations are provided for applications
    ○ e.g. Go, Node.js, Perl, PHP, Python, Ruby, Rust
    22
    About ISUCON

    View full-size slide

  22. ● Regulation
    ○ 1-3 people /team
    ○ 8 hours
    ○ Vale Tudo
    ■ Except for using a servers other than the prepared server
    23
    About ISUCON

    View full-size slide

  23. ● Ruby
    ○ Sinatra
    ○ mysql2 + mysql2-cs-bind
    ○ Puma
    ● nginx
    ● MySQL (or MariaDB)
    ● 3 VMs
    24
    Common initial configurations

    View full-size slide

  24. ● Improvement targets EVERYTHING in servers
    ● e.g.
    ○ Fix application
    ○ Add index to table
    ○ Add cache
    ○ Upgrade pre-installed middlewares, tuning configuration
    ○ Use ruby-head
    ■ ruby-head is the fastest Ruby in the world!
    25
    ISUCON is Mixed Martial Arts

    View full-size slide

  25. ● 2021/8: ISUCON 11 failing to qualify
    ○ 1-man team
    ● 1 person is not enough
    ● SQL N+1 queries occur frequently in ISUCON
    ○ -> Let's automate!
    ● Compete in my area of expertise
    26
    ISUCON and sue445

    View full-size slide

  26. 27
    About rubocop-isucon

    View full-size slide

  27. ● RuboCop plugin for ruby reference implementation of ISUCON
    ● Development period: 2021/8 (ISUCON 11) - 2022/7 (ISUCON 12)
    ● Features
    ○ Isucon/Sinatra department
    ○ Isucon/Mysql2 department
    ○ Isucon/Mysql2/NPlusOneQuery cop
    ■ Checks that there’s no N+1 query
    ■ Main topic of this talk
    28
    https://github.com/sue445/rubocop-isuco
    n

    View full-size slide

  28. ● SQL being called inside a loop
    ● Model Answers: e.g. Join tables
    29
    About SQL N+1 queries
    courses = db.xquery(
    "SELECT `courses`.*" \\
    " FROM `courses`" \\
    " JOIN `registrations` ON `courses`.`id` = `registrations`.`course_id`" \\
    " WHERE `courses`.`status` != ? AND `registrations`.`user_id` = ?",
    STATUS_CLOSED, user_id,
    )
    courses.map do |course|
    teacher = db.xquery('SELECT * FROM `users` WHERE `id` = ?', course[:teacher_id]).first
    # do something
    end

    View full-size slide

  29. ● RuboCop has a plug-in mechanism. (ordinary gem)
    ● I use RuboCop on a regular basis and am familiar with its use
    ● I have created a custom cop and tool for RuboCop in the past
    ○ https://github.com/sue445/rubocop-itamae
    ○ https://github.com/sue445/rubocop_auto_corrector
    30
    Q. Why RuboCop?

    View full-size slide

  30. ● https://github.com/rubocop/rubocop-performance
    ● This is also valid, but it is not enough
    ● In fact, only a few cases were detected when run against the last 10+ years
    of source code
    31
    vs rubocop-performance

    View full-size slide

  31. ● https://github.com/xinminlabs/synvert-ruby
    ● Synvert specializes in automatic code rewriting
    ● RuboCop can detect violations and automatically correct them if possible
    ○ In some cases, the machine may not be able to automatically correct the
    violation, so I only want to detect violations at that time
    ○ e.g. nginx config, database index
    32
    vs Synvert

    View full-size slide

  32. # bad
    class App < Sinatra::Base
    get '/' do
    content_type :html
    File.read(File.join(__dir__, '..', 'public', 'index.html'))
    end
    end
    33
    e.g. Isucon/Sinatra/ServeStaticFile
    https://sue445.github.io/rubocop-isucon/RuboCop/Cop/Isucon/Sinatra/ServeStaticFile.html

    View full-size slide

  33. # bad
    class App < Sinatra::Base
    get '/' do
    ^^^^^^^^^^ Isucon/Sinatra/ServeStaticFile: Serve static files on front server (e.g.
    nginx) instead of sinatra app
    content_type :html
    File.read(File.join(__dir__, '..', 'public', 'index.html'))
    end
    end
    34
    e.g. Isucon/Sinatra/ServeStaticFile
    https://sue445.github.io/rubocop-isucon/RuboCop/Cop/Isucon/Sinatra/ServeStaticFile.html

    View full-size slide

  34. # good (e.g. Serve on nginx)
    location / {
    try_files $uri $uri/ /index.html;
    }
    35
    e.g. Isucon/Sinatra/ServeStaticFile
    https://sue445.github.io/rubocop-isucon/RuboCop/Cop/Isucon/Sinatra/ServeStaticFile.html

    View full-size slide

  35. # bad
    db.xquery('SELECT id, title FROM articles WHERE user_id = ?', user_id)
    36
    e.g. Isucon/Mysql2/WhereWithoutIndex
    https://sue445.github.io/rubocop-isucon/RuboCop/Cop/Isucon/Mysql2/WhereWithoutIndex.html

    View full-size slide

  36. # bad (user_id is not indexed)
    db.xquery('SELECT id, title FROM articles WHERE user_id = ?', user_id)
    ^^^^^^^^^^^
    Isucon/Mysql2/WhereWithoutIndex: This where clause doesn't seem to have an index.
    (e.g. `ALTER TABLE articles ADD INDEX index_user_id (user_id)`)
    37
    e.g. Isucon/Mysql2/WhereWithoutIndex
    https://sue445.github.io/rubocop-isucon/RuboCop/Cop/Isucon/Mysql2/WhereWithoutIndex.html

    View full-size slide

  37. ALTER TABLE articles ADD INDEX index_user_id (user_id)
    38
    e.g. Isucon/Mysql2/WhereWithoutIndex
    https://sue445.github.io/rubocop-isucon/RuboCop/Cop/Isucon/Mysql2/WhereWithoutIndex.html
    # good (user_id is indexed)
    db.xquery('SELECT id, title FROM articles WHERE used_id = ?', user_id)

    View full-size slide

  38. # bad (user_id is not indexed)
    db.xquery('SELECT id, title FROM articles JOIN users ON users.id = articles.user_id')
    ^^^^^^^^^^^^^^^^
    Isucon/Mysql2/JoinWithoutIndex: This join clause doesn't seem to have an index. (e.g.
    `ALTER TABLE articles ADD INDEX index_user_id (user_id)`)
    # good (user_id is indexed)
    db.xquery('SELECT id, title FROM articles JOIN users ON users.id = articles.user_id')
    39
    e.g. Isucon/Mysql2/JoinWithoutIndex
    https://sue445.github.io/rubocop-isucon/RuboCop/Cop/Isucon/Mysql2/JoinWithoutIndex.html

    View full-size slide

  39. ● Get information in table using ActiveRecord
    ○ e.g. columns, primary key, indexes
    40
    Get database schema

    View full-size slide

  40. ● Using SELECT * is slow because it retrieves all columns and assigns them to
    Ruby objects
    ● Only necessary columns should be passed to the SELECT clause
    ● All columns are output to the SELECT clause to make it easier to remove
    unnecessary columns


    41
    e.g. Isucon/Mysql2/SelectAsterisk
    https://sue445.github.io/rubocop-isucon/RuboCop/Cop/Isucon/Mysql2/SelectAsterisk.html
    # bad
    db.xquery('SELECT * FROM users')
    # good (Auto-corrected!)
    db.xquery('SELECT id, name FROM users')

    View full-size slide

  41. 42
    Tips: Share settings with cops in the same department
    # .rubocop.yml
    Isucon/Mysql2:
    Database:
    adapter: mysql2
    host: localhost
    database: isucon_db
    username: isucon
    password: isucon
    encoding: utf8
    port: 3306

    View full-size slide

  42. ● Detect SQL N+1 queries
    ● Auto-correct SQL N+1 queries
    43
    The path to implementing Isucon/Mysql2/NPlusOneQuery

    View full-size slide

  43. ● Performance/CollectionLiteralInLoop code in rubocop-performance is
    helpful
    ○ https://docs.rubocop.org/rubocop-performance/cops_performance.htm
    l#performancecollectionliteralinloop
    ○ https://github.com/rubocop/rubocop-performance/blob/v1.11.5/lib/ru
    bocop/cop/performance/collection_literal_in_loop.rb
    ● Can be detected by AST if in the same method, but not if separate methods
    are used…
    44
    Detect SQL N+1 queries

    View full-size slide

  44. ● Parse SQL string and get as SQL AST
    ● Make SQL AST available to RuboCop
    ● Find N+1 query patterns that can be mechanically modified
    ● Avoid false positives
    45
    Auto-correct SQL N+1 queries

    View full-size slide

  45. ● https://github.com/tenderlove/gda
    ○ Wrapper gem for https://gitlab.gnome.org/GNOME/libgda
    ● I tried a few others but could not get the expected results
    ○ In particular, these gems could not parse isucon10-final's complex SQL
    ○ https://github.com/isucon/isucon10-final/blob/e858b25/webapp/ruby
    /app.rb#L250-L318
    ■ a.k.a. Tourist Attractions
    46
    Parse SQL string and get as SQL AST

    View full-size slide

  46. https://github.com/isucon/isucon10-final/blob/e858b25/webapp/ruby/app.rb#L250-L318
    47
    Tourist Attractions

    View full-size slide

  47. ● libgda does not support placeholder(?) and backquotes(`), so it is removed in
    rubocop-isucon
    ● Deleting would change the location information, so care is taken to ensure
    that the number of characters does not change
    ○ https://github.com/sue445/rubocop-isucon/blob/v0.2.0/lib/rubocop/is
    ucon/gda.rb#L19-L25
    48
    Tips
    # Before
    db.xquery('SELECT id, title FROM `articles` WHERE user_id = ?', user_id)
    # After
    db.xquery('SELECT id, title FROM articles WHERE user_id = 0', user_id)

    View full-size slide

  48. ● o
    ● aaaa
    49
    GDA gem example
    require "gda"
    sql = "SELECT id, name FROM users"
    statement = GDA::SQL::Parser.new.parse(sql)
    ast = statement.ast
    ast.from.targets.map(&:table_name)
    # => ["users"]
    ast.expr_list.map { |select_field| select_field.expr.value }
    # => ["id", "name"]

    View full-size slide

  49. ● o
    ● aaaa
    50
    Get SQL AST in Hash format
    pp JSON.parse(statement.serialize)
    {"statement"=>
    {"sql"=>"SELECT id, name FROM users",
    "stmt_type"=>"SELECT",
    "contents"=>
    {"distinct"=>"false",
    "fields"=>
    [{"expr"=>{"value"=>"id"}, "field_name"=>"id"},
    {"expr"=>{"value"=>"name"}, "field_name"=>"name"}],
    "from"=>
    {"targets"=>[{"expr"=>{"value"=>"users"}, "table_name"=>"users"}]}}}}

    View full-size slide

  50. ● AST retrieved by libgda does not contain SQL location information
    ● Cannot display offences in RuboCop or rewrite any node of SQL
    51
    Drawbacks of GDA gem (libgda)
    db.xquery(<<~SQL, jia_user_id)
    SELECT * FROM `isu`
    WHERE `jia_user_id` = ? ORDER BY `id` DESC
    ^^^^^^^^^^^^^^^^^ Isucon/Mysql2/WhereWithoutIndex: This where clause
    doesn't seem to have an index. (e.g. 'ALTER TABLE `isu` ADD INDEX
    `index_jia_user_id` (jia_user_id)')
    SQL

    View full-size slide

  51. ● GDA gem provides visitor pattern for AST exploration like RuboCop
    ● I could calculate the node's own location during the AST search
    52
    Solution: Implemented myself

    View full-size slide

  52. 53
    RuboCop::Isucon::GDA::NodePatcher
    class NodePatcher < ::GDA::Visitors::Visitor
    # @param node [GDA::Nodes::Operation]
    def visit_GDA_Nodes_Operation(node)
    return super unless node.operator
    pattern = operand_pattern(node)
    return super unless pattern
    node.location = search_operation_location(pattern)
    super
    end
    https://github.com/sue445/rubocop-isucon/blob/v0.2.0/lib/rubocop/isucon/gda/node_patcher.rb#L19-L29

    View full-size slide

  53. ● This is inspired by paser gem’s Parser::Source::Range
    54
    class NodeLocation
    # @return [Integer]
    attr_reader :begin_pos
    # @return [Integer]
    attr_reader :end_pos
    # @return [String]
    attr_reader :body
    RuboCop::Isucon::GDA::NodeLocation
    https://github.com/sue445/rubocop-isucon/blob/v0.2.0/lib/rubocop/isucon/gda/node_location.rb#L7-L15

    View full-size slide

  54. 55
    spec
    it "location is appended" do
    subject
    where_clause =
    node.where_cond.to_a.
    select { |node| node.instance_of?(GDA::Nodes::Operation) && node.operator }
    expect(where_clause.count).to eq 3
    expect(where_clause[0].location).to eq location(begin_pos: 26, end_pos: 32, body: "id = 1")
    expect(where_clause[1].location).to eq location(begin_pos: 37, end_pos: 46, body: "stock > 0")
    expect(where_clause[2].location).to eq location(begin_pos: 51, end_pos: 67, body: "name IS NOT NULL")
    end
    https://github.com/sue445/rubocop-isucon/blob/v0.2.0/spec/rubocop/isucon/gda/node_patcher_spec.rb#L10-L35

    View full-size slide

  55. 56
    However, spec is failed!!!!
    1.1) Failure/Error: expect(where_clause[0].location).to eq location(begin_pos: 26, end_pos:
    32, body: "id = 1")
    expected: #@end_pos=32, @body="id = 1">
    got: nil
    (compared using ==)
    # ./spec/rubocop/isucon/gda/node_patcher_spec.rb:31:in `block (5 levels) in (required)>'
    https://github.com/sue445/rubocop-isucon/blob/v0.2.0/spec/rubocop/isucon/gda/node_patcher_spec.rb#L10-L35

    View full-size slide

  56. ● Because each call to an AST method like where_cond creates a new instance
    ● Even if I embed my own instance variable(location) in the Node class in the
    Visitor, I will not be able to retrieve it the next time you call the same
    method
    57
    Why????

    View full-size slide

  57. ● Memoizationneeds to be planted for all AST methods
    ● But there are 30+ such AST methods
    ● AST methods are methods in the C implementation, so prepend could not be
    used
    58
    Solution: memoization, but…
    def where_cond
    @where_cond ||= original_where_cond
    end

    View full-size slide

  58. 59
    Solution: alias_method_chain like patch
    def where_cond_with_cache
    @where_cond_with_cache ||= where_cond_without_cache
    end
    alias_method :where_cond_without_cache, :where_cond
    alias_method :where_cond, :where_cond_with_cache

    View full-size slide

  59. https://github.com/sue445/rubocop-isucon/blob/v0.2.0/lib/rubocop/isucon/gda/gda_ext.rb
    60
    I created MemorizeMethods
    GDA::Nodes::Select.class_eval do
    extend RuboCop::Isucon::MemorizeMethods
    memorize :distinct_expr
    memorize :expr_list
    memorize :from
    memorize :where_cond
    memorize :group_by
    memorize :having_cond
    memorize :order_by
    memorize :limit_count
    memorize :limit_offset
    end

    View full-size slide

  60. Which of the following is the fastest memorize?
    61
    Ruby Quiz

    View full-size slide

  61. 62
    1. define_method + instance_variable_xxx + send
    def self.memorize(method_name)
    define_method "#{method_name}_with_cache" do
    if (ret = instance_variable_get("@#{method_name}_with_cache"))
    ret
    else
    ret = send("#{method_name}_without_cache")
    instance_variable_set("@#{method_name}_with_cache", ret)
    ret
    end
    end
    alias_method "#{method_name}_without_cache", method_name
    alias_method method_name, "#{method_name}_with_cache"
    end

    View full-size slide

  62. 63
    2. define_method + Hash + send
    def self.memorize(method_name)
    define_method "#{method_name}_with_cache" do
    @__cache ||= {}
    @__cache[method_name] ||= send("#{method_name}_without_cache")
    end
    alias_method "#{method_name}_without_cache", method_name
    alias_method method_name, "#{method_name}_with_cache"
    end

    View full-size slide

  63. 64
    3. class_eval
    def self.memorize(method_name)
    class_eval <<~RUBY, __FILE__, __LINE__ + 1
    # def foo_with_cache
    # @foo_with_cache ||= foo_without_cache
    # end
    def #{method_name}_with_cache
    @#{method_name}_with_cache ||= #{method_name}_without_cache
    end
    RUBY
    alias_method "#{method_name}_without_cache", method_name
    alias_method method_name, "#{method_name}_with_cache"
    end

    View full-size slide

  64. Which of the following is the fastest memorize?
    1. define_method + instance_variable_xxx + send
    2. define_method + Hash + send
    3. class_eval
    65
    Ruby Quiz

    View full-size slide

  65. 3. class_eval
    67
    Answer

    View full-size slide

  66. 1. define_method + instance_variable_xxx + send
    ● 701.502k (±13.8%) i/s
    2. define_method + Hash + send
    ● 1.549M (± 2.3%) i/s
    3. class_eval
    ● 4.866M (± 8.1%) i/s
    68
    Benchmark
    ● https://github.com/sue445/rubocop-isucon/tree/v0.2.0/benchmark#memorizerb
    ● https://github.com/sue445/rubocop-isucon/blob/v0.2.0/benchmark/memorize.rb

    View full-size slide

  67. eval is evil but faster  
    evalは悪(evil)だが速くなる
    ※define_method is slower than def
    69
    tl;dr;

    View full-size slide

  68. ● This is AST’s Reincarnation into another world (異世界転生)
    ● Basically, just add Parser::Source::Range (Ruby AST location information)
    and RuboCop::Isucon::GDA::NodeLocation (SQL AST location information)
    ● However, Ruby is very expressive even when writing a single string, so it
    was necessary to consider many different cases.
    70
    Bringing SQL AST into the Ruby World

    View full-size slide

  69. 71
    Case 1: single line
    db.xquery('SELECT * FROM `isu` WHERE `jia_user_id` = ? ORDER BY `id`
    DESC', jia_user_id)

    View full-size slide

  70. 72
    Case 1: single line
    'SELECT * FROM `isu` WHERE `jia_user_id` = ? ORDER BY `id` DESC'
    #=> "SELECT * FROM `isu` WHERE `jia_user_id` = ? ORDER BY `id` DESC"

    View full-size slide

  71. 73
    Case 2: Heredoc (<<, <<-)
    db.xquery(<<-SQL, jia_user_id)
    SELECT * FROM `isu`
    WHERE `jia_user_id` = ?
    ORDER BY `id` DESC
    SQL

    View full-size slide

  72. 74
    Case 2: Heredoc (<<, <<-)
    <<-SQL
    SELECT * FROM `isu`
    WHERE `jia_user_id` = ?
    ORDER BY `id` DESC
    SQL
    #=> " SELECT * FROM `isu`\n WHERE `jia_user_id` = ?\n ORDER BY `id`
    DESC\n"

    View full-size slide

  73. 75
    Case 3: Heredoc (<<~) since Ruby 2.3
    db.xquery(<<~SQL, jia_user_id)
    SELECT * FROM `isu`
    WHERE `jia_user_id` = ?
    ORDER BY `id` DESC
    SQL

    View full-size slide

  74. 76
    Case 3: Heredoc (<<~) since Ruby 2.3
    <<~SQL
    SELECT * FROM `isu`
    WHERE `jia_user_id` = ?
    ORDER BY `id` DESC
    SQL
    #=> "SELECT * FROM `isu`\nWHERE `jia_user_id` = ?\nORDER BY `id` DESC\n"

    View full-size slide

  75. 77
    Case 4: Escape concatenation at end of
    line
    classes = db.xquery(
    "SELECT *" \
    " FROM `classes`" \
    " WHERE `course_id` = ?" \
    " ORDER BY `part` DESC",
    course[:id]
    )

    View full-size slide

  76. 78
    Case 4: Escape concatenation at end of
    line
    "SELECT *" \
    " FROM `classes`" \
    " WHERE `course_id` = ?" \
    " ORDER BY `part` DESC"
    #=> "SELECT * FROM `classes` WHERE `course_id` = ? ORDER BY `part` DESC"

    View full-size slide

  77. ● As is the case with custom Cop in general, it is easy to just detect offenses
    ● However, it is very difficult to deal with auto correct of offenses
    ● It took me about 2 months to figure out what kind of SQL N+1 query could
    be automatically modified by RuboCop
    ● It took me 1 month to come up with the idea and actually implement it
    79
    Find N+1 query patterns that can be mechanically modified

    View full-size slide

  78. ● I found a hint in ActiveRecord
    ● ActiveRecord has a function called includes to avoid N+1 and issue a query
    when an association between models is defined by has_many or belongs_to
    ● ISUCON's reference implementation doesn’t use ActiveRecord, but plain SQL.
    ● However, I looked at the plain SQL and thought that an association-like SQL
    and table structure could improve the N+1 query by automatically modifying
    the SQL in the same way
    80
    How did I come up with the idea?

    View full-size slide

  79. 81 https://guides.rubyonrails.org/active_record_querying.html#includes
    books = Book.limit(10)
    books.each do |book|
    puts book.author.last_name
    end
    Excerpt from Ruby on Rails Guides
    SELECT books.* FROM books LIMIT 10
    SELECT authors.* FROM authors WHERE authors.book_id = 1
    SELECT authors.* FROM authors WHERE authors.book_id = 2
      :
      :
    11 queries

    View full-size slide

  80. 82 https://guides.rubyonrails.org/active_record_querying.html#includes
    books = Book.includes(:author).limit(10)
    books.each do |book|
    puts book.author.last_name
    end
    Excerpt from Ruby on Rails Guides
    SELECT books.* FROM books LIMIT 10
    SELECT authors.* FROM authors
    WHERE authors.book_id IN (1,2,3,4,5,6,7,8,9,10)
    2 queries

    View full-size slide

  81. ● To prevent false positives, I examined patterns that could be safely and
    automatically corrected
    83
    Avoid false positives

    View full-size slide

  82. ● SELECT statement
    ● Only one table exists in SQL. (Does not include UNION, JOIN, or subqueries)
    ● GROUP BY clause doesn’t exist
    ● Aggregate functions (e.g. COUNT, MAX, MIN, SUM, AVG) doesn’t exist in the
    SELECT clause
    ● Only one condition in the WHERE clause and the column in which it appears is
    either a Primary Key or a single column Unique Key
    84
    Conditions for safe automatic correction

    View full-size slide

  83. 85
    class Mysql2NPlusOneQueryCorrector
    def correct
    replace if correctable?
    end
    end
    RuboCop::Cop::Isucon::Correctors::Mysql2NPlusOneQueryCorrector
    https://github.com/sue445/rubocop-isucon/blob/v0.2.0/lib/rubocop/cop/isucon/correctors/n_plus_one_query_corrector.rb
    #L69-L71

    View full-size slide

  84. 86
    module CorrectableMethods
    def correctable?
    correctable_gda? && correctable_xquery_arg? &&
    correctable_parent_receiver? && current_node.child_nodes.count == 3 &&
    xquery_lvar.lvasgn_type? && %i[first last].include?(xquery_chained_method)
    end
    end
    RuboCop::Cop::Isucon::Correctors::Mysql2NPlusOneQueryCorrector::CorrectableMethods
    https://github.com/sue445/rubocop-isucon/blob/v0.2.0/lib/rubocop/cop/isucon/correctors/n_plus_one_query_corrector/co
    rrectable_methods.rb#L11-L15

    View full-size slide

  85. 87
    class Mysql2NPlusOneQueryCorrector
    module ReplaceMethods
    def replace
    replace_where_condition_in_sql
    replace_xquery_2nd_arg
    replace_chained_method_to_each_with_object
    replace_to_2_lines
    end
    end
    end
    RuboCop::Cop::Isucon::Correctors::Mysql2NPlusOneQueryCorrector::ReplaceMethods
    https://github.com/sue445/rubocop-isucon/blob/v0.2.0/lib/rubocop/cop/isucon/correctors/n_plus_one_query_corrector/re
    place_methods.rb#L10-L15

    View full-size slide

  86. 88
    Before auto-correct
    courses.map do |course|
    teacher = db.xquery('SELECT * FROM `users` WHERE `id` = ?', course[:teacher_id]).first
    end

    View full-size slide

  87. 89
    Plot for auto-correct
    user_ids = courses.map { |course| course[:teacher_id] }
    users = db.xquery('SELECT * FROM `users` WHERE `id` IN (?)', user_ids)
    users_by_id = users.each_with_object({}) do |user, hash|
    hash[user[:id]] = user
    end
    courses.map do |course|
    teacher = users_by_id[course[:teacher_id]]
    end

    View full-size slide

  88. 90
    Plot for auto-correct
    user_ids = courses.map { |course| course[:teacher_id] }
    users = db.xquery('SELECT * FROM `users` WHERE `id` IN (?)', user_ids)
    users_by_id = users.each_with_object({}) do |user, hash|
    hash[user[:id]] = user
    end
    courses.map do |course|
    teacher = users_by_id[course[:teacher_id]]
    end
    Convert WHERE `id` = ? inside the loop to WHERE `id` IN
    (?) outside the loop to avoid N+1 queries

    View full-size slide

  89. 91
    Plot for auto-correct
    user_ids = courses.map { |course| course[:teacher_id] }
    users = db.xquery('SELECT * FROM `users` WHERE `id` IN (?)', user_ids)
    users_by_id = users.each_with_object({}) do |user, hash|
    hash[user[:id]] = user
    end
    courses.map do |course|
    teacher = users_by_id[course[:teacher_id]]
    end
    Create a Hash with user_id as key and
    user as value

    View full-size slide

  90. 92
    After auto-correct
    courses.map do |course|
    @users_by_id ||= db.xquery('SELECT * FROM `users` WHERE `id` IN (?)', courses.map {
    |course| course[:teacher_id] }).each_with_object({}) { |v, hash| hash[v[:id]] = v }
    teacher = @users_by_id[course[:teacher_id]]
    end

    View full-size slide

  91. ● It is difficult to split one line of processing into multiple lines in RuboCop
    ○ Because it is necessary to consider indentation
    ● I'm trying to reduce the number of lines as much as possible, so I'm using
    one-liners to push method chains in and out to get more lines
    ● However, it was difficult to understand this case if it was completely on
    one line, so I made it on two lines as a bitter effort
    93
    Tips

    View full-size slide

  92. 94
    1 line ver vs 2 lines ver
    # 1 line
    @users_by_id ||= db.xquery('SELECT * FROM `users` WHERE `id` IN (?)', courses.map { |course|
    course[:teacher_id] }).each_with_object({}) { |v, hash| hash[v[:id]] = v }; teacher =
    @users_by_id[course[:teacher_id]]
    # 2 lines
    @users_by_id ||= db.xquery('SELECT * FROM `users` WHERE `id` IN (?)', courses.map { |course|
    course[:teacher_id] }).each_with_object({}) { |v, hash| hash[v[:id]] = v }
    teacher = @users_by_id[course[:teacher_id]]

    View full-size slide

  93. 95
    Write test code for auto corrected code
    it "auto-corrected code is valid" do
    corrected_code = <<~RUBY
    courses = [{"id" => 1, "teacher_id" => 1}]
    courses.map do |course|
    @users_by_id ||= db.xquery('SELECT * FROM `users` WHERE `id` IN (?)', courses.map { |course| course["teacher_id"]
    }).each_with_object({}) { |v, hash| hash[v["id"]] = v }
    teacher = @users_by_id[course["teacher_id"]]
    teacher["name"]
    end
    RUBY
    expect_correction(corrected_code)
    teacher_names = eval(corrected_code) # rubocop:disable Security/Eval
    expect(teacher_names).to eq(["user_1"])
    end
    https://github.com/sue445/rubocop-isucon/blob/v0.2.0/spec/rubocop/cop/isucon/mysql2/n_plus_one_query_spec.rb#L769-L793

    View full-size slide

  94. ● Less than 10% of all N+1 queries could be corrected automatically with this
    method
    ● isucon12 qualify’s theme was Multi-tenant SaaS app
    ○ https://isucon.net/archives/56850281.html (ja)
    ○ https://github.com/isucon/isucon12-qualify
    ○ MySQL + SQLite!!!
    ● ISUCON is a mixed martial art, so just being able to do static analysis and
    automatic modification of N+1 queries is not enough to win
    96
    Result

    View full-size slide

  95. ● By connecting to the database from cop, the possibilities for cop are
    limitless
    ● Detecting N+1 queries is easy
    ● Limited patterns, but auto-correct of N+1 queries is possible
    ● ISUCON is Mixed Martial Arts
    97
    Conclusion

    View full-size slide

  96. ● Introduce cops included in rubocop-isucon
    98
    Appendix

    View full-size slide

  97. ● o
    ● aaaa
    99
    Isucon/Mysql2/ManyJoinTable
    # bad
    totals = db.xquery(
    ^^^^^^^^^^ Isucon/Mysql2/ManyJoinTable: Avoid SQL with lots of JOINs
    "SELECT IFNULL(SUM(`submissions`.`score`), 0) AS `total_score`" \\
    " FROM `users`" \\
    " JOIN `registrations` ON `users`.`id` = `registrations`.`user_id`" \\
    " JOIN `courses` ON `registrations`.`course_id` = `courses`.`id`" \\
    " LEFT JOIN `classes` ON `courses`.`id` = `classes`.`course_id`" \\
    " LEFT JOIN `submissions` ON `users`.`id` = `submissions`.`user_id` AND
    `submissions`.`class_id` = `classes`.`id`" \\
    " WHERE `courses`.`id` = ?" \\
    " GROUP BY `users`.`id`",
    course[:id]
    ).map { |_| _[:total_score] }
    https://sue445.github.io/rubocop-isucon/RuboCop/Cop/Isucon/Mysql2/ManyJoinTable.html

    View full-size slide

  98. ● o
    ● aaaa
    100
    Isucon/Mysql2/PrepareExecute
    # bad (auto-correct is possible)
    db.prepare('SELECT * FROM `users` WHERE `id` = ?').execute(
    session[:user][:id]
    ).first
    # good
    require 'mysql2-cs-bind'
    db.xquery('SELECT * FROM `users` WHERE `id` = ?',
    session[:user][:id]
    ).first
    https://sue445.github.io/rubocop-isucon/RuboCop/Cop/Isucon/Mysql2/PrepareExecute.html

    View full-size slide

  99. ● o
    ● aaaa
    101
    Isucon/Sinatra/DisableLogging
    # bad
    class App < Sinatra::Base
    enable :logging
    end
    # bad
    class App < Sinatra::Base
    end
    # good
    class App < Sinatra::Base
    disable :logging
    end
    https://sue445.github.io/rubocop-isucon/RuboCop/Cop/Isucon/Sinatra/DisableLogging.html

    View full-size slide

  100. ● o
    ● aaaa
    102
    Isucon/Sinatra/Logger
    # bad
    logger.error "Search condition not found"
    # good
    # no logging
    https://sue445.github.io/rubocop-isucon/RuboCop/Cop/Isucon/Sinatra/Logger.html

    View full-size slide

  101. ● o
    ● aaaa
    103
    Isucon/Sinatra/RackLogger
    # bad
    request.env['rack.logger'].warn 'drop post isu condition request'
    # good
    # no logging
    https://sue445.github.io/rubocop-isucon/RuboCop/Cop/Isucon/Sinatra/RackLogger.html

    View full-size slide

  102. ● o
    ● aaaa
    104
    Isucon/Shell/Backtick
    # bad
    def digest(src)
    `printf "%s" #{Shellwords.shellescape(src)} | openssl dgst -sha512 |
    sed 's/^.*= //'`.strip
    end
    # good
    def digest(src)
    OpenSSL::Digest::SHA512.hexdigest(src)
    end
    https://sue445.github.io/rubocop-isucon/RuboCop/Cop/Isucon/Shell/Backtick.html

    View full-size slide

  103. ● o
    ● aaaa
    105
    Isucon/Shell/System
    # bad
    system("sleep 1")
    # good
    sleep 1
    https://sue445.github.io/rubocop-isucon/RuboCop/Cop/Isucon/Shell/System.html

    View full-size slide

  104. ● I proposed to @koic to introduce these cops to rubocop-performance, but
    rejected
    106
    BTW

    View full-size slide

  105. ● These are the same features as Isucon/Mysql2
    ○ Isucon/Sqlite3/JoinWithoutIndex
    ○ Isucon/Sqlite3/ManyJoinTable
    ○ Isucon/Sqlite3/NPlusOneQuery
    ○ Isucon/Sqlite3/SelectAsterisk
    ○ Isucon/Sqlite3/WhereWithoutIndex
    107
    Isucon/Sqlite3 department

    View full-size slide