Code indexing: How language servers understand our code

Language servers are a way of providing IDE features to any editor. Specialized functionality for navigating and understanding our Ruby code can greatly improve the developer experience and is highly aligned with the goal of making the developer happy.

In the context of the Ruby LSP, let’s dive into how language servers can build up knowledge about codebases using indexing and how it is used to implement features such as go to definition.

Vinicius Stock

May 31, 2023

Transcript

  1. Vinicius Stock
    Code indexing
    How language servers understand our code

  2. Vinicius Stock
     Senior dev @ Ruby DX team
     Shopify
     Twitter: @vinistock
     GitHub: @vinistock
     https://vinistock.com

  3. What is a language server?

  4. Source: https://microsoft.github.io/language-server-protocol/specification

  5. User → Editor (client): Go to def
     Editor (client) → Language server, over STDIN: Definition
     Language server → Editor (client), over STDOUT: Location
     Editor (client) → User: Jumps to def
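
The exchange above travels as JSON-RPC messages over the server's STDIN and STDOUT. A minimal sketch of the framing, assuming the base protocol's `Content-Length` header; `frame` and `read_message` are hypothetical helper names, and a `StringIO` stands in for the real pipes:

```ruby
require "json"
require "stringio"

# Serialize a message the way the LSP base protocol frames it:
# a Content-Length header, a blank line, then the JSON body.
def frame(message)
  body = JSON.generate(message)
  "Content-Length: #{body.bytesize}\r\n\r\n#{body}"
end

# Read one framed message from any IO-like object (STDIN in a real server).
def read_message(io)
  header = io.gets("\r\n\r\n")
  return nil if header.nil?

  length = header[/Content-Length: (\d+)/i, 1].to_i
  JSON.parse(io.read(length), symbolize_names: true)
end

# A go to definition request, as the editor would send it.
request = {
  jsonrpc: "2.0",
  id: 1,
  method: "textDocument/definition",
  params: {
    textDocument: { uri: "file:///foo.rb" },
    position: { line: 2, character: 5 }
  }
}

# Round trip: frame the request, then read it back as the server would.
parsed = read_message(StringIO.new(frame(request)))
```

The server answers on STDOUT with a message framed the same way, which is why any editor that can spawn a process and speak this framing can act as a client.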

  6. Language servers can
    connect to any editor

  7. We can collaborate in
    improving the experience

  8. Improving the development
    experience with language
    servers


    RubyConf 2022

  9. Shopify/vscode-ruby-lsp


    Shopify/ruby-lsp
    Ruby LSP

  10. Semantic highlighting
    Document symbol
    Document link
    Hover
    Folding range
    Selection range
    Formatting
    On type formatting
    Diagnostic
    Code actions
    Document highlight
    Inlay hints
    Path completion
    Code Lens

  11. Document symbol

  12. Semantic highlighting
    Document symbol
    Document link
    Hover
    Folding range
    Selection range
    Formatting
    On type formatting
    Diagnostic
    Code actions
    Document highlight
    Inlay hints
    Path completion
    Code Lens

  13. Diagnostic Code actions

  14. Code actions

  15. Semantic highlighting
    Document symbol
    Document link
    Hover
    Folding range
    Selection range
    Formatting
    On type formatting
    Diagnostic
    Code actions
    Document highlight
    Inlay hints
    Path completion
    Code Lens

  16. Code Lens

  17. To explore code indexing,
    we’ll think about


    go to definition

  18. class Foo
        def process
          Bar.baz
        end
      end

  19. {
        "params": {
          "textDocument": {
            "uri": "file:///foo.rb"
          },
          "position": {
            "line": 2,
            "character": 5
          }
        }
      }

  20. {
        "uri": "file:///bar.rb",
        "range": {
          "start": { "line": 0, "character": 0 },
          "end": { "line": 5, "character": 2 }
        }
      }

  21. The request and
    response are just file
    locations

  22. What’s happening in
    between?

  23. class Foo
        def process
          Bar.baz
        end
      end

      How do we determine what’s under the cursor?
      How do we find the definition?

  24. Locating targets
    Part 1

  25. We want to search an
    AST to find the node at
    the requested position

  26. We must parse the file
    and then search the AST

  27. 0 class Foo
      1   def process
      2     Bar.baz
      3   end
      4 end

      CLASS
        CONST (Foo)
        BODY
          DEF
            IDENT (process)
            BODY
              CALL
                CONST (Bar)
                IDENT (baz)

  28. More than one node
    covers the requested
    position

  29. We’re looking for the
    most specific one

  30. Converting position into string
    index
    Step 1

  31. "position": { "line": 2,"character": 5 }


  32. "position": { "line": 2,"character": 5 }


    source = "class Foo\n def process\n Bar.baz…"


  33. "position": { "line": 2, "character": 5 }
      source = "class Foo\n def process\n Bar.baz…"
      line + character => string index

  34. "position": { "line": 2, "character": 5 }
      source = "class Foo\n def process\n Bar.baz…"
      line + character => string index
      { "line": 2, "character": 5 } => 28
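
One way to sketch the conversion from a line and character pair to a string index is to skip one newline per requested line and then add the character offset. `position_to_index` is a hypothetical helper; the sketch assumes ASCII-only source, `"\n"` line endings, and two-space indentation in the sample string (the LSP specification counts characters in UTF-16 code units by default, which this ignores):

```ruby
# Convert a zero-based LSP position into an index into the source string.
def position_to_index(source, position)
  line_start = 0

  # Advance past one newline for every line before the requested one.
  position[:line].times do
    newline = source.index("\n", line_start)
    raise ArgumentError, "line out of range" if newline.nil?

    line_start = newline + 1
  end

  line_start + position[:character]
end

# Assumed formatting of the example file (two-space indentation).
source = "class Foo\n  def process\n    Bar.baz\n  end\nend\n"

index = position_to_index(source, { line: 2, character: 5 })
under_cursor = source[index] # a character inside `Bar`
```

The resulting number depends on the exact whitespace of the source string, so the same position maps to a different index in a differently indented file.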

  35. Locating the AST node
    Step 2

  36. 1. Go through AST
      2. Use target and candidate pointers
      3. Compare proximity to requested index

  37. 0 class Foo
      1   def process
      2     Bar.baz
      3   end
      4 end

      CLASS
        CONST (Foo)
        BODY
          DEF
            IDENT (process)
            BODY
              CALL
                CONST (Bar)
                IDENT (baz)

  38. def locate(node, index)
      end

  39. def locate(node, index)
        queue = node.child_nodes
        target = node
      end

  40. def locate(node, index)
        queue = node.child_nodes
        target = node

        until queue.empty?
        end

        target
      end

  41. until queue.empty?
        candidate = queue.shift
      end

  42. until queue.empty?
        candidate = queue.shift
        queue.concat(candidate.child_nodes)
      end

  43. until queue.empty?
        candidate = queue.shift
        queue.concat(candidate.child_nodes)

        loc = candidate.location
        next unless loc.cover?(index)
      end

  44. until queue.empty?
        candidate = queue.shift
        queue.concat(candidate.child_nodes)

        loc = candidate.location
        next unless loc.cover?(index)
        break if index < loc.start_char
      end

  45. until queue.empty?
        # …
        break if index < loc.start_char

        tloc = target.location
        if loc.end_char - loc.start_char <=
            tloc.end_char - tloc.start_char
          target = candidate
        end
      end
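
Put together, the search can be sketched as a runnable whole. `Node` and `Location` below are hypothetical stand-ins for parser nodes, `cover?` is assumed to mean "start inclusive, end exclusive", and the character ranges assume the example file uses two-space indentation. The slide's early `break` is left out here because, under that assumption about `cover?`, the index can never be before a candidate that covers it:

```ruby
# Hypothetical stand-ins for parser nodes: each node knows its character
# range in the source and its children.
Location = Struct.new(:start_char, :end_char) do
  def cover?(index)
    index >= start_char && index < end_char
  end
end

Node = Struct.new(:name, :location, :child_nodes)

def node(name, start_char, end_char, children = [])
  Node.new(name, Location.new(start_char, end_char), children)
end

# Breadth-first search for the narrowest node covering the index.
def locate(root, index)
  queue = root.child_nodes.dup # dup so the tree itself is not mutated
  target = root

  until queue.empty?
    candidate = queue.shift
    queue.concat(candidate.child_nodes)

    loc = candidate.location
    next unless loc.cover?(index)

    # Prefer the candidate when it is at least as narrow as the target.
    tloc = target.location
    if loc.end_char - loc.start_char <= tloc.end_char - tloc.start_char
      target = candidate
    end
  end

  target
end

# Hand-built tree for the Foo example.
tree = node("CLASS", 0, 45, [
  node("CONST (Foo)", 6, 9),
  node("DEF", 12, 41, [
    node("IDENT (process)", 16, 23),
    node("CALL", 28, 35, [
      node("CONST (Bar)", 28, 31),
      node("IDENT (baz)", 32, 35)
    ])
  ])
])

found = locate(tree, 29)
```

With these ranges, `CLASS`, `DEF`, `CALL`, and `CONST (Bar)` all cover index 29, and the size comparison narrows the target one level at a time until only the constant remains.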

  46. We discovered the target 🎉

  47. But where’s the definition?

  48. Code indexing
    Part 2

  49. We have to know what’s
    available in the codebase

  50. Index
      • Classes
      • Modules
      • Methods
      • Constants
      Does this thing exist?
      Where is it?

  51. We can’t populate the index
    on every definition request

  52. We need to initialize the
    index on boot

  53. And we have to keep it
    synchronized if files are
    modified

  54. Let’s start with the initial
    indexing

  55. We’ll focus on implementing
    classes in 4 steps

  56. Ruby code → Parser → AST → Visitor → Classes → Index
      For all Ruby files

  57. Collecting class declarations
    in an AST
    Step 1

  58. class IndexVisitor < SyntaxTree::Visitor
      end

  59. class IndexVisitor < SyntaxTree::Visitor
        attr_reader :constants

        def initialize
          @constants = []
        end
      end

  60. class IndexVisitor < SyntaxTree::Visitor
        attr_reader :constants

        def initialize
          @constants = []
        end

        def visit_class(node)
          @constants << node
          super
        end
      end

  61. visitor = IndexVisitor.new
      visitor.visit(ast)
      visitor.constants
      # => [
      #   #<…>,
      #   #<…>
      # ]
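
The same collection step can be tried without any gems. The sketch below swaps in Ripper from the standard library in place of the SyntaxTree visitor and walks the nested arrays returned by `Ripper.sexp`; `collect_classes` is a hypothetical helper, and it only handles plain `class Name` declarations (not `class A::B`):

```ruby
require "ripper"

# Walk the nested arrays returned by Ripper.sexp and collect every
# `class Name` declaration together with its position.
def collect_classes(sexp, found = [])
  return found unless sexp.is_a?(Array)

  if sexp[0] == :class && sexp[1].is_a?(Array) && sexp[1][0] == :const_ref
    # sexp[1][1] looks like [:@const, "Foo", [line, column]]
    _, name, (line, column) = sexp[1][1]
    found << { name: name, line: line, column: column }
  end

  # Keep descending so nested classes are found too, mirroring the
  # `super` call in the visitor.
  sexp.each { |child| collect_classes(child, found) }
  found
end

source = <<~RUBY
  class Foo
    class Nested
    end
  end

  class Bar
  end
RUBY

classes = collect_classes(Ripper.sexp(source))
names = classes.map { |entry| entry[:name] }
```

Ripper reports one-based line numbers and zero-based columns, which matters later when positions are sent back to the editor.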


  62. Creating the Index class
    Step 2

  63. class Index
        include Singleton

        def initialize
          @knowledge = {}
        end

        def add(path, constants)
          constants.each do |const|
          end
        end
      end

  64. class Index
        include Singleton

        def initialize
          @knowledge = {}
        end

        def add(path, constants)
          constants.each do |const|
            @knowledge[const.name] = {
            }
          end
        end
      end

  65. class Index
        include Singleton

        def initialize
          @knowledge = {}
        end

        def add(path, constants)
          constants.each do |const|
            @knowledge[const.name] = {
              file: path, location: const.location
            }
          end
        end
      end

  66. {
        "Foo" => {
          file: "/foo.rb",
          location: #<… end_line=2, start_column=4, end_column=6>
        }
      }
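
A runnable sketch of the index at this stage, with a lookup method added so the two questions "does this thing exist?" and "where is it?" can be answered. `Constant` is a hypothetical stand-in for the nodes collected by the visitor, and `fetch` is an assumed shape for the lookup:

```ruby
require "singleton"

# Hypothetical stand-in for a collected declaration.
Constant = Struct.new(:name, :location)

class Index
  include Singleton

  def initialize
    @knowledge = {}
  end

  # Record where each constant found in the file at `path` is declared.
  def add(path, constants)
    constants.each do |const|
      @knowledge[const.name] = { file: path, location: const.location }
    end
  end

  # Assumed lookup: returns [file, location], or nil for unknown names.
  def fetch(name)
    entry = @knowledge[name]
    entry && [entry[:file], entry[:location]]
  end
end

Index.instance.add("/foo.rb", [Constant.new("Foo", :foo_location)])

file, location = Index.instance.fetch("Foo")
missing = Index.instance.fetch("DoesNotExist")
```

Keying the hash by name keeps lookups constant time, at the cost of keeping only the last declaration when a class is reopened in several files.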

  67. Populating the Index
    Step 3

  68. index = Index.instance
      all_files_in_load_path.each do |path|
        index.process("file://#{path}")
      end

  69. class Index
        def process(uri)
          path = URI(uri).path
        end
      end

  70. class Index
        def process(uri)
          path = URI(uri).path

          content = File.read(path)
        end
      end

  71. class Index
        def process(uri)
          path = URI(uri).path

          content = File.read(path)
          ast = SyntaxTree.parse(content)
        end
      end

  72. class Index
        def process(uri)
          path = URI(uri).path

          content = File.read(path)
          ast = SyntaxTree.parse(content)
          visitor = IndexVisitor.new
          visitor.visit(ast)
        end
      end

  73. class Index
        def process(uri)
          path = URI(uri).path

          content = File.read(path)
          ast = SyntaxTree.parse(content)
          visitor = IndexVisitor.new
          visitor.visit(ast)
          add(path, visitor.constants)
        end
      end
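
The read, parse, visit, add pipeline can be exercised end to end against a temporary directory. This sketch again uses Ripper from the standard library in place of SyntaxTree; `class_entries` and `build_index` are hypothetical names, and files that fail to parse are simply skipped, which is an assumption about how errors should be handled:

```ruby
require "ripper"
require "tmpdir"

# Collect [name, line] pairs for simple `class Name` declarations.
def class_entries(sexp, found = [])
  return found unless sexp.is_a?(Array)

  if sexp[0] == :class && sexp[1].is_a?(Array) && sexp[1][0] == :const_ref
    _, name, (line, _column) = sexp[1][1]
    found << [name, line]
  end

  sexp.each { |child| class_entries(child, found) }
  found
end

# Build a name => { file:, line: } table for every Ruby file under root.
def build_index(root)
  index = {}

  Dir.glob(File.join(root, "**", "*.rb")).sort.each do |path|
    sexp = Ripper.sexp(File.read(path))
    next if sexp.nil? # Ripper.sexp returns nil when the file does not parse

    class_entries(sexp).each do |name, line|
      index[name] = { file: path, line: line }
    end
  end

  index
end

index = Dir.mktmpdir do |root|
  File.write(File.join(root, "foo.rb"), "class Foo\nend\n")
  File.write(File.join(root, "broken.rb"), "class Oops\n") # missing `end`
  build_index(root)
end
```

Skipping unparseable files keeps one broken file from aborting the initial indexing; a real server would likely also want to report them.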


  74. The initial indexing is
    done

  75. Synchronizing file
    modifications
    Step 4

  76. If a file is modified, the
    existing classes might
    change

  77. LSPs can request file
    watching

  78. Ruby file modified
      Deleted → Remove related entries
      Added → Index the new file
      Changed → Remove and index

  79. def execute(req)
        case req[:method]
        when "workspace/didChangeWatchedFiles"
          changes = req.dig(:params, :changes)
          Index.instance.synchronize(changes)
        end
      end

  80. class Index
        def synchronize(changes)
          changes.each do |change|
            case change[:type]
            end
          end
        end
      end

  81. class Index
        def synchronize(changes)
          changes.each do |change|
            case change[:type]
            when "deleted"
              remove_entries_for(change[:uri])
            end
          end
        end
      end

  82. class Index
        def synchronize(changes)
          changes.each do |change|
            case change[:type]
            when "deleted"
              remove_entries_for(change[:uri])
            when "created"
              process(change[:uri])
            end
          end
        end
      end

  83. class Index
        def synchronize(changes)
          changes.each do |change|
            case change[:type]
            when "deleted"
              remove_entries_for(change[:uri])
            when "created"
              process(change[:uri])
            when "changed"
              remove_entries_for(change[:uri])
              process(change[:uri])
            end
          end
        end
      end

  84. class Index
        def remove_entries_for(uri)
          path = URI(uri).path

          @knowledge.each do |const, loc|
            if loc[:file] == path
              @knowledge.delete(const)
            end
          end
        end
      end
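
On the wire, the LSP specification encodes the change type of a `workspace/didChangeWatchedFiles` event as a number (1 for created, 2 for changed, 3 for deleted) rather than the strings used on the slides. A sketch of the synchronization with those values; `SyncIndex` and its injected `indexer` block are hypothetical, standing in for the real parse-and-visit step:

```ruby
require "uri"

# FileChangeType values from the LSP specification.
CREATED = 1
CHANGED = 2
DELETED = 3

class SyncIndex
  attr_reader :knowledge

  # `indexer` is a hypothetical callback: path -> { name => location }.
  def initialize(&indexer)
    @knowledge = {}
    @indexer = indexer
  end

  def synchronize(changes)
    changes.each do |change|
      path = URI(change[:uri]).path

      case change[:type]
      when DELETED
        remove_entries_for(path)
      when CREATED
        process(path)
      when CHANGED
        # Drop stale entries first so renamed classes do not linger.
        remove_entries_for(path)
        process(path)
      end
    end
  end

  private

  def remove_entries_for(path)
    @knowledge.delete_if { |_name, entry| entry[:file] == path }
  end

  def process(path)
    @indexer.call(path).each do |name, location|
      @knowledge[name] = { file: path, location: location }
    end
  end
end

# Fake "file system" so the sketch runs without touching disk.
fake_files = { "/foo.rb" => { "Foo" => 0 } }
index = SyncIndex.new { |path| fake_files.fetch(path, {}) }

index.synchronize([{ uri: "file:///foo.rb", type: CREATED }])
after_create = index.knowledge.keys

fake_files["/foo.rb"] = { "Renamed" => 0 }
index.synchronize([{ uri: "file:///foo.rb", type: CHANGED }])
after_change = index.knowledge.keys

index.synchronize([{ uri: "file:///foo.rb", type: DELETED }])
after_delete = index.knowledge.keys
```

Treating a change as "remove, then index again" is simple and correct, at the price of re-parsing the whole file for every edit that reaches disk.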

  85. We can now perform


    initial indexing and
    synchronization 🎉

  86. We’re ready to implement
    go to definition

  87. Implementing go to definition
    Part 3

  88. 1. Locate the target
      2. Find target in index
      3. Return declaration location

  89. class Definition < BaseRequest
        def initialize(document, position)
          @document = document
          @position = position
        end
      end

  90. class Definition < BaseRequest
        def run
          target = @document.locate_node(@position)
          file, location = case target
          end
          # …
        end
      end

  91. class Definition < BaseRequest
        def run
          target = @document.locate_node(@position)
          file, location = case target
          when SyntaxTree::Const
            Index.instance.fetch(target.value)
          end
          # …
        end
      end

  92. class Definition < BaseRequest
        def run
          # …
          { uri: "file://#{file}",
            range: {
              start: {
                line: location.start_line,
                character: location.start_column
              },
              end: {
                line: location.end_line,
                character: location.end_column
              }
            }
          }
        end
      end
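
One detail worth sketching when building the response: LSP positions are zero-based, while parsers such as Ripper report one-based line numbers. Assuming the stored location uses one-based lines and zero-based columns, the conversion could look like this; `ParserLocation` and `definition_response` are hypothetical names:

```ruby
# Hypothetical parser location: one-based lines, zero-based columns.
ParserLocation = Struct.new(:start_line, :start_column, :end_line, :end_column)

# Build the Location payload for a go to definition response.
def definition_response(file, location)
  {
    uri: "file://#{file}",
    range: {
      # Shift lines down by one to match the protocol's zero-based lines.
      start: { line: location.start_line - 1, character: location.start_column },
      end: { line: location.end_line - 1, character: location.end_column }
    }
  }
end

response = definition_response("/bar.rb", ParserLocation.new(1, 0, 6, 3))
```

If the stored locations were already zero-based, the subtraction would be dropped; the point is that the offset has to be handled in exactly one place.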

  93. Go to definition for
    classes is done 🎉

  94. This is not yet released
    in the Ruby LSP

  95. But I have a demo!

  97. Other features also
    depend on an index

  98. Signature help
    Workspace
    symbols
    Hover
    Autocomplete

  99. Signature help

  100. Signature help
    Workspace
    symbols
    Hover
    Autocomplete

  101. Workspace
    symbols

  102. Signature help
    Workspace
    symbols
    Hover
    Autocomplete

  103. Hover

  104. Signature help
    Workspace
    symbols
    Hover
    Autocomplete

  105. Autocomplete

  106. This approach is similar
    to what a typechecker does

  107. And there are still many
    improvements we can make

  108. • Use parallelism to build index
       • Cache the index
       • Include the missing parts (modules, constants, methods…)
       • Add the option to control what is indexed

  109. > lib
         > my_gem
           foo.rb
       > .ruby-lsp
         > lib
           > my_gem
             foo.indexed
       Last modified: 3 days ago
       Last modified: 1 minute ago
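
A cache like this needs a way to decide when an entry is out of date. One simple approach, sketched here as an assumption rather than a description of any released feature, is to compare modification times. The `.ruby-lsp` directory layout and `.indexed` extension follow the slide; `cache_path_for` and `stale?` are hypothetical helpers:

```ruby
require "fileutils"
require "tmpdir"

# Map lib/my_gem/foo.rb to .ruby-lsp/lib/my_gem/foo.indexed under root.
def cache_path_for(source_path, root)
  relative = source_path.delete_prefix("#{root}/")
  File.join(root, ".ruby-lsp", relative.sub(/\.rb\z/, ".indexed"))
end

# A cached entry is stale when it is missing or older than its source file.
def stale?(source_path, root)
  cached = cache_path_for(source_path, root)
  !File.exist?(cached) || File.mtime(cached) < File.mtime(source_path)
end

results = Dir.mktmpdir do |root|
  source = File.join(root, "lib", "my_gem", "foo.rb")
  FileUtils.mkdir_p(File.dirname(source))
  File.write(source, "class Foo; end\n")
  before_caching = stale?(source, root) # no cache file exists yet

  cached = cache_path_for(source, root)
  FileUtils.mkdir_p(File.dirname(cached))
  File.write(cached, "")
  File.utime(Time.now, Time.now + 60, cached) # pretend the cache is newer
  after_caching = stale?(source, root)

  File.utime(Time.now, Time.now + 120, source) # pretend the source was edited
  after_edit = stale?(source, root)

  [before_caching, after_caching, after_edit]
end
```

Modification times are cheap to check but can be fooled by tools that rewrite files without changing content, so a content hash is a common alternative.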

  110. # Gemfile
       gem "ruby_index"

  111. Multiple gems could reuse
    the index

  112. "ruby-lsp"
    "steep"
    "typeprof"
    "rdoc"
    "irb"

  113. Shopify/vscode-ruby-lsp


    Shopify/ruby-lsp
    Ruby LSP

  114. Developer experience is a
    part of making developers
    happy

  115. Let’s collaborate on
    improving it

  116. Thank you

  117. References
       • https://github.com/Shopify/vscode-ruby-lsp
       • https://github.com/Shopify/ruby-lsp
       • https://microsoft.github.io/language-server-protocol/specification
       • Code font: Cascadia Code
       • Screenshots made with VS Code
