Adventures in the land of Language Servers

Have you ever wondered how your editors and IDEs are able to support so many programming languages? Perhaps you've been thinking about designing your own language and wanted to know how you can give it editor support?

This talk is for you - I've spent over a year building a small language and integrating it with code editors, and I'd like to share some of the challenges I've faced, as well as lessons I've learned in that time.

I'll also show how easy it is to build a new Language Server project in Scala 3 thanks to the Langoustine library.

Jakub Kozłowski

June 05, 2023

Transcript

  1. Jakub Kozłowski | Lambda Days | June 5, 2023
    Adventures in the land of
    Language Servers

  2. Who's this talk for?
    If you...
    • Want to build tooling for your language

    • Want to understand more about your language tooling, or

    • Just want to learn



    This talk is for you

  3. Storytime - meet Jane.


    👩‍💻

  4. Jane is tasked with building a language...
    • A small DSL meant to be used in an existing
    application

    • The application is JVM-based (Scala)

    • the compiler also needs to work on the JVM

    • 2-3 weeks later, the compiler is done and
    embedded in the application

    • Jane's life is good!

    • until...

  5. 😳 Jane starts getting users


    And they want editor support

  6. Editor support
    Users want a smooth editing experience
    • Analysis (error highlighting, go-to-definition, ...)

    • Refactoring (rename, remove unused, ...)

    • Completions

  8. Editor support
    Options?
    • Writing a special editor ❌

    • Users would have to learn a new editor

    • A lot of frontend work for a small language

    • Integrating with an existing editor ✅

    • Which one(s)?

  9. Integrating with existing editors
    Users are asking for IntelliJ
    • IntelliJ is Java-based ✅

    • Can easily be integrated with Jane's compiler

    • Still requires learning IntelliJ's extension API...

    • Some time later, the extension is up and running

    • Jane's life is good
    • until...

  10. 😭 More users


    And this time they want VS Code

  11. VS Code support
    • VS Code is Node.js-based ❌

    • interop with JVM-based tooling will be difficult

    • Another extension API to learn 😩

  12. Node.js <-> JVM interop
    Options?
    • Rewrite the compiler

    • Transpile the compiler

    • Shell out ✅

  13. Jane decides to shell out 🐚
    Client-server architecture
    • The editor extension (language frontend) will start a server in a separate
    process

    • A compilation server (language backend) will run on the JVM ☕

    • The editor extension will send RPCs to the server ☎

    • When the editor is closed, the extension will kill the server process 💀

  14. How it'll all work
    Client responsibilities
    • The editor extension (language frontend) will:

    • Listen to user events 👂 (hover, clicks, keyboard shortcuts...)

    • Ask the server for information 🙋 (completions at position...)

    • Provide parameters 🧱 (cursor position, file contents...)

    • Apply actions and present results 🏋 (complete statement, go to definition...)

  15. How it'll all work
    Server responsibilities
    • The language backend (compilation server) will:

    • Await requests from the extension ⏳

    • Use parameters to analyze the files and derive information 🧐

    • Respond to the client 🗣

  16. Jane needs an API for the communication
    • Jane wants to call it "Language Server Protocol",
    but a quick Google search tells her it's taken...

    • She decides to call it "Language Backend
    Protocol" ✅

    • It'll use HTTP over local TCP ✅

    trait LanguageBackend {
      def definition(
        cursor: Position,
        file: Path
      ): Location

      def rename(
        position: Position,
        file: Path,
        newName: String
      ): List[TextEdit]

      //...
    }
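
A minimal sketch of what a server behind this trait could look like. The slides use `Position`, `Location` and `TextEdit` without defining them, so the shapes below are assumptions. `ToyBackend` and its rule that the first occurrence of a name is its definition are hypothetical, chosen only to keep the example small.

```scala
import java.nio.file.Path

// Simplified stand-ins: the slides use these types without defining them.
final case class Position(line: Int, column: Int)
final case class Location(file: Path, position: Position)
final case class TextEdit(file: Path, start: Position, length: Int, newText: String)

trait LanguageBackend {
  def definition(cursor: Position, file: Path): Location
  def rename(position: Position, file: Path, newName: String): List[TextEdit]
}

// Toy rule (an assumption): the first occurrence of a name is its definition.
final class ToyBackend(readFile: Path => String) extends LanguageBackend {
  private val identifier = "[A-Za-z_]+".r

  // Every identifier in the file, paired with the position where it starts.
  private def occurrences(file: Path): List[(String, Position)] =
    readFile(file).linesIterator.zipWithIndex.flatMap { case (text, line) =>
      identifier.findAllMatchIn(text).map(m => (m.matched, Position(line, m.start)))
    }.toList

  // The identifier under the cursor; fails loudly if there is none.
  private def nameAt(cursor: Position, file: Path): String =
    occurrences(file)
      .find { case (name, start) =>
        start.line == cursor.line &&
        start.column <= cursor.column &&
        cursor.column < start.column + name.length
      }
      .map(_._1)
      .getOrElse(sys.error(s"No identifier at $cursor"))

  def definition(cursor: Position, file: Path): Location = {
    val name = nameAt(cursor, file)
    // nameAt succeeded, so at least one occurrence exists.
    Location(file, occurrences(file).filter(_._1 == name).head._2)
  }

  def rename(position: Position, file: Path, newName: String): List[TextEdit] = {
    val name = nameAt(position, file)
    occurrences(file)
      .filter(_._1 == name)
      .map { case (_, start) => TextEdit(file, start, name.length, newName) }
  }
}
```

Note that `readFile` stands in for reading from disk, which is exactly the design decision that causes trouble later in the story.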


  17. Jane implements the changes
    • Implementing language backend

    • Implementing VS Code extension (client)

    • Adapting the IntelliJ extension (in-memory client)

    • Jane's life is good

    • until...

  18. 🤔 Jane gets a DM


    Meet Trish

  19. Trish has a similar issue
    • Her team is maintaining their own language

    • They also got asked to support multiple editors

    • Trish is thinking of doing the same (a language backend)

    • She's asking for guidance

    • Jane gets an idea 💡

  20. The frontend knows (almost) nothing
    ...about the language
    • It knows what file extension the language uses

    • It knows how to launch the server, so...

    • The editors can talk to any language backend

    • Trish decides to use Jane's Language Backend Protocol ✅

  21. There's one problem...
    Trish's language needs different features
    • Trish needs support for "call hierarchy"

    • Jane can't implement that - the language has no functions (no hierarchy)
    • The problem can be generalized

    • Server features will become optional ✅

    • Servers will describe their features to the client ✅

  22. Server capabilities
    A means for a server to describe what it can do
    • A new method in the protocol: initialize

    • Called by the client after server launches

    • The server responds with a list of supported methods

    • Unsupported features aren't advertised by the client ✅

    • Servers and clients can add features at their own pace ✅

    • Only features that make sense have to be implemented ✅
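
The capability handshake described above could be sketched roughly as follows. All names here (`supportedMethods`, `JaneBackend`, `TrishBackend`, `EditorExtension`, the feature strings) are hypothetical. The slides only say that `initialize` returns the list of methods the server supports.

```scala
// Hypothetical shape: a server describes itself as a set of supported method names.
final case class ServerCapabilities(supportedMethods: Set[String])

trait LanguageBackend {
  def initialize(): ServerCapabilities
}

// Jane's language has no functions, so her server never advertises call hierarchy.
object JaneBackend extends LanguageBackend {
  def initialize(): ServerCapabilities =
    ServerCapabilities(Set("definition", "rename"))
}

object TrishBackend extends LanguageBackend {
  def initialize(): ServerCapabilities =
    ServerCapabilities(Set("definition", "rename", "callHierarchy"))
}

object EditorExtension {
  // Everything this particular editor extension knows how to present.
  val knownFeatures: List[String] =
    List("definition", "rename", "callHierarchy", "completion")

  // Only features known to both sides are offered to the user.
  def enabledFeatures(backend: LanguageBackend): List[String] = {
    val capabilities = backend.initialize()
    knownFeatures.filter(capabilities.supportedMethods.contains)
  }
}
```

With this shape, the same extension can launch either backend and simply hides whatever the chosen server did not declare.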

  23. Jane and Trish get to work...
    • Jane's server declares its capabilities

    • Trish's new methods are added to the protocol

    • Trish implements a language backend

    • The extensions are updated to hide unsupported features

    • The extensions can be configured to launch different backends

    • Jane's life is good
    • until...

  24. 😱 User complaint


    Some actions are visibly slow...

  25. Users can't see any feedback
    In the middle of an action
    • Some actions are inherently slow ⏳

    • Sometimes you just need to tell the user to wait a little more, or ask them for confirmation

    • Only the client can initiate calls ↪

    • The protocol doesn't give the server the ability to respond with intermediate
    results ↩

  26. Jane decides to add feedback
    The protocol needs changing
    • The backend must be able to call the client

    • HTTP is not enough

    • Jane decides to use JSON-RPC

    • Both sides can run requests on a single channel 🔄 (duplex)

    • Can use std I/O, TCP, WebSockets, ...
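
To make the duplex idea concrete, here is a small sketch of the three JSON-RPC 2.0 message kinds. It assumes `params` and `result` are already-encoded JSON strings so that no JSON library is needed. The method names `definition` and `showMessage` mentioned below belong to Jane's fictional protocol, not to a real one.

```scala
// A minimal model of JSON-RPC 2.0 messages. `params` and `result` are assumed
// to be already-encoded JSON, to keep the sketch free of a JSON library.
sealed trait Message
// Expects a Response carrying the same id.
final case class Request(id: Int, method: String, params: String) extends Message
// Fire-and-forget: no id, so no response is expected.
final case class Notification(method: String, params: String) extends Message
final case class Response(id: Int, result: String) extends Message

object JsonRpc {
  // Render a message as the JSON text that goes on the wire.
  def render(message: Message): String = message match {
    case Request(id, method, params) =>
      s"""{"jsonrpc":"2.0","id":$id,"method":"$method","params":$params}"""
    case Notification(method, params) =>
      s"""{"jsonrpc":"2.0","method":"$method","params":$params}"""
    case Response(id, result) =>
      s"""{"jsonrpc":"2.0","id":$id,"result":$result}"""
  }
}
```

Nothing in these shapes says which side is the sender: the client can send a `definition` request and, on the same channel, the server can send a `showMessage` notification while it is still working.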

  27. A new API is created
    Requests to the client will use a new part of the protocol

    trait LanguageFrontend {
      def showMessage(
        tpe: "ERROR" | "WARNING" | "INFO",
        text: String
      ): IO[Unit]

      //...
    }


  28. Updates to the initialize request
    • Not every client can handle features like notifications

    • Clients will advertise their capabilities in the initialize request

    • Servers will adjust behavior based on client capabilities

    trait LanguageBackend {
      def initialize(
        clientCapabilities: ClientCapabilities
      ): ServerCapabilities

      //...
    }
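
One way a server might adjust its behavior to the client, as a rough sketch. The fields of `ClientCapabilities` and `ServerCapabilities` are not shown in the slides, so `supportsShowMessage` and `supportedMethods` are assumptions, as are the `Feedback` type and the `Backend` class.

```scala
// Hypothetical shapes: the slides name these types but don't show their fields.
final case class ClientCapabilities(supportsShowMessage: Boolean)
final case class ServerCapabilities(supportedMethods: Set[String])

sealed trait Feedback
final case class ShowMessage(text: String) extends Feedback // would be sent to the client
final case class LogOnly(text: String) extends Feedback     // kept in the server's own log

final class Backend {
  // Remembered from the initialize request; pessimistic until the client says otherwise.
  private var client = ClientCapabilities(supportsShowMessage = false)

  def initialize(clientCapabilities: ClientCapabilities): ServerCapabilities = {
    client = clientCapabilities
    ServerCapabilities(Set("definition", "rename"))
  }

  // Pick a feedback channel the client has declared it can handle.
  def report(text: String): Feedback =
    if (client.supportsShowMessage) ShowMessage(text) else LogOnly(text)
}
```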


  30. Jane gets to work
    • The editor extensions are updated to advertise their capabilities

    • The server and extensions add support for the server -> client requests

    • Language backends can now provide estimations, notifications, logs ✅

    • Jane's life is good

    • until...

  31. 💀 User complaint


    The editors are often breaking...

  32. Users can't do anything without saving the file
    Jane immediately knows the cause of the issue
    • The protocol identifies files by their disk path

    • Editors don't save files on each keystroke (disk I/O is slow)

    • The files aren't always up to date with the editor state ❌

    • The protocol needs to account for unsaved files 💾

    trait LanguageBackend {
      def definition(
        cursor: Position,
        file: Path
      ): Location

      def rename(
        position: Position,
        file: Path,
        newName: String
      ): List[TextEdit]

      //...
    }


  33. How to support unsaved files?
    • Passing file text in every request?

    • Asking the client for the latest text on-demand?

    • wasteful, non-incremental

    • What if there are many unsaved files? (e.g. no autosave)

    • Jane has an idea 💡

  34. Syncing text
    New methods in the protocol
    • New methods: onChanged/onSaved/onClosed

    • When the file changes, editor extension sends updates

    • These can be patches (if the server is capable) or entire files

    • The server will keep these in memory

    • When the file is closed/saved, the extension informs the server

    • The server can delete these from memory
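
A minimal sketch of the server-side bookkeeping this implies. The `Change` types, the `DocumentStore` class and the use of plain character offsets for patches are assumptions made for brevity. The slides only name the `onChanged`/`onSaved`/`onClosed` methods.

```scala
import scala.collection.mutable

// Hypothetical shapes for the two kinds of update the editor extension may send.
sealed trait Change
final case class FullText(text: String) extends Change
// Replace the characters in [start, end) of the current text with `newText`.
final case class Patch(start: Int, end: Int, newText: String) extends Change

final class DocumentStore(readFromDisk: String => String) {
  // Text of files that differ from what is on disk, keyed by file name.
  private val unsaved = mutable.Map.empty[String, String]

  // What the server should analyze: the editor's state if known, else the disk.
  def text(file: String): String =
    unsaved.getOrElse(file, readFromDisk(file))

  def onChanged(file: String, change: Change): Unit = {
    val current = text(file)
    val updated = change match {
      case FullText(t)                => t
      case Patch(start, end, newText) =>
        current.substring(0, start) + newText + current.substring(end)
    }
    unsaved.update(file, updated)
  }

  // After a save or close, the disk is the source of truth again.
  def onSaved(file: String): Unit = { unsaved.remove(file); () }
  def onClosed(file: String): Unit = { unsaved.remove(file); () }
}
```

Request handlers would then call `text(file)` instead of reading the path directly, so unsaved edits are visible to every feature.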

  35. Bonus points
    Caching / pre-computing
    • The server is told about all file changes, so it can cache results of analysis

    • A change in one file could trigger diagnostics in other files

    • Diagnostics, completions etc. can be computed on file change, and before they're requested

  36. Jane and Trish get to work...
    • Servers will listen to events and use the in-memory file caches if available

    • Editor extensions will send file changes

    • Jane's life is good

    • until...

  37. 🥰 Giving back


    Why keep this to ourselves?

  38. Jane wants to open source the protocol
    • The idea is generic enough to handle many more languages

    • The community might integrate more editors

    • More features could be included in the editors

  39. Jane gets to work
    • Travels in time back to ~2015

    • Joins Microsoft

    • Works on the Language Server Protocol

    • Jane's life is good!

  40. Jane will return


    (Not really, this is the end)

  41. 🔄 LSP - a summary

  42. No LSP: M * N integrations
    Source: https://code.visualstudio.com/api/language-extensions/language-server-extension-guide

  43. Yes LSP: M + N integrations
    Source: https://code.visualstudio.com/api/language-extensions/language-server-extension-guide

  44. Language Server Protocol
    • A common specification for language features in editors and tools

    • Supported by most modern editors

    • JSON-RPC 2.0 (with headers) for communication

    • Bi-directional information exchange

    • Capability mechanism
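
The "with headers" part refers to how LSP frames each JSON-RPC message: a `Content-Length` header giving the body size in bytes, a blank line, then the JSON body. Below is a small sketch of that framing. The `Framing` and `Example` objects are hypothetical helpers, and a real server would also need to handle partial reads and the optional `Content-Type` header.

```scala
import java.nio.charset.StandardCharsets.{ISO_8859_1, UTF_8}

object Framing {
  private val lengthPrefix = "Content-Length: "

  // Prefix a JSON payload with the Content-Length header (length is in bytes, not characters).
  def encode(json: String): Array[Byte] = {
    val body = json.getBytes(UTF_8)
    val header = s"$lengthPrefix${body.length}\r\n\r\n".getBytes(UTF_8)
    header ++ body
  }

  // Decode one frame; None if the header is missing/invalid or the body is incomplete.
  def decode(bytes: Array[Byte]): Option[String] = {
    val raw = new String(bytes, ISO_8859_1) // one char per byte, so indices match
    val headerEnd = raw.indexOf("\r\n\r\n")
    if (headerEnd < 0) None
    else {
      val length = raw
        .substring(0, headerEnd)
        .split("\r\n")
        .find(_.startsWith(lengthPrefix))
        .flatMap(_.stripPrefix(lengthPrefix).trim.toIntOption)
      val bodyStart = headerEnd + 4
      length.flatMap { n =>
        if (n < 0 || bytes.length - bodyStart < n) None
        else Some(new String(bytes, bodyStart, n, UTF_8))
      }
    }
  }
}

object Example {
  // A go-to-definition request as it appears in LSP; the URI and position are made up.
  val definitionRequest: String =
    """{"jsonrpc":"2.0","id":1,"method":"textDocument/definition","params":{"textDocument":{"uri":"file:///demo.toy"},"position":{"line":1,"character":7}}}"""
}
```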

  45. Example: go to definition
    https://github.com/kubukoz/badlang

  46. Go to definition in LSP

  51. 🤺 Challenges I've faced

  52. 🔎 Parsing

  53. Parsing
    Making text structured

  54. Parsing
    • Necessary to do virtually anything
    • Must keep track of locations/ranges in the input
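
As an illustration of tracking ranges, here is a tiny tokenizer that records where every token came from. The types and the token rules (identifiers, integers, single symbols) are assumptions for the sake of the sketch, not the grammar used in the talk.

```scala
// Hypothetical, simplified types: positions are zero-based (line, column) pairs.
final case class Position(line: Int, column: Int)
final case class SourceRange(start: Position, end: Position) // end is exclusive
final case class Token(text: String, range: SourceRange)

object Lexer {
  // Identifier, integer, or any single non-whitespace character.
  private val token = """[A-Za-z_][A-Za-z0-9_]*|\d+|\S""".r

  // Split the input into tokens, remembering the range each one occupies.
  def tokenize(input: String): List[Token] =
    input.split("\n", -1).toList.zipWithIndex.flatMap { case (lineText, line) =>
      token.findAllMatchIn(lineText).map { m =>
        Token(m.matched, SourceRange(Position(line, m.start), Position(line, m.end)))
      }
    }
}
```

Every later stage (diagnostics, go-to-definition, rename) can then answer "where?" by reading the range off the token or node it is working with.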

  55. Parsing, more honest
    Ranges included

  56. Parsing test harness
    Derive tests from directory structure
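
The slide's code is not part of this transcript, so the following is only a guess at the general idea: each subdirectory is one test case. The layout (`input.txt` and `expected.txt` per case) and all names are assumptions. `Files.readString` requires Java 11 or newer.

```scala
import java.nio.file.{Files, Path}
import scala.jdk.CollectionConverters._

final case class TestCase(name: String, input: String, expected: String)

object DirectoryTests {
  // Assumed layout: <root>/<case-name>/input.txt and <root>/<case-name>/expected.txt
  def discover(root: Path): List[TestCase] = {
    val entries = Files.list(root)
    try {
      entries.iterator().asScala.toList
        .filter(p => Files.isDirectory(p))
        .sortBy(_.getFileName.toString)
        .map { dir =>
          TestCase(
            dir.getFileName.toString,
            Files.readString(dir.resolve("input.txt")),
            Files.readString(dir.resolve("expected.txt"))
          )
        }
    } finally entries.close()
  }

  // Names of the cases where the function under test disagrees with expected.txt.
  def failures(root: Path, underTest: String => String): List[String] =
    discover(root).filter(tc => underTest(tc.input) != tc.expected).map(_.name)
}
```

Adding a test then means adding a directory, which keeps parser inputs readable as plain files rather than escaped string literals.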

  58. Parsing
    Graceful failure
    • Features should work even when parsing doesn't succeed

    • This is actually really hard

  59. Tree-sitter
    • Parser generator tool

    • Fast

    • Incremental

    • Graceful error handling

    • Bindings for WASM, Node,
    Rust, ∞
    https://tree-sitter.github.io/tree-sitter

  60. More ideas for graceful parsing
    "we want parsing to always succeed at producing some kind of
    structured result. The result can contain error nodes inside it, but the
    error nodes don't have to replace the entire result"


    "(...) every string matches a rule, by adding rules for erroneous inputs"
    https://duriansoftware.com/joe/constructing-human-grade-parsers
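
A very small sketch of the "error nodes" idea, using a hypothetical one-statement-per-line toy grammar (`let NAME = NUMBER`). Real grammars need much finer-grained recovery. This only shows that a bad line does not have to discard the rest of the result.

```scala
// Hypothetical toy grammar: each line is `let NAME = NUMBER`; anything else is an error node.
sealed trait Statement
final case class Let(name: String, value: Int) extends Statement
final case class Invalid(line: Int, text: String) extends Statement

object GracefulParser {
  // At most 9 digits, so the value always fits in an Int.
  private val LetLine = """\s*let\s+([A-Za-z_]\w*)\s*=\s*(\d{1,9})\s*""".r

  // Always succeeds: a bad line yields an Invalid node instead of failing the whole parse.
  def parse(input: String): List[Statement] =
    input.linesIterator.zipWithIndex.collect {
      case (LetLine(name, value), _)          => Let(name, value.toInt)
      case (text, line) if text.trim.nonEmpty => Invalid(line, text)
    }.toList

  // A feature that keeps working on partially broken input.
  def definedNames(program: List[Statement]): List[String] =
    program.collect { case Let(name, _) => name }
}
```

Completions or go-to-definition built on `definedNames` still see `a` and `b` below, even though the middle line is broken: `parse("let a = 1\nlet = oops\nlet b = 2")`.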

  61. 🖨 Formatting

  62. Formatting
    • Many ways to do this

    • Main problem (for me): keeping code comments

    • Keep them while parsing, or

    • Don't fully parse (work on token stream)

  63. Formatting
    Tests - you're gonna need these
    https://github.com/kubukoz/smithy-playground (FormattingTests.scala)

  64. 🔬 Testing

  65. Testing
    The pyramid still applies

  66. Unit testing
    Pure logic - no LSP in sight
    • Fine-grained

    • Data-in, data-out, no state, no files

    • Follow generally accepted
    best practices

  67. Component testing
    e.g. testing an entire diagnostic request handler
    • Input: e.g. string, position
    • Output: LSP-like*

    • Still not actual LSP

  68. Integration testing
    Testing with LSP and a workspace
    • Doesn't have to involve JSON-RPC

    • No need to run in a separate process

    • Definitely want to use real files, state (if the server has any)

  69. Server End to end testing
    Covering everything the server is doing
    • Run the actual server process

    • Send requests with an LSP client

  70. End to end testing
    Includes an editor
    • Effectively, tests the extension

    • Launch editor, execute its commands,
    assert on result

    • 1 test per feature*editor, if you're paranoid

    • 1 test per editor if you're not

  71. 🐞 Debugging

  72. Debugging
    Hints
    • Reproduce and minimize

    • For the fastest feedback loop, write a test

    • Use your server runtime's debugging tools

    • Check out Langoustine Tracer

    • ...println debugging

  73. Debugging

  75. 📚 Library choice

  76. Library choice
    • Standard implementations from M$ are in Node.js

    • Lsp4j for Java, libs for Haskell, Rust, ...

    • Until recently, no good library for Scala

  77. LSP in Scala
    Langoustine
    • Cross-platform, Scala-first API, functional, no reflection, no mutability, auto-cancellation

    • Easy setup with LangoustineApp
    • Still in the early stages
    Complete example of an LSP server ->

  78. Resources
    • Slides/contact: linktr.ee/kubukoz

    • My YouTube: yt.kubukoz.com

    • Example server: github.com/kubukoz/badlang

    • LSP website: microsoft.github.io/language-server-protocol

    • Langoustine: github.com/neandertech/langoustine

    • LSP history (old): github.com/microsoft/language-server-protocol/wiki/Protocol-History
