Adventures in the land of Language Servers

Have you ever wondered how your editors and IDEs are able to support so many programming languages? Perhaps you've been thinking about designing your own language and wanted to know how you can give it editor support?

This talk is for you - I've spent over a year building a small language and integrating it with code editors, and I'd like to share some of the challenges I've faced, as well as lessons I've learned in that time.

I'll also show how easy it is to build a new Language Server project in Scala 3 thanks to the Langoustine library.

Jakub Kozłowski

June 05, 2023

Transcript

  1. Who's this talk for?
     If you...
     • Want to build tooling for your language
     • Want to understand more about your language tooling, or
     • Just want to learn
     This talk is for you
  2. Jane is tasked with building a language...
     • A small DSL meant to be used in an existing application
     • The application is JVM-based (Scala)
     • the compiler also needs to work on the JVM
     • 2-3 weeks later, the compiler is done and embedded in the application
     • Jane's life is good!
     • until...
  3. Editor support
     Users want a smooth editing experience
     • Analysis (error highlighting, go-to-definition, ...)
     • Refactoring (rename, remove unused, ...)
     • Completions
  5. Editor support
     Options?
     • Writing a special editor ❌
     • Users would have to learn a new editor
     • A lot of frontend work for a small language
     • Integrating with an existing editor ✅
     • Which one(s)?
  6. Integrating with existing editors
     Users are asking for IntelliJ
     • IntelliJ is Java-based ✅
     • Can easily be integrated with Jane's compiler
     • Still requires learning IntelliJ's extension API...
     • Some time later, the extension is up and running
     • Jane's life is good
     • until...
  7. VS Code support
     • VS Code is Node.js-based ❌
     • interop with JVM-based tooling will be difficult
     • Another extension API to learn 😩
  8. Node.js <-> JVM interop
     Options?
     • Rewrite the compiler
     • Transpile the compiler
     • Shell out ✅
  9. Jane decides to shell out 🐚
     Client-server architecture
     • The editor extension (language frontend) will start a server in a separate process
     • A compilation server (language backend) will run on the JVM ☕
     • The editor extension will send RPCs to the server ☎
     • When the editor is closed, the extension will kill the server process 💀
  10. How it'll all work
     • The editor extension (language frontend) will:
     • Listen to user events 👂 (hover, clicks, keyboard shortcuts...)
     • Ask the server for information 🙋 (completions at position...)
     • Provide parameters 🧱 (cursor position, file contents...)
     • Apply actions and present results 🏋 (complete statement, go to definition...)
     Client responsibilities
  11. How it'll all work
     • The language backend (compilation server) will:
     • Await requests from the extension ⏳
     • Use parameters to analyze the files and derive information 🧐
     • Respond to the client 🗣
     Server responsibilities
  12. Jane needs an API for the communication
     • Jane wants to call it "Language Server Protocol", but a quick Google search tells her it's taken...
     • She decides to call it "Language Backend Protocol" ✅
     • It'll use HTTP over local TCP ✅

     trait LanguageBackend {
       def definition(cursor: Position, file: Path): Location
       def rename(position: Position, file: Path, newName: String): List[TextEdit]
       //...
     }
  13. Jane implements the changes
     • Implementing language backend
     • Implementing VS Code extension (client)
     • Adapting the IntelliJ extension (in-memory client)
     • Jane's life is good
     • until...
  14. Trish has a similar issue
     • Her team is maintaining their own language
     • They also got asked to support multiple editors
     • Trish is thinking of doing the same (a language backend)
     • She's asking for guidance
     • Jane gets an idea 💡
  15. The frontend knows (almost) nothing
     ...about the language
     • It knows what file extension the language uses
     • It knows how to launch the server, so...
     • The editors can talk to any language backend
     • Trish decides to use Jane's Language Backend Protocol ✅
  16. There's one problem...
     Trish's language needs different features
     • Trish needs support for "call hierarchy"
     • Jane can't implement that - the language has no functions (no hierarchy)
     • The problem can be generalized
     • Server features will become optional ✅
     • Servers will describe their features to the client ✅
  17. Server capabilities
     A means for a server to describe what it can do
     • A new method in the protocol: initialize
     • Called by the client after server launches
     • The server responds with a list of supported methods
     • Unsupported features aren't advertised by the client ✅
     • Servers and clients can add features at their own pace ✅
     • Only features that make sense have to be implemented ✅
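The capability handshake described on this slide could be sketched roughly as below. This is a minimal illustration, not the real protocol: `Feature`, `ServerCapabilities`, `JanesBackend`, `TrishsBackend` and `visibleFeatures` are all hypothetical names, and the real `initialize` call carries far more information.

```scala
// All names here are hypothetical, for illustration only.
enum Feature:
  case Definition, Rename, CallHierarchy

// What a server returns from `initialize`: the features it supports.
final case class ServerCapabilities(supported: Set[Feature])

trait LanguageBackend:
  def initialize(): ServerCapabilities

// Jane's language has no functions, so call hierarchy is left out.
object JanesBackend extends LanguageBackend:
  def initialize(): ServerCapabilities =
    ServerCapabilities(Set(Feature.Definition, Feature.Rename))

// Trish's backend supports everything the protocol knows about.
object TrishsBackend extends LanguageBackend:
  def initialize(): ServerCapabilities =
    ServerCapabilities(Feature.values.toSet)

// Client side: only surface the features the server declared.
def visibleFeatures(backend: LanguageBackend): List[Feature] =
  val caps = backend.initialize()
  Feature.values.toList.filter(caps.supported.contains)
```

With this shape, adding a new `Feature` case does not force every existing server to implement it, which is the point the slide makes about evolving at different paces.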
  18. Jane and Trish get to work...
     • Jane's server declares its capabilities
     • Trish's new methods are added to the protocol
     • Trish implements a language backend
     • The extensions are updated to hide unsupported features
     • The extensions can be configured to launch different backends
     • Jane's life is good
     • until...
  19. Users can't see any feedback
     In the middle of an action
     • Some actions are inherently slow ⏳
     • Sometimes you just need to tell the user to wait a little more, or ask them for confirmation
     • Only the client can initiate calls ↪
     • The protocol doesn't give the server the ability to respond with intermediate results ↩
  20. Jane decides to add feedback
     The protocol needs changing
     • The backend must be able to call the client
     • HTTP is not enough
     • Jane decides to use JSON-RPC
     • Both sides can run requests on a single channel 🔄 (duplex)
     • Can use std I/O, TCP, WebSockets, ...
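To make the JSON-RPC idea more concrete, here is a rough sketch of the three JSON-RPC 2.0 message kinds. Either side may send any of them over the same channel. The `Message` type and `render` function are hypothetical, and `params`/`result` are passed as pre-encoded JSON strings so the sketch needs no JSON library; method names are not escaped.

```scala
// Minimal model of JSON-RPC 2.0 messages (hypothetical types).
enum Message:
  case Request(id: Int, method: String, params: String)  // expects a Response with the same id
  case Response(id: Int, result: String)                 // answers a Request
  case Notification(method: String, params: String)      // no id, so no reply is expected

// Renders a message as JSON text. `params` and `result` must already be valid JSON.
def render(msg: Message): String = msg match
  case Message.Request(id, method, params) =>
    s"""{"jsonrpc":"2.0","id":$id,"method":"$method","params":$params}"""
  case Message.Response(id, result) =>
    s"""{"jsonrpc":"2.0","id":$id,"result":$result}"""
  case Message.Notification(method, params) =>
    s"""{"jsonrpc":"2.0","method":"$method","params":$params}"""
```

Because nothing in a `Request` says who sent it, the backend can use the same message shape to call the client as the client uses to call the backend.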
  21. A new API is created
     Requests to the client will use a new part of the protocol

     trait LanguageFrontend {
       def showMessage(tpe: "ERROR" | "WARNING" | "INFO", text: String): IO[Unit]
       //...
     }
  22. Updates to the initialize request
     • Not every client can handle features like notifications
     • Clients will advertise their capabilities in the initialize request
     • Servers will adjust behavior based on client capabilities

     trait LanguageBackend {
       def initialize(clientCapabilities: ClientCapabilities): ServerCapabilities
       //...
     }
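One way a server might adjust its behavior to client capabilities is sketched below. The field `supportsShowMessage`, the `Feedback` type and `reportToUser` are assumptions made up for this example; they are not part of any real protocol definition.

```scala
// Hypothetical capability flag sent by the client in `initialize`.
final case class ClientCapabilities(supportsShowMessage: Boolean)

// What the server decides to do with a piece of user-facing feedback.
enum Feedback:
  case ShowMessage(text: String) // would be sent to the client as a request
  case LogOnly(text: String)     // kept on the server side

// Falls back to logging when the client cannot display messages.
def reportToUser(caps: ClientCapabilities, text: String): Feedback =
  if caps.supportsShowMessage then Feedback.ShowMessage(text)
  else Feedback.LogOnly(text)
```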
  24. Jane gets to work
     • The editor extensions are updated to advertise their capabilities
     • The server and extensions add support for the server -> client requests
     • Language backends can now provide estimations, notifications, logs ✅
     • Jane's life is good
     • until...
  25. Users can't do anything without saving the file
     Jane immediately knows the cause of the issue
     • The protocol identifies files by their disk path
     • Editors don't save files on each keystroke (disk I/O is slow)
     • The files aren't always up to date with the editor state ❌
     • The protocol needs to account for unsaved files 💾

     trait LanguageBackend {
       def definition(cursor: Position, file: Path): Location
       def rename(position: Position, file: Path, newName: String): List[TextEdit]
       //...
     }
  26. How to support unsaved files?
     • Passing file text in every request?
     • Asking the client for the latest text on-demand?
     • wasteful, non-incremental
     • What if there are many unsaved files? (e.g. no autosave)
     • Jane has an idea 💡
  27. Syncing text
     New methods in the protocol
     • New methods: onChanged/onSaved/onClosed
     • When the file changes, editor extension sends updates
     • These can be patches (if the server is capable) or entire files
     • The server will keep these in memory
     • When the file is closed/saved, the extension informs the server
     • The server can delete these from memory
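The text-syncing scheme above might look roughly like this on the server side. It is a simplified sketch: `DocumentStore` and `Patch` are hypothetical, `onPatched` is an added name for the incremental variant, and patches use plain character offsets rather than line/column positions.

```scala
import scala.collection.mutable

// A patch replaces the characters in [start, end) with `text`.
final case class Patch(start: Int, end: Int, text: String)

// Server-side store of unsaved file contents, keyed by path.
final class DocumentStore:
  private val unsaved = mutable.Map.empty[String, String]

  // Full sync: the editor sends the entire file.
  def onChanged(path: String, fullText: String): Unit =
    unsaved(path) = fullText

  // Incremental sync: the editor sends only what changed.
  // Ignored if the file has no in-memory text yet.
  def onPatched(path: String, patch: Patch): Unit =
    unsaved.get(path).foreach { old =>
      unsaved(path) = old.substring(0, patch.start) + patch.text + old.substring(patch.end)
    }

  // Once saved or closed, the in-memory copy is no longer needed.
  def onSaved(path: String): Unit = forget(path)
  def onClosed(path: String): Unit = forget(path)

  // In-memory text if present; a caller would fall back to disk on None.
  def textOf(path: String): Option[String] = unsaved.get(path)

  private def forget(path: String): Unit =
    unsaved.remove(path)
    ()
```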
  28. Bonus points
     Caching / pre-computing
     • The server is told about all file changes, so it can cache results of analysis
     • A change in one file could trigger diagnostics in other files
     • Diagnostics, completions etc. can be computed on file change, and before they're requested
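A small sketch of pre-computing on change: analysis runs once when a file changes, and later requests are served from the cache. The `analyze` function here is a stand-in that flags lines containing "TODO"; it and the other names are assumptions made for the example.

```scala
import scala.collection.mutable

final case class Diagnostic(line: Int, message: String)

// Stand-in analysis: reports every line that contains "TODO".
def analyze(text: String): List[Diagnostic] =
  text.linesIterator.zipWithIndex.collect {
    case (line, i) if line.contains("TODO") => Diagnostic(i, "unresolved TODO")
  }.toList

final class DiagnosticsCache:
  private val cache = mutable.Map.empty[String, List[Diagnostic]]
  var analysisRuns = 0 // exposed only to show when work happens

  // Called on every file change: compute eagerly, before anyone asks.
  def onChanged(path: String, text: String): Unit =
    analysisRuns += 1
    cache(path) = analyze(text)

  // Called on request: no analysis, just a lookup.
  def diagnostics(path: String): List[Diagnostic] =
    cache.getOrElse(path, Nil)
```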
  29. Jane and Trish get to work...
     • Servers will listen to events and use the in-memory file caches if available
     • Editor extensions will send file changes
     • Jane's life is good
     • until...
  30. Jane wants to open source the protocol
     • The idea is generic enough to handle many more languages
     • The community might integrate more editors
     • More features could be included in the editors
  31. Jane gets to work
     • Travels in time back to ~2015
     • Joins Microsoft
     • Works on the Language Server Protocol
     • Jane's life is good!
  32. Language Server Protocol
     • A common specification for language features in editors and tools
     • Supported by most modern editors
     • JSON-RPC 2.0 (with headers) for communication
     • Bi-directional information exchange
     • Capability mechanism
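The "with headers" part refers to LSP's base protocol, where each JSON-RPC payload is preceded by a `Content-Length` header (counted in bytes) and a blank line. A minimal sketch of framing and unframing one message follows; `frame` and `unframe` are hypothetical helper names, and a real implementation would read from a stream rather than a complete byte array.

```scala
import java.nio.charset.StandardCharsets.{ISO_8859_1, UTF_8}

// Prefixes a JSON payload with its Content-Length header and a blank line.
def frame(json: String): Array[Byte] =
  val body = json.getBytes(UTF_8)
  val header = s"Content-Length: ${body.length}\r\n\r\n".getBytes(UTF_8)
  header ++ body

// Reads one framed message. Returns None if the header is missing
// or fewer bytes are available than the header promises.
def unframe(bytes: Array[Byte]): Option[String] =
  // ISO-8859-1 maps each byte to one char, so char indices equal byte offsets.
  val raw = new String(bytes, ISO_8859_1)
  val headerEnd = raw.indexOf("\r\n\r\n")
  if headerEnd < 0 then None
  else
    val length = raw.take(headerEnd).split("\r\n").collectFirst {
      case s"Content-Length: $n" => n.trim.toIntOption
    }.flatten
    length.flatMap { n =>
      val start = headerEnd + 4
      Option.when(bytes.length >= start + n)(new String(bytes, start, n, UTF_8))
    }
```

The length is in bytes, not characters, which matters as soon as a payload contains non-ASCII text.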
  33. Parsing
     • Necessary to do virtually anything
     • Must keep track of locations/ranges in the input
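As a small illustration of keeping track of ranges, here is a sketch of a whitespace tokenizer that records where each token came from, plus a lookup by cursor offset (the kind of query that hover or go-to-definition starts with). `Span`, `Token`, `tokenize` and `tokenAt` are hypothetical names; real tools usually track line/column positions as well.

```scala
// Half-open range [start, end) of character offsets in the input.
final case class Span(start: Int, end: Int)
final case class Token(text: String, span: Span)

// Splits on whitespace, remembering the source range of each token.
def tokenize(input: String): List[Token] =
  val tokens = List.newBuilder[Token]
  var i = 0
  while i < input.length do
    if input(i).isWhitespace then i += 1
    else
      val start = i
      while i < input.length && !input(i).isWhitespace do i += 1
      tokens += Token(input.substring(start, i), Span(start, i))
  tokens.result()

// Finds the token under the cursor, if any.
def tokenAt(tokens: List[Token], offset: Int): Option[Token] =
  tokens.find(t => t.span.start <= offset && offset < t.span.end)
```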
  34. Parsing
     Graceful failure
     • Features should work even when parsing doesn't succeed
     • This is actually really hard
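One common way to keep features working on broken input is to make the parser always return a structured result, recording unparseable pieces as error nodes instead of failing outright. The toy grammar below (whitespace-separated integers) and the `Node` type are assumptions chosen to keep the sketch short.

```scala
enum Node:
  case Num(value: Int)
  case Error(text: String) // unparseable input, kept in the tree

// Parses whitespace-separated integers. Never fails: bad tokens become
// Error nodes, so the valid items around them remain usable.
def parseNumbers(input: String): List[Node] =
  input.split("\\s+").toList.filter(_.nonEmpty).map { tok =>
    tok.toIntOption match
      case Some(n) => Node.Num(n)
      case None    => Node.Error(tok)
  }
```

A diagnostics handler could then report each `Error` node while completions and navigation keep working on the `Num` nodes.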
  35. Tree-sitter
     • Parser generator tool
     • Fast
     • Incremental
     • Graceful error handling
     • Bindings for WASM, Node, Rust, ∞
     https://tree-sitter.github.io/tree-sitter
  36. More ideas for graceful parsing
     "we want parsing to always succeed at producing some kind of structured result. The result can contain error nodes inside it, but the error nodes don't have to replace the entire result"
     "(...) every string matches a rule, by adding rules for erroneous inputs"
     https://duriansoftware.com/joe/constructing-human-grade-parsers
  37. Formatting
     • Many ways to do this
     • Main problem (for me): keeping code comments
     • Keep them while parsing, or
     • Don't fully parse (work on token stream)
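To illustrate the second option, here is a very rough sketch of a formatter that never builds a syntax tree: it normalizes spacing between whitespace-separated words and passes trailing comments through untouched. It assumes `//` line comments and no string literals, both simplifications made for this example.

```scala
// Normalizes whitespace in the code part of a line and keeps any
// trailing `//` comment as written.
def formatLine(line: String): String =
  val (code, comment) = line.indexOf("//") match
    case -1 => (line, "")
    case i  => (line.take(i), line.drop(i))
  // A crude "token stream": words separated by whitespace.
  val tidyCode = code.trim.split("\\s+").mkString(" ")
  List(tidyCode, comment.trim).filter(_.nonEmpty).mkString(" ")
```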
  38. Unit testing
     Pure logic - no LSP in sight
     • Fine-grained
     • Data-in, data-out, no state, no files
     • Follow generally accepted best practices
  39. Component testing
     e.g. testing an entire diagnostic request handler
     • Input: e.g. string, position
     • Output: LSP-like*
     • Still not actual LSP
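A component-level handler of this shape could look like the sketch below: it takes a string and a position and returns an LSP-like result, without any protocol machinery. The completion handler, the `keywords` list and `CompletionItem` are all made up for the example.

```scala
// LSP-like output type (hypothetical, much smaller than the real one).
final case class CompletionItem(label: String)

// Keywords of an imaginary language.
val keywords = List("let", "letrec", "if", "in")

// Handler under test: plain string and offset in, completion items out.
def completionsAt(text: String, offset: Int): List[CompletionItem] =
  // The word being typed: letters immediately before the cursor.
  val prefix = text.take(offset).reverse.takeWhile(_.isLetter).reverse
  keywords.filter(_.startsWith(prefix)).map(CompletionItem(_))
```

A test for it is then a plain assertion on data, for instance checking that `completionsAt("x = le", 6)` yields the items for `let` and `letrec`.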
  40. Integration testing
     Testing with LSP and a workspace
     • Doesn't have to involve JSON-RPC
     • No need to run in a separate process
     • Definitely want to use real files, state (if the server has any)
  41. Server End to end testing
     Covering everything the server is doing
     • Run the actual server process
     • Send requests with an LSP client
  42. End to end testing
     Includes an editor
     • Effectively, tests the extension
     • Launch editor, execute its commands, assert on result
     • 1 test per feature*editor, if you're paranoid
     • 1 test per editor if you're not
  43. Debugging
     Hints
     • Reproduce and minimize
     • For the fastest feedback loop, write a test
     • Use your server runtime's debugging tools
     • Check out Langoustine Tracer
     • ...println debugging
  44. Library choice
     • Standard implementations from M$ are in Node.js
     • Lsp4j for Java, libs for Haskell, Rust, ...
     • Until recently, no good library for Scala
  45. LSP in Scala
     Langoustine
     • Cross-platform, Scala-first API, functional, no reflection, no mutability, auto-cancellation
     • Easy setup with LangoustineApp
     • Still in the early stages
     Complete example of an LSP server ->
  46. Resources
     • Slides/contact: linktr.ee/kubukoz
     • My YouTube: yt.kubukoz.com
     • Example server: github.com/kubukoz/badlang
     • LSP website: microsoft.github.io/language-server-protocol
     • Langoustine: github.com/neandertech/langoustine
     • LSP history (old): github.com/microsoft/language-server-protocol/wiki/Protocol-History