[COSCUP2018] Language Server Protocol explained

Atsushi Eno
August 11, 2018

- at COSCUP 2018 (future presentation, subject to change)

Transcript

  1. 2.

    What is LSP = Language Server Protocol?
    • Microsoft's own specification (not a formal standard)
    • It lets text editors offer IDE-like features to programmers
    • Protocol between editors and (programming) language toolchains
  2. 3.

    Who benefits from LSP?
    • Text editor developers
      ◦ They don't have to write code to "support XYZ language" anymore
    • Programming language toolchain developers
      ◦ They don't have to write an "XYZ editor plugin" anymore
    • YOU!
    • Microsoft
  3. 4.

    [Language Server] Protocol
    • Traditional languages: compile sources to libraries or executables
    • Modern languages (in certain cultures): also provide libraries/tools/features that let developers programmatically compile the sources and query the state of the sources and referenced libraries, i.e. the toolchain works as a language service
    • Existing language servers (or the like): clang, Roslyn, TypeScript server
  4. 5.

    What to query Language Servers?
    • "Which members of this `Foo` are available?" - Completion
    • "Where is this `Bar` defined?" - Go To Definition
    • "From which source lines is this `Baz` referenced/used?" - Find Usages
    • etc.
  5. 6.

    Language Server [Protocol]
    • communication between editors and language toolchains
    • Why "protocol"?
      ◦ should not be exclusive to specific programming languages
      ◦ If it were a C library, web-based editors would not be able to support it (wasm? / text encoding?)
    • uses JSON-RPC
      ◦ gnome-code-assistance: depends on DBus = GNOME specific
    • common subset of features from many languages
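To make the "uses JSON-RPC" point concrete, here is a minimal sketch of how a message travels on the wire: a JSON-RPC 2.0 payload preceded by a `Content-Length` header and a blank line, as the LSP base protocol describes. The helper names `encode_message` and `decode_message` and the file URI are made up for this illustration, and Python is used only for brevity.

```python
import json


def encode_message(payload: dict) -> bytes:
    """Frame a JSON-RPC payload with the Content-Length header used by LSP."""
    body = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    # Content-Length counts bytes of the body, not characters.
    header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
    return header + body


def decode_message(data: bytes) -> dict:
    """Parse one framed message (assumes `data` holds exactly one message)."""
    header, _, rest = data.partition(b"\r\n\r\n")
    length = None
    for line in header.split(b"\r\n"):
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            length = int(value.strip())
    if length is None:
        raise ValueError("missing Content-Length header")
    return json.loads(rest[:length].decode("utf-8"))


# A request carries an id and expects a response; a notification has no id.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "textDocument/hover",
    "params": {
        "textDocument": {"uri": "file:///tmp/example.xyz"},  # hypothetical file
        "position": {"line": 0, "character": 4},
    },
}
framed = encode_message(request)
```

Because the framing is plain text plus JSON, any editor or toolchain that can read and write a byte stream can take part, which is the argument the slide makes against a C library.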
  6. 7.

    Language Server and Language Client
    • Server: a (service-based) compiler
    • Client: basically a text editor
    • Available Clients
      ◦ vscode, eclipse, emacs, vim, atom, VS
    • Available Servers
      ◦ "some" languages, but already too many to list.
  7. 8.

    LSP messages
    • Two kinds of roles:
      ◦ source management
        ▪ notify editor buffer changes to language server
      ◦ language features
        ▪ interesting part
  8. 10.

    Code completion
    textDocument/completion to query candidates, and completionItem/resolve to fill in the details of a selected candidate.
    textDocument/signatureHelp to get help for method arguments etc.
    Completion triggers are not fixed to dot '.': they may be specified by the server at `initialize` (completion may also be triggered by CTRL+SPACE etc.).
    Signature help triggers are not fixed to comma ',' either: they are also customizable.
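As an illustration of server-specified triggers, the sketch below shows the capabilities a hypothetical server might return from `initialize`, and how a client could decide which request to send after a keystroke. The function `requests_for_typed_char` and the chosen trigger characters are assumptions for this example, not something taken from the talk.

```python
# Capabilities a hypothetical server might return from `initialize`.
initialize_result = {
    "capabilities": {
        "completionProvider": {
            "resolveProvider": True,          # server handles completionItem/resolve
            "triggerCharacters": [".", ":"],  # not limited to '.'
        },
        "signatureHelpProvider": {
            "triggerCharacters": ["(", ","],
        },
    }
}


def requests_for_typed_char(capabilities: dict, ch: str) -> list:
    """Return the LSP methods a client would send after `ch` is typed."""
    methods = []
    completion = capabilities.get("completionProvider", {})
    if ch in completion.get("triggerCharacters", []):
        methods.append("textDocument/completion")
    signature = capabilities.get("signatureHelpProvider", {})
    if ch in signature.get("triggerCharacters", []):
        methods.append("textDocument/signatureHelp")
    return methods


caps = initialize_result["capabilities"]
print(requests_for_typed_char(caps, "."))  # ['textDocument/completion']
print(requests_for_typed_char(caps, ","))  # ['textDocument/signatureHelp']
```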
  9. 12.

    Global search
    workspace/symbol could be used to implement a "global search" feature: "find anything that matches in the project"
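A rough sketch of what a `workspace/symbol` handler could look like, assuming the server already holds a project-wide symbol index. The `INDEX` contents, the file URIs, and the substring matching rule are invented for illustration; the `SymbolKind` numbers (Class = 5, Function = 12) come from the LSP specification.

```python
# SymbolKind values from the LSP specification: Class = 5, Function = 12.
CLASS, FUNCTION = 5, 12


def make_symbol(name, kind, uri, line):
    """Build a SymbolInformation-shaped dict for a symbol on a single line."""
    return {
        "name": name,
        "kind": kind,
        "location": {
            "uri": uri,
            "range": {
                "start": {"line": line, "character": 0},
                "end": {"line": line, "character": len(name)},
            },
        },
    }


# Hypothetical project-wide index; a real server would build it from its parser.
INDEX = [
    make_symbol("Parser", CLASS, "file:///src/parser.xyz", 3),
    make_symbol("parse_expr", FUNCTION, "file:///src/parser.xyz", 10),
    make_symbol("Lexer", CLASS, "file:///src/lexer.xyz", 1),
]


def workspace_symbol(params: dict) -> list:
    """Handle workspace/symbol with case-insensitive substring matching."""
    query = params.get("query", "").lower()
    return [s for s in INDEX if query in s["name"].lower()]


names = [s["name"] for s in workspace_symbol({"query": "pars"})]
print(names)  # ['Parser', 'parse_expr']
```

How the query is matched is left to the server, so fuzzy matching would be an equally valid choice here.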
  10. 13.

    Coding hints
    • textDocument/definition - get locations for the symbol (typically) at cursor.
      ◦ typically for "Go To Definition"
      ◦ It can point to multiple source locations
    • textDocument/typeDefinition - similar to `definition`, but for the "type" of the symbol at cursor.
      ◦ for `val` (Kotlin), `var` (C#), `let` (TypeScript) etc.
    • textDocument/hover - can be API definitions etc. but not limited to them.
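The following sketch shows one way a `textDocument/definition` handler might answer with more than one location. It assumes a pre-built symbol table; `DEFINITIONS`, `location`, `word_at`, and the file URIs are hypothetical names for this example, and a real server would resolve the symbol semantically rather than by spelling.

```python
import re


def location(uri, line, start, end):
    """Build an LSP Location dict for a single-line range."""
    return {
        "uri": uri,
        "range": {
            "start": {"line": line, "character": start},
            "end": {"line": line, "character": end},
        },
    }


# Hypothetical symbol table: identifier -> every place it is defined.
DEFINITIONS = {
    "Bar": [
        location("file:///src/bar.xyz", 0, 6, 9),
        location("file:///src/bar_ext.xyz", 2, 14, 17),
    ],
}


def word_at(text: str, line: int, character: int) -> str:
    """Return the identifier under the cursor, or '' if there is none."""
    lines = text.split("\n")
    if line >= len(lines):
        return ""
    for match in re.finditer(r"[A-Za-z_]\w*", lines[line]):
        if match.start() <= character <= match.end():
            return match.group()
    return ""


def definition(text: str, params: dict) -> list:
    """Handle textDocument/definition for the given document text."""
    pos = params["position"]
    return DEFINITIONS.get(word_at(text, pos["line"], pos["character"]), [])


source = "let x = Bar.create()\n"
found = definition(source, {"position": {"line": 0, "character": 9}})
print(len(found))  # 2
```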
  11. 14.

    Too many features to show here (in 30 min.)
    • completion, /resolve
    • hover
    • signatureHelp
    • definition
    • typeDefinition
    • implementation
    • references
    • documentHighlight
    • documentSymbol
    • codeAction
    • codeLens, /resolve
    • documentLink, /resolve
    • documentColor
    • colorPresentation
    • formatting
    • rangeFormatting
    • onTypeFormatting
    • rename
    • foldingRange
    • symbol
    • publishDiagnostics
  12. 15.

    And there are messages for editing too
    They are not interesting, but text editors and language servers need to exchange them.
    • textDocument/didOpen
    • textDocument/didChange
    • textDocument/willSave
    • textDocument/didSave
    • textDocument/willSaveWaitUntil
    • textDocument/didClose
    • workspace/applyEdit (this one goes the other way: the server asks the editor to apply an edit)
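On the server side, these notifications mostly amount to keeping a copy of each open buffer. Below is a minimal sketch assuming full-text synchronization, where every `didChange` carries the whole document text; `DocumentStore` and the file URI are hypothetical names, and incremental sync is deliberately left out.

```python
class DocumentStore:
    """Keeps the latest text of every open document, keyed by URI."""

    def __init__(self):
        self.documents = {}  # uri -> (version, text)

    def handle(self, method: str, params: dict) -> None:
        doc = params["textDocument"]
        if method == "textDocument/didOpen":
            self.documents[doc["uri"]] = (doc["version"], doc["text"])
        elif method == "textDocument/didChange":
            # Full sync: the last change event carries the whole new text.
            text = params["contentChanges"][-1]["text"]
            self.documents[doc["uri"]] = (doc["version"], text)
        elif method == "textDocument/didClose":
            self.documents.pop(doc["uri"], None)


store = DocumentStore()
uri = "file:///tmp/example.xyz"  # hypothetical file
store.handle("textDocument/didOpen", {
    "textDocument": {"uri": uri, "languageId": "xyz", "version": 1, "text": "a"},
})
store.handle("textDocument/didChange", {
    "textDocument": {"uri": uri, "version": 2},
    "contentChanges": [{"text": "ab"}],
})
print(store.documents[uri])  # (2, 'ab')
```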
  13. 18.

    ... is it really time to implement my own LS?
    Language Server Protocol fits with...
    • simple-to-mid-level languages
      ◦ for languages that are too simple you wouldn't even need any editor support.
      ◦ If you only need syntax highlighting, just use tmLanguage.
    • it should be noted that LSP covers only a common subset of what various languages need.
      ◦ "language servers and IDEs" - critical post on LSP (rust rls/KDevelop)
  14. 19.

    Dive into VSCode sources
    This time we only track vscode (because it is kind of a normative reference).
    Microsoft/vscode on GitHub.
    Language support is usually packaged as a "language extension".
  15. 20.

    Language extension types
    • Extensions without language service (ignorable)
      ◦ e.g. extension/css (has `css.tmLanguage.json` which works without LS)
    • Extensions with language service
      ◦ Direct implementation - uses only vscode API
      ◦ LSP implementation - uses LS (`vscode-languageserver` package)
        ▪ e.g. extension/css-language-features
    Language Extension Guidelines - very nice documentation about them.
  16. 21.

    How vscode LS extensions are organized
    editor (host), language client, and language server. Server can be out-of-process.
    • VSCode
      ◦ Language Feature Extension
        ▪ "client": entrypoint (JS) `activate() { ... }`; package.json has "main": "src/out/entrypoint"
        ▪ "server": implementation; listens and notifies on stdio/ipc
  17. 22.

    node packages to implement LS
    • take a look at `src/extension/css-language-features` (as an example)
      ◦ `client` // extension entrypoint (`activate()`), launches server
      ◦ `server` // actual LS, typically standalone executable
        ▪ import "vscode-languageserver" // LSP for node.js
          [repo] Microsoft/vscode-languageserver-node
        ▪ import "vscode-css-languageservice" // actual CSS parser/compiler
          [repo] Microsoft/vscode-css-languageservice
    Server can be implemented in any language. You can use stdio.
    Useful example: [repo] Microsoft/vscode-languageserver-node-example
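Since the server can be written in any language and talk over stdio, here is a small sketch of a server loop outside node.js. It uses only the Python standard library rather than the `vscode-languageserver` package; `read_message`, `write_message`, and `serve` are names invented for this example, and the only capability it advertises is a placeholder.

```python
import io
import json


def read_message(stream):
    """Read one Content-Length framed message; return None at end of input."""
    length = None
    while True:
        line = stream.readline()
        if not line:
            return None
        if line == b"\r\n":
            break
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            length = int(value.strip())
    if length is None:
        raise ValueError("missing Content-Length header")
    return json.loads(stream.read(length).decode("utf-8"))


def write_message(stream, payload):
    """Write one framed JSON-RPC message to a binary stream."""
    body = json.dumps(payload).encode("utf-8")
    stream.write(f"Content-Length: {len(body)}\r\n\r\n".encode("ascii") + body)


def serve(reader, writer):
    """Answer requests until `exit` arrives or the input ends."""
    while True:
        msg = read_message(reader)
        if msg is None or msg.get("method") == "exit":
            break
        if "id" not in msg:
            continue  # a notification: no response is sent
        reply = {"jsonrpc": "2.0", "id": msg["id"]}
        if msg["method"] == "initialize":
            reply["result"] = {"capabilities": {"hoverProvider": True}}
        elif msg["method"] == "shutdown":
            reply["result"] = None
        else:
            # -32601 is the JSON-RPC "method not found" error code.
            reply["error"] = {"code": -32601, "message": "unsupported method"}
        write_message(writer, reply)


def frame(payload):
    """Return the framed bytes for one message (demo helper)."""
    out = io.BytesIO()
    write_message(out, payload)
    return out.getvalue()


# In-memory demo; a real server would pass sys.stdin.buffer and
# sys.stdout.buffer instead, and flush after each write.
incoming = io.BytesIO(
    frame({"jsonrpc": "2.0", "id": 1, "method": "initialize", "params": {}})
    + frame({"jsonrpc": "2.0", "method": "initialized", "params": {}})
    + frame({"jsonrpc": "2.0", "id": 2, "method": "shutdown"})
    + frame({"jsonrpc": "2.0", "method": "exit"})
)
outgoing = io.BytesIO()
serve(incoming, outgoing)
outgoing.seek(0)
first = read_message(outgoing)
```

The editor-side "client" would launch such a process and connect its stdin/stdout to the same framing.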
  18. 23.

    Implementing language services
    • You are going to either implement a compiler, or make changes to an existing one
    • Compilation phases
      ◦ parsing: build token trees
        ▪ folding parser (#if etc.)
        ▪ tokenizer/lexer and parser (syntax errors)
      ◦ semantic analysis: build semantic trees
        ▪ symbol resolution (check unknown identifiers)
        ▪ validation (inheritance, type checking)
      ◦ code generation - most likely irrelevant to language services
  19. 24.

    TIPs on implementing language server
    • Implement your parser so that it always records precise locations, for both the start and the end of each token.
    • Don't try too hard, start implementing from whichever you can.
      ◦ e.g. compile everything whenever a language feature is requested.
      ◦ e.g. compile and update symbol locations only when sources are saved.
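The first tip can be sketched with a toy tokenizer that records a zero-based start and an exclusive end for every token, which maps directly onto an LSP `Range`. The grammar and the `Token` type are invented for this example. One assumption to keep in mind: LSP counts characters in UTF-16 code units, while Python indexes code points, so the offsets below only line up for text within the Basic Multilingual Plane.

```python
import re
from dataclasses import dataclass


@dataclass
class Token:
    kind: str
    text: str
    start: tuple  # (line, character), zero-based like LSP positions
    end: tuple    # exclusive end, as in an LSP Range


# Toy grammar, invented for this sketch: identifiers, integers, punctuation.
TOKEN_RE = re.compile(r"(?P<ident>[A-Za-z_]\w*)|(?P<number>\d+)|(?P<punct>[^\s\w])")


def tokenize(source: str) -> list:
    """Split source into tokens, keeping start and end positions for each."""
    tokens = []
    for line_no, line in enumerate(source.split("\n")):
        for m in TOKEN_RE.finditer(line):
            tokens.append(
                Token(m.lastgroup, m.group(), (line_no, m.start()), (line_no, m.end()))
            )
    return tokens


for tok in tokenize("foo = 1\nbar(foo)"):
    print(tok)
```

With both ends recorded, features such as `documentHighlight` or `rename` can return exact ranges without re-scanning the source.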
  20. 25.

    TIPs on implementing language server
    • EASY:
      ◦ `publishDiagnostics`: just compile and get errors, then report them
      ◦ `documentSymbol`: get source code tree (AST) and return the locations.
    • HARD:
      ◦ `definition`: need to get AST, then find where "current token" is, resolve semantic tree, search symbol by identifier from semantic tree.
      ◦ `completion`: similarly get semantic node from "current token", then return list of available members depending on context (e.g. static or instance).
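To show why `publishDiagnostics` sits on the EASY side, the sketch below "compiles" a document and turns the error into the notification's params. Python's own parser (`ast.parse`) stands in for the compiler purely as an assumption of this example, since the talk does not name a language; `compile_and_diagnose` and the `toy-compiler` source label are hypothetical. Because `ast.parse` stops at the first syntax error, at most one diagnostic is reported.

```python
import ast


def compile_and_diagnose(uri: str, source: str) -> dict:
    """Build textDocument/publishDiagnostics params for one document."""
    diagnostics = []
    try:
        ast.parse(source)
    except SyntaxError as err:
        # SyntaxError positions are 1-based; LSP positions are 0-based.
        line = max((err.lineno or 1) - 1, 0)
        character = max((err.offset or 1) - 1, 0)
        diagnostics.append({
            "range": {
                "start": {"line": line, "character": character},
                "end": {"line": line, "character": character + 1},
            },
            "severity": 1,  # DiagnosticSeverity.Error
            "source": "toy-compiler",  # hypothetical name
            "message": err.msg,
        })
    return {"uri": uri, "diagnostics": diagnostics}


good = compile_and_diagnose("file:///tmp/ok.py", "x = 1\n")
bad = compile_and_diagnose("file:///tmp/bad.py", "def f(:\n    pass\n")
print(len(good["diagnostics"]), len(bad["diagnostics"]))  # 0 1
```

Sending an empty `diagnostics` list matters too: it is how the server clears errors the editor is still showing for that file.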
  21. 26.

    Summary
    • LSP is good for editor developers and language developers and their users.
    • LSP provides various features e.g. completion, definition, references...
      ◦ Yet the feature set is limited to common usages.
    • You can implement LS in any language, and adapt it in a language extension
      ◦ You can use stdio.
  22. 27.