What is LSP (Language Server Protocol)?
● A specification defined by Microsoft (openly published, but not from a standards body)
● It lets text editors offer IDE-like features to programmers
● A protocol between editors and (programming) language toolchains
Who benefits from LSP?
● Text editor developers
  ○ They no longer have to write code to "support XYZ language"
● Programming language toolchain developers
  ○ They no longer have to write an "XYZ editor plugin"
● YOU!
● Microsoft
[Language Server] Protocol
● Traditional languages: compile sources to libraries or executables
● Modern languages (in certain communities): also provide libraries/tools/features that let developers programmatically compile the sources and query state about them and their referenced libraries = the compiler works as a language service
● Existing language servers (or similar): clang, Roslyn, TypeScript server
What do you ask a language server?
● "Which members of this `Foo` are available?" - Completion
● "Where is this `Bar` defined?" - Go To Definition
● "From which source lines is this `Baz` referenced/used?" - Find Usages
● etc.
Language Server [Protocol]
● Communication between editors and language toolchains
● Why a "protocol"?
  ○ It should not be exclusive to specific programming languages
  ○ If it were a C library, web-based editors could not support it (wasm? / text encoding?)
● Uses JSON-RPC
  ○ In contrast, gnome-code-assistance depends on D-Bus, which makes it GNOME-specific
● Covers a common subset of features across many languages
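To make the "uses JSON-RPC" point concrete: on the wire, each LSP message is a JSON-RPC 2.0 body preceded by a `Content-Length` header (the body size in UTF-8 bytes) and a blank line. The following is a minimal sketch of that framing; `frameMessage` and `utf8Length` are hypothetical helper names, and the URI is made up for illustration.

```typescript
// Shape of a JSON-RPC 2.0 request as used by LSP.
interface RequestMessage {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: unknown;
}

// Count UTF-8 bytes without Node- or browser-specific APIs.
// Assumes the input is well-formed UTF-16 (no lone surrogates).
function utf8Length(text: string): number {
  let bytes = 0;
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    if (code < 0x80) {
      bytes += 1;
    } else if (code < 0x800) {
      bytes += 2;
    } else if (code >= 0xd800 && code <= 0xdbff) {
      bytes += 4; // surrogate pair encodes one 4-byte code point
      i++;        // skip the low surrogate
    } else {
      bytes += 3;
    }
  }
  return bytes;
}

// Prefix the JSON body with the Content-Length header that LSP requires.
function frameMessage(message: RequestMessage): string {
  const body = JSON.stringify(message);
  return "Content-Length: " + utf8Length(body) + "\r\n\r\n" + body;
}

// Example: a "Go To Definition" request for a hypothetical file.
const framedRequest = frameMessage({
  jsonrpc: "2.0",
  id: 1,
  method: "textDocument/definition",
  params: {
    textDocument: { uri: "file:///tmp/example.txt" },
    position: { line: 0, character: 4 },
  },
});
```

Because the framing is plain text over any byte stream, the same message can travel over stdio, a socket, or a pipe.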
Language Server and Language Client
● Server: a (service-based) compiler
● Client: basically a text editor
● Available Clients
  ○ vscode, eclipse, emacs, vim, atom, VS
● Available Servers
  ○ "some" languages, but already too many to list.
Code completion
● textDocument/completion queries candidates, and completionItem/resolve fills in additional details (e.g. documentation) for a single candidate.
● textDocument/signatureHelp gets help for method arguments etc.
● Completion triggers are not fixed to dot '.'; the server may specify them at `initialize` (completion may also be triggered by CTRL+SPACE etc.)
● Signature help triggers are not fixed to comma ','; they are also customizable.
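The trigger characters are advertised in the server capabilities returned from `initialize`, under `completionProvider.triggerCharacters` and `signatureHelpProvider.triggerCharacters`. Here is a small sketch of how a client could use them to decide which request a typed character should fire. `requestForTypedChar` and `exampleCaps` are hypothetical names, and the trigger characters shown are only illustrative.

```typescript
// Subset of the server capabilities returned from `initialize`.
interface ServerCapabilities {
  completionProvider?: { triggerCharacters?: string[]; resolveProvider?: boolean };
  signatureHelpProvider?: { triggerCharacters?: string[] };
}

// Hypothetical client-side helper: which request, if any, does a typed character trigger?
function requestForTypedChar(caps: ServerCapabilities, typed: string): string | undefined {
  const completionTriggers = caps.completionProvider?.triggerCharacters ?? [];
  const signatureTriggers = caps.signatureHelpProvider?.triggerCharacters ?? [];
  if (completionTriggers.indexOf(typed) >= 0) return "textDocument/completion";
  if (signatureTriggers.indexOf(typed) >= 0) return "textDocument/signatureHelp";
  return undefined;
}

// Example capabilities a server might return (illustrative values).
const exampleCaps: ServerCapabilities = {
  completionProvider: { triggerCharacters: ["."], resolveProvider: true },
  signatureHelpProvider: { triggerCharacters: ["(", ","] },
};
```

With `exampleCaps`, typing `.` maps to `textDocument/completion`, typing `,` maps to `textDocument/signatureHelp`, and any other character triggers nothing automatically.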
Coding hints
● textDocument/definition - get locations for the symbol (typically) at the cursor.
  ○ typically used for "Go To Definition"
  ○ it can point to multiple source locations
● textDocument/typeDefinition - similar to `definition`, but for the "type" of the symbol at the cursor.
  ○ for `val` (Kotlin), `var` (C#), `let` (TypeScript) etc.
● textDocument/hover - can be API definitions etc., but is not limited to them.
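Since `textDocument/definition` can point to one location, several, or none, a client usually normalizes the result before using it. This is a minimal sketch under that assumption; the type names are prefixed with `Lsp` only to avoid clashing with browser globals, `toLocations` is a hypothetical helper, and the `LocationLink[]` form that the specification also permits is left out.

```typescript
interface LspPosition { line: number; character: number; }
interface LspRange { start: LspPosition; end: LspPosition; }
interface LspLocation { uri: string; range: LspRange; }

// Simplified result type: a single location, a list of locations, or null.
type DefinitionResult = LspLocation | LspLocation[] | null;

// Normalize every result shape into a (possibly empty) array.
function toLocations(result: DefinitionResult): LspLocation[] {
  if (result === null) return [];
  return Array.isArray(result) ? result : [result];
}
```

The editor can then jump directly when the array has one entry and show a picker when it has several.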
And there are messages for editing too
They are not interesting, but text editors need to send them to language servers.
● textDocument/didOpen
● textDocument/didChange
● textDocument/willSave
● textDocument/didSave
● textDocument/didClose
● workspace/applyEdit (this one goes the other way: the server asks the editor to apply an edit)
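On the server side, `textDocument/didChange` means keeping an in-memory copy of the document up to date. The sketch below applies a list of content changes in order, where a change with a `range` replaces that span and a change without one replaces the whole text. `offsetAt` and `applyChanges` are hypothetical helpers, and the code assumes `\n` line endings and UTF-16 code units for `character`.

```typescript
interface LspPosition { line: number; character: number; }
interface LspRange { start: LspPosition; end: LspPosition; }
// A change with a range replaces that span; without a range it replaces the whole document.
interface ContentChange { range?: LspRange; text: string; }

// Convert a line/character position to a string offset.
function offsetAt(text: string, pos: LspPosition): number {
  let offset = 0;
  for (let line = 0; line < pos.line; line++) {
    const next = text.indexOf("\n", offset);
    if (next === -1) return text.length; // position is past the last line
    offset = next + 1;
  }
  return Math.min(offset + pos.character, text.length);
}

// Apply changes one after another; each change sees the result of the previous one.
function applyChanges(text: string, changes: ContentChange[]): string {
  for (const change of changes) {
    if (change.range === undefined) {
      text = change.text;
    } else {
      const start = offsetAt(text, change.range.start);
      const end = offsetAt(text, change.range.end);
      text = text.slice(0, start) + change.text + text.slice(end);
    }
  }
  return text;
}
```

A server that only supports full-document sync never sees a `range`, so it only needs the first branch.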
... is it really time to implement my own LS?
LSP is a good fit for...
● simple to mid-complexity languages
  ○ for languages that are too simple you wouldn't even need any editor support.
  ○ if you only need syntax highlighting, just use tmLanguage.
● it should be noted that LSP covers only a common subset of what various languages need.
  ○ "language servers and IDEs" - a critical post on LSP (rust rls/KDevelop)
Dive into VSCode sources
This time we only track vscode (because it is more or less the reference implementation).
● Microsoft/vscode on github.
● Language support is usually packaged as a "language extension".
Language extension types
● Extensions without language service (ignorable)
  ○ e.g. extension/css (has `css.tmLanguage.json`, which works without LS)
● Extensions with language service
  ○ Direct implementation - uses only vscode API
  ○ LSP implementation - uses LS (`vscode-languageserver` package)
    ■ e.g. extension/css-language-features
● Language Extension Guidelines - very nice documentation about them.
How vscode LS extensions are organized
● Three parts: editor (host), language client, and language server. The server can be out-of-process.
● VSCode Language Feature Extension
  ○ "client" entrypoint (JS): activate() { ... }
  ○ package.json: "main": "src/out/entrypoint"
● "server" implementation: listens and notifies on stdio/ipc
node packages to implement LS
● take a look at `src/extension/css-language-features` (as an example)
  ○ `client` // extension entrypoint (`activate()`), launches server
  ○ `server` // actual LS, typically a standalone executable
    ■ import "vscode-languageserver" // LSP for node.js
      [repo] Microsoft/vscode-languageserver-node
    ■ import "vscode-css-languageservice" // actual CSS parser/compiler
      [repo] Microsoft/vscode-css-languageservice
● The server can be implemented in any language. You can use stdio.
● Useful example: [repo] Microsoft/vscode-languageserver-node-example
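Whatever language the server is written in, its core is a dispatcher that maps incoming JSON-RPC methods to handlers. The following is a dependency-free sketch of that idea and not how `vscode-languageserver` is implemented; `handlers` and `handleMessage` are hypothetical names, and the capabilities returned are illustrative. Requests (messages with an `id`) get a response, while notifications (no `id`) never do.

```typescript
interface IncomingMessage { jsonrpc: "2.0"; id?: number | string; method: string; params?: unknown; }
interface ResponseMessage {
  jsonrpc: "2.0";
  id: number | string;
  result?: unknown;
  error?: { code: number; message: string };
}

type Handler = (params: unknown) => unknown;

// Hypothetical handler table; a real server would register many more methods.
const handlers: { [method: string]: Handler } = {
  initialize: () => ({
    capabilities: {
      textDocumentSync: 1, // 1 = full document sync
      completionProvider: { triggerCharacters: ["."] },
    },
  }),
  shutdown: () => null,
};

function handleMessage(message: IncomingMessage): ResponseMessage | undefined {
  // Look up own properties only, so a method named "toString" is not treated as a handler.
  const known = Object.prototype.hasOwnProperty.call(handlers, message.method);
  const handler: Handler | undefined = known ? handlers[message.method] : undefined;
  if (message.id === undefined) {
    // Notification: run the handler if there is one, never reply.
    if (handler !== undefined) handler(message.params);
    return undefined;
  }
  if (handler === undefined) {
    // -32601 is the JSON-RPC "method not found" error code.
    return {
      jsonrpc: "2.0",
      id: message.id,
      error: { code: -32601, message: "Method not found: " + message.method },
    };
  }
  return { jsonrpc: "2.0", id: message.id, result: handler(message.params) };
}
```

Reading framed messages from stdin and writing the responses to stdout around this function is all that "You can use stdio" requires.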
Implementing language services
● You are going to either implement a compiler, or make changes to an existing one
● Compilation phases
  ○ parsing: build syntax trees
    ■ folding parser (#if etc.)
    ■ tokenizer/lexer and parser (syntax errors)
  ○ semantic analysis: build semantic trees
    ■ symbol resolution (check unknown identifiers)
    ■ validation (inheritance, type checking)
  ○ code generation - rarely relevant to language services
Tips on implementing a language server
● Make your parser always record precise locations, for both the start and the end of each token.
● Don't try too hard; start with whatever you can implement.
  ○ e.g. compile everything whenever a language feature is requested.
  ○ e.g. compile and update symbol locations only when sources are saved.
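As an illustration of the first tip, here is a sketch of a toy tokenizer that records both the start and the end position of every token, using the zero-based line/character convention of LSP. The token rules (identifiers, digit runs, single punctuation characters) are invented for this example and do not describe any real language.

```typescript
interface LspPosition { line: number; character: number; }
interface Token { text: string; start: LspPosition; end: LspPosition; }

// Hypothetical tokenizer: identifiers and digit runs become one token each,
// any other non-space character becomes a token of its own.
function tokenize(source: string): Token[] {
  const tokens: Token[] = [];
  const pattern = /[A-Za-z_][A-Za-z0-9_]*|[0-9]+|[^\s]/g;
  source.split("\n").forEach((lineText, line) => {
    pattern.lastIndex = 0; // restart matching for each line
    let match: RegExpExecArray | null;
    while ((match = pattern.exec(lineText)) !== null) {
      tokens.push({
        text: match[0],
        start: { line, character: match.index },
        end: { line, character: match.index + match[0].length }, // exclusive end
      });
    }
  });
  return tokens;
}
```

Having both ends on every token is what later lets the server answer "which token is under the cursor?" and report diagnostics as ranges instead of single points.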
Tips on implementing a language server
● EASY:
  ○ `textDocument/publishDiagnostics`: just compile and get errors, then report them.
  ○ `textDocument/documentSymbol`: get the source code tree (AST) and return the locations.
● HARD:
  ○ `definition`: need to get the AST, find where the "current token" is, resolve the semantic tree, then search the semantic tree for the symbol by identifier.
  ○ `completion`: similarly get the semantic node from the "current token", then return the list of available members depending on context (e.g. static or instance).
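The `definition` steps can be sketched in a heavily simplified form: find the token under the cursor, then look its identifier up in a symbol table. This sketch assumes a flat, scope-less table keyed by name, which a real semantic tree would replace; `findDefinition`, `containsPosition`, and `comparePositions` are hypothetical helpers.

```typescript
interface LspPosition { line: number; character: number; }
interface LspRange { start: LspPosition; end: LspPosition; }
interface Token { text: string; range: LspRange; }

// Negative if a is before b, zero if equal, positive if after.
function comparePositions(a: LspPosition, b: LspPosition): number {
  return a.line !== b.line ? a.line - b.line : a.character - b.character;
}

// The end is treated as inclusive so a cursor right after an identifier still hits it.
function containsPosition(range: LspRange, pos: LspPosition): boolean {
  return comparePositions(range.start, pos) <= 0 && comparePositions(pos, range.end) <= 0;
}

// Flat symbol table (an assumption of this sketch): identifier -> range of its declaration.
type Declarations = { [identifier: string]: LspRange };

function findDefinition(
  tokens: Token[],
  declarations: Declarations,
  cursor: LspPosition
): LspRange | undefined {
  for (const token of tokens) {
    if (containsPosition(token.range, cursor)) {
      const declared = Object.prototype.hasOwnProperty.call(declarations, token.text);
      return declared ? declarations[token.text] : undefined;
    }
  }
  return undefined; // cursor is not on any token
}

// Example for the two-line source "let total = 1" / "print(total)".
const totalDeclaration: LspRange = { start: { line: 0, character: 4 }, end: { line: 0, character: 9 } };
const exampleTokens: Token[] = [
  { text: "total", range: totalDeclaration },
  { text: "print", range: { start: { line: 1, character: 0 }, end: { line: 1, character: 5 } } },
  { text: "total", range: { start: { line: 1, character: 6 }, end: { line: 1, character: 11 } } },
];
const exampleDeclarations: Declarations = { total: totalDeclaration };
```

`completion` follows the same first step, but instead of looking up one identifier it has to enumerate everything visible at that point, which is why scoping and member resolution make it harder.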
Summary
● LSP is good for editor developers, language developers, and their users.
● LSP provides various features, e.g. completion, definition, references...
  ○ Yet the feature set is limited to common usages.
● You can implement an LS in any language and wrap it in a language extension.
  ○ You can use stdio.