Slide 1

Slide 1 text

Writing Code with Code
Getting Started with the Roslyn APIs
Steve Gordon (Engineer @ Elastic)
@stevejgordon | fosstodon.org/@stevejgordon | stevejgordon.co.uk
bit.ly/writing-code-with-code

Slide 2

Slide 2 text

Agenda
• Introduction to Roslyn
• Roslyn API demos
  ‒ Visualising syntax trees
  ‒ Generating C# code from syntax trees
• Code-generating the Elasticsearch .NET client
  ‒ Creating a specification/schema
  ‒ Transforming a spec to a strongly-typed language
  ‒ Building syntax trees
  ‒ Emitting C# files
  ‒ Future enhancements (lessons learned)

Slide 3

Slide 3 text

Roslyn (.NET Compiler Platform SDK)

Slide 4

Slide 4 text

What is Roslyn?
• Open-source, "open box" compilers for C# and VB.NET
• Compiler platform
• Used heavily to provide Visual Studio IDE capabilities
  ‒ Maker of squiggles!!
  ‒ Finder of things!!

Slide 5

Slide 5 text

Analyzers and Code Fixes
• An analyzer contains code that recognizes violations of a rule
• Rules can relate to code structure, coding style, naming conventions etc.
• A code fix contains the code that fixes the violation

Slide 6

Slide 6 text

Demo (Analyzers)

Slide 7

Slide 7 text

Source Generators
• C# compiler feature that lets developers inspect user code as it is being compiled
• Develop components which run during compilation with access to rich metadata
• Can create new C# source files on the fly that are added to a compilation

Slide 8

Slide 8 text

Compiler Flow
• C#, VB.NET and F# compile to IL
• At runtime, IL code is Just-In-Time (JIT) compiled to machine code
• AoT / Native compilation (No JIT required)
https://docs.microsoft.com/en-us/dotnet/csharp/roslyn-sdk/compiler-api-model

Slide 9

Slide 9 text

Roslyn APIs
• Roslyn includes a very rich compiler API
https://docs.microsoft.com/en-us/dotnet/csharp/roslyn-sdk/compiler-api-model

Slide 11

Slide 11 text

Syntax Trees
• The result of the syntax analysis phase of a compiler
• Tree representation of syntactic structure of source code
  ‒ Nodes, Tokens and Trivia
• Interact with source code on a deeply meaningful level. It's no longer text strings, but data that represents the structure of a program
• Immutable and round-trippable data structure exposed by the compiler APIs
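To make "round-trippable" and "immutable" concrete, here is a toy model of tokens and trivia. It is only a sketch: `ToyToken`, `toFullString` and `renameIdentifier` are hypothetical names, and Roslyn's real node, token and trivia types are far richer than this.

```typescript
// Toy model only; not Roslyn's API.
interface ToyToken {
  readonly leading: string   // trivia before the token (whitespace, comments)
  readonly text: string      // the token itself
  readonly trailing: string  // trivia after the token, up to the end of the line
}

// Hand-tokenised form of the statement: int x = 1; // answer
const tokens: readonly ToyToken[] = [
  { leading: "", text: "int", trailing: " " },
  { leading: "", text: "x", trailing: " " },
  { leading: "", text: "=", trailing: " " },
  { leading: "", text: "1", trailing: "" },
  { leading: "", text: ";", trailing: " // answer" },
]

// Round trip: concatenating trivia and token text reproduces the source exactly.
function toFullString(list: readonly ToyToken[]): string {
  return list.map(t => t.leading + t.text + t.trailing).join("")
}

// Immutability: an "edit" builds a new list and leaves the original untouched.
function renameIdentifier(list: readonly ToyToken[], from: string, to: string): ToyToken[] {
  return list.map(t => (t.text === from ? { ...t, text: to } : t))
}

const renamed = renameIdentifier(tokens, "x", "y")
```

Because whitespace and comments are kept as trivia rather than thrown away, `toFullString(tokens)` gives back the original text, and `renamed` is a separate list while `tokens` is unchanged.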

Slide 12

Slide 12 text

IDE Tooling (Visualising Syntax Trees)

Slide 14

Slide 14 text

Demo (Visualising Syntax Trees)

Slide 15

Slide 15 text

Elasticsearch Language Client Code Generation

Slide 16

Slide 16 text

The Elasticsearch API in Numbers
• > 400 API endpoints
• > 2000 data structures
  ‒ 50 query types
  ‒ 70 aggregation types
  ‒ 30 field types

Slide 17

Slide 17 text

Problems with the v7 Client
• Hand-written
  ‒ API is not always consistent
  ‒ A lot of maintenance work (400 endpoints and thousands of types!)

Slide 18

Slide 18 text

Elastic.Clients.Elasticsearch
The new .NET client for v8.0
• A new generation of the Elasticsearch client
• Code generated
  ‒ Based on a formal specification of the Elasticsearch API

Slide 19

Slide 19 text

Options for Code Generation
• Strings/StringBuilder
• Templates (T4, Razor etc.)
• Roslyn APIs
  ‒ Construct syntax tree and produce C#

Slide 20

Slide 20 text

Demo (Generating C#)

Slide 21

Slide 21 text

Demo (Roslyn Quoter)

Slide 22

Slide 22 text

Generating the .NET Client

Slide 23

Slide 23 text

Building the Elasticsearch API specification

Slide 24

Slide 24 text

REST API specification → OpenAPI?
OpenAPI is too limited
• Elasticsearch API is complex and not “canonical”
• Would require custom extensions
• Our problem is mostly about data structures, not so much URLs
OpenAPI is complex
• “The Schema Object is a superset of the JSON Schema Specification Draft 2020-12” 😱
• 400 endpoints, 2000 structures… in YAML/JSON 😓

Slide 25

Slide 25 text

JSON API Specification → TypeScript!
TypeScript’s type system is built to represent JSON/JS
• Static type checking of the API
• Strong IDE support
• ts-morph: a library to build TS code processors
  ‒ Setup, navigation, and manipulation of the TypeScript AST can be a challenge. This library wraps the TypeScript compiler API so it's simple.

Slide 26

Slide 26 text

Example: Search Request

/**
 * @rest_spec_name search
 * @since 0.0.0
 * @stability stable
 */
export interface Request extends RequestBase {
  path_parts: {
    index?: Indices
  }
  query_parameters: {
    allow_no_indices?: Boolean
    ...
    size?: integer
    from?: integer
    sort?: string | string[]
  }
  ...

export type IndexName = string
export type Indices = IndexName | IndexName[]

Callout: Meta information
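The `Indices` alias on this slide is a union: one index name or an array of them. As a minimal sketch of what that means for code consuming the type, the helpers below collapse the union to an array and join it into the comma-separated form Elasticsearch accepts in the request path. `normalizeIndices` and `toPathSegment` are hypothetical names, not part of the specification or any client.

```typescript
// Type aliases copied from the slide.
type IndexName = string
type Indices = IndexName | IndexName[]

// Hypothetical helper: collapse the union to a single shape.
function normalizeIndices(indices: Indices): IndexName[] {
  return Array.isArray(indices) ? indices : [indices]
}

// Hypothetical helper: build the comma-separated path segment,
// e.g. the "logs,metrics" in /logs,metrics/_search.
function toPathSegment(indices: Indices): string {
  return normalizeIndices(indices)
    .map(name => encodeURIComponent(name))
    .join(",")
}
```

A generator targeting a language without union types has to make a similar choice once per union, which is part of why the specification records the union explicitly instead of picking one shape.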

Slide 27

Slide 27 text

Example: Search Request

  ...
  body: {
    /** @aliases aggs */
    aggregations?: Dictionary
    collapse?: FieldCollapse
    /**
     * If true, returns detailed information about score computation as part of a hit.
     * @server_default false
     */
    explain?: boolean
    from?: integer
    ...
  }

Callouts: Alias tag; Documentation comment

Slide 28

Slide 28 text

Example: Search Response

export class Response {
  body: ResponseBody
}

export class ResponseBody {
  took: long
  timed_out: boolean
  _shards: ShardStatistics
  hits: HitsMetadata
  aggregations?: Dictionary
  _clusters?: ClusterStatistics
  fields?: Dictionary
  max_score?: double
  num_reduce_phases?: long
  profile?: Profile
  ...
}

Callout: User-provided type

Slide 29

Slide 29 text

Example: Search Response

export class Response {
  body: ResponseBody
}

export class ResponseBody {
  took: long
  timed_out: boolean
  _shards: ShardStatistics
  hits: HitsMetadata
  aggregations?: Dictionary
  _clusters?: ClusterStatistics
  fields?: Dictionary
  max_score?: double
  num_reduce_phases?: long
  profile?: Profile
  ...
}

export class HitsMetadata {
  total?: TotalHits | long
  hits: Hit[]
  max_score?: double | null
}

export class Hit {
  _index: IndexName
  _id: Id
  _score?: double | null
  _explanation?: Explanation
  ...
  _source: TDocument
  _seq_no?: SequenceNumber
  _version?: VersionNumber
}

Slide 32

Slide 32 text

Example: Query

/**
 * @variants container
 * @non_exhaustive
 * @doc_id query-dsl
 */
export class QueryContainer {
  bool?: BoolQuery
  boosting?: BoostingQuery
  /** @deprecated 7.3.0 */
  common?: SingleKeyDictionary
  /** @since 7.13.0 */
  combined_fields?: CombinedFieldsQuery
  constant_score?: ConstantScoreQuery
  dis_max?: DisMaxQuery
  distance_feature?: DistanceFeatureQuery
  exists?: ExistsQuery
  function_score?: FunctionScoreQuery
  ...

Callout: Container variant is used for types that contain all the variants inside the definition
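A container variant can be pictured as an object on which every variant is an optional property. The sketch below uses two cut-down stand-ins for the query types on the slide and assumes that exactly one variant is populated at a time. `variantOf` is a hypothetical helper, and the `field` and `must` members are simplifications.

```typescript
// Simplified stand-ins for two of the query types named on the slide.
interface ExistsQuery { field: string }
interface BoolQuery { must?: QueryContainer[] }

// Container variant: each variant is an optional property of the same type.
interface QueryContainer {
  bool?: BoolQuery
  exists?: ExistsQuery
}

// Hypothetical helper: report which variant is set,
// assuming exactly one may be populated.
function variantOf(container: QueryContainer): keyof QueryContainer {
  const keys = (Object.keys(container) as (keyof QueryContainer)[])
    .filter(key => container[key] !== undefined)
  const [first, ...rest] = keys
  if (first === undefined || rest.length > 0) {
    throw new Error(`expected exactly one variant, found ${keys.length}`)
  }
  return first
}

const query: QueryContainer = { bool: { must: [{ exists: { field: "user" } }] } }
const kind = variantOf(query)
```

Here `kind` is `"bool"`. A code generator can use the `@variants container` tag to emit whatever its target language offers for "one of many", rather than a class with dozens of nullable properties.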

Slide 33

Slide 33 text

Validating the Specification
• Piggy-back on Elasticsearch integration tests
  ‒ Capture request and response JSON
  ‒ Does it fit in the corresponding TS type?
  ‒ > 5400 validation tests!
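The slides do not show how the "does it fit?" check is implemented. As a rough illustration of the idea only, the sketch below compares a captured JSON object against a hand-written description of the expected fields. `FieldSpec`, `Shape` and `validate` are hypothetical, and the shape covers just three fields of the search response body.

```typescript
// Hypothetical, simplified description of a type's fields;
// not the project's real validator.
type FieldKind = "string" | "number" | "boolean"
interface FieldSpec { kind: FieldKind; optional?: boolean }
type Shape = Record<string, FieldSpec>

// Return the list of mismatches between captured JSON and the expected shape.
function validate(json: Record<string, unknown>, shape: Shape): string[] {
  const errors: string[] = []
  for (const field of Object.keys(shape)) {
    const spec = shape[field]
    if (spec === undefined) continue
    const value = json[field]
    if (value === undefined) {
      if (!spec.optional) errors.push(`missing required property '${field}'`)
    } else if (typeof value !== spec.kind) {
      errors.push(`property '${field}' should be ${spec.kind}, got ${typeof value}`)
    }
  }
  // Properties the specification does not know about are also reported.
  for (const field of Object.keys(json)) {
    if (!(field in shape)) errors.push(`unknown property '${field}'`)
  }
  return errors
}

// Cut-down shape for the search response body.
const responseBodyShape: Shape = {
  took: { kind: "number" },
  timed_out: { kind: "boolean" },
  max_score: { kind: "number", optional: true },
}

const captured = JSON.parse('{"took": 3, "timed_out": false, "extra": 1}')
const problems = validate(captured, responseBodyShape)
```

For this input `problems` contains a single entry, `unknown property 'extra'`, which is the kind of finding that points at a gap in the specification rather than a bug in the server.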

Slide 34

Slide 34 text

Validating the Specification

Slide 35

Slide 35 text

Generating from the Specification

Slide 36

Slide 36 text

TypeScript to Code
Generating code from the TypeScript AST
• Too low level
• Not constrained enough
Transform TypeScript to a simpler schema
• Tailor-made for Elastic’s specific needs
• Simple unambiguous meta-model
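One way to picture a "simple unambiguous meta-model" is a small discriminated union that every generator can switch over. The sketch below is an assumption made for illustration: the `kind` values and field names are not taken from the real schema.json, and `renderTypeRef` is a hypothetical function.

```typescript
// Hypothetical meta-model; field names are illustrative only.
type TypeRef =
  | { kind: "instance_of"; name: string }
  | { kind: "array_of"; value: TypeRef }
  | { kind: "union_of"; items: TypeRef[] }

interface PropertyDef { name: string; required: boolean; type: TypeRef }

// Every TypeRef carries a `kind`, so a generator can handle each case
// explicitly instead of interpreting arbitrary TypeScript syntax.
function renderTypeRef(ref: TypeRef): string {
  switch (ref.kind) {
    case "instance_of":
      return ref.name
    case "array_of":
      return `${renderTypeRef(ref.value)}[]`
    case "union_of":
      return ref.items.map(item => renderTypeRef(item)).join(" | ")
  }
}

// `sort?: string | string[]` from the search request, expressed in the meta-model.
const sort: PropertyDef = {
  name: "sort",
  required: false,
  type: {
    kind: "union_of",
    items: [
      { kind: "instance_of", name: "string" },
      { kind: "array_of", value: { kind: "instance_of", name: "string" } },
    ],
  },
}

const rendered = renderTypeRef(sort.type)
```

`rendered` is `string | string[]`, the original declaration. A real generator would replace `renderTypeRef` with code that emits its own language's types, but it would walk the same three cases.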

Slide 37

Slide 37 text

Code Generation Pipeline
specification.ts (TypeScript: API request & response bodies)
  → Spec compiler
  → schema.json (Endpoints, request & response bodies + rich annotations)
    → .NET Code Generator → .NET Client
    → More Code Generators → Java, Go, JS, Python, Rust, Ruby, PHP clients
    → OpenAPI Generator → OpenAPI Spec → API Docs
    → Even more generators

Slide 38

Slide 38 text

schema.json Demo

Slide 39

Slide 39 text

.NET Code Generator Process
Deserialise JSON → Build contexts → Mark and enrich contexts → Build Roslyn Syntax Trees → Write .cs files

Slide 40

Slide 40 text

.NET Code Generator
Marking and Enrichment
• Establish naming and namespaces for generated types
• Walk type hierarchy
• Identify relationships
• Mark request types
• Mark containers and variants
• Simplify type aliases to built-in types
• Mark specialised serialisation needs (Bulk etc.)
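Of the steps above, "simplify type aliases to built-in types" is the easiest to show in isolation. The sketch below follows a chain of aliases until it reaches a built-in type. It is written in TypeScript to match the specification examples, whereas the generator itself is C#. The alias table, the `DocumentKey` entry and `resolveAlias` are all assumptions made for illustration; only `IndexName = string` appears in the slides.

```typescript
// Hypothetical alias table: alias name → the type it points at.
// Only IndexName comes from the slides; the rest is illustrative.
const aliases: Record<string, string> = {
  IndexName: "string",
  Id: "string",
  DocumentKey: "Id", // made-up alias of an alias, to show chain following
  SequenceNumber: "long",
}

const builtIns = ["string", "long", "integer", "double", "boolean"]

// Follow aliases until a built-in type is reached.
// Unknown names and cyclic chains are returned unchanged.
function resolveAlias(typeName: string, seen: string[] = []): string {
  if (builtIns.indexOf(typeName) >= 0) return typeName
  const target = aliases[typeName]
  if (target === undefined || seen.indexOf(typeName) >= 0) return typeName
  return resolveAlias(target, seen.concat(typeName))
}

const resolved = resolveAlias("DocumentKey")
```

`resolved` is `"string"`, so a property typed as `DocumentKey` could be emitted with the target language's string type instead of a wrapper class, while a name that is not an alias, such as `Hit`, is left alone.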

Slide 41

Slide 41 text

Generator Code

Slide 42

Slide 42 text

// ** REQUEST PARAMETERS
var requestParametersClass = ClassDeclaration(request.RequestParametersName)
    .AddModifiers(
        Token(SyntaxKind.PublicKeyword),
        Token(SyntaxKind.SealedKeyword))
    .AddBaseListTypes(
        SimpleBaseType(
            ParseName($"RequestParameters<{request.RequestParametersName}>")))
    .AddMembers(request.QueryStringParameters
        .Select(a => a.QueryParameterProperty())
        .Where(p => p is not null).ToArray());

AddClass(requestParametersClass);
...

Slide 43

Slide 43 text

var (constructors, descriptorConstructors) = CreateConstructors();

// ** REQUEST CLASS
var requestClass = ClassDeclaration(request.TypeInfo.Name)
    .AddModifiers(
        Token(SyntaxKind.PublicKeyword),
        Token(SyntaxKind.PartialKeyword))
    .AddBaseListTypes(
        SimpleBaseType(
            ParseName($"PlainRequestBase<{request.RequestParametersName}>")))
    .AddMembers(constructors.ToArray())
    .AddMembers(request.GetCommonRequestProperties())
    .AddMembers(request.GenericArguments.Where(x => x.Name == "TDocument")
        .Select(p => p.GenericPropertySyntax())
        .Where(p => p is not null).ToArray())
    .AddMembers(request.QueryStringParameters
        .Select(p => p.QueryStringProperty())
        .Where(p => p is not null).ToArray())
    .AddMembers(request.BodyProperties
        .Select(p => p.SerializablePropertySyntax())
        .Where(p => p is not null).ToArray());
...

Slide 47

Slide 47 text

public static PropertyDeclarationSyntax CreateSerializableProperty(
    Property property, bool selfDeserialisable = false)
{
    if (TryResolveMemberToPropertyType(property.Type, property, out var typeSyntax))
    {
        var propertyDeclaration = PropertyDeclaration(
                typeSyntax.TypeSyntax, Identifier(property.CodegenName))
            .AddModifiers(Token(SyntaxKind.PublicKeyword));

        propertyDeclaration = propertyDeclaration.AddAttributeLists(
            AttributeList(SingletonSeparatedList(
                Attribute(IdentifierName("JsonInclude")))),
            AttributeList(SingletonSeparatedList(
                Attribute(IdentifierName("JsonPropertyName"))
                    .AddArgumentListArguments(AttributeArgument(
                        LiteralExpression(SyntaxKind.StringLiteralExpression,
                            Literal(property.JsonName)))))));
        ...

Slide 50

Slide 50 text

if (property.OwningType.UsedInRequest)
{
    propertyDeclaration = propertyDeclaration
        .AddAccessorListAccessors(
            AccessorDeclaration(SyntaxKind.GetAccessorDeclaration)
                .WithSemicolonToken(
                    Token(SyntaxKind.SemicolonToken)),
            AccessorDeclaration(SyntaxKind.SetAccessorDeclaration)
                .WithSemicolonToken(
                    Token(SyntaxKind.SemicolonToken)));
}
else
{
    propertyDeclaration = propertyDeclaration
        .AddAccessorListAccessors(
            AccessorDeclaration(SyntaxKind.GetAccessorDeclaration)
                .WithSemicolonToken(
                    Token(SyntaxKind.SemicolonToken)),
            AccessorDeclaration(SyntaxKind.InitAccessorDeclaration)
                .WithSemicolonToken(
                    Token(SyntaxKind.SemicolonToken)));
}

return propertyDeclaration;

Slide 52

Slide 52 text

Next Steps

Slide 53

Slide 53 text

Future Plans
• Refactoring the Code Generator
  ‒ Pluggable transform pipeline (JSON input)
  ‒ Pluggable filter pipeline for endpoints (JSON input)
  ‒ Easier to configure for non-developers
  ‒ Decouple specification from syntax building (intermediate models)
• Analyse existing project via Workspace APIs
  ‒ Determine differences and breaking changes
  ‒ Check generated project compiles (in memory)

Slide 54

Slide 54 text

Resources
• bit.ly/writing-code-with-code
• github.com/stevejgordon/writing-code-with-code-demos
• learn.microsoft.com/en-us/dotnet/csharp/roslyn-sdk/
• roslynquoter.azurewebsites.net/
• github.com/elastic/elasticsearch-net
• github.com/elastic/elasticsearch-specification

Slide 55

Slide 55 text

Thank you!
@stevejgordon