Slide 1

Slide 1 text

1 The New Elasticsearch .NET Client: Getting Started and Behind the Scenes
Steve Gordon (Senior Engineer @ Elastic)
@stevejgordon | stevejgordon.co.uk

Slide 2

Slide 2 text

2 Agenda
• Introduction to Elasticsearch
• .NET Client for Elasticsearch
  ‒ Problems with the existing client
  ‒ Introducing the new v8 client
• Demos
• Behind the scenes
  ‒ Building the Elasticsearch specification
  ‒ Code generation of the new .NET client

Slide 3

Slide 3 text

We build search solutions on a single stack: Enterprise Search, Observability, Security

Slide 4

Slide 4 text

4 Elastic Stack
• SOLUTIONS
• Kibana: Visualize & Manage
• Elasticsearch: Store, Search, & Analyze
• Beats, Logstash, Elastic Agent: Ingest
• SaaS: Elastic Cloud
• On-Prem: Elastic Cloud Enterprise, Standalone, Elastic Cloud on Kubernetes

Slide 5

Slide 5 text

5 CLOUD DEMO

Slide 6

Slide 6 text

6 Elasticsearch Terminology

Slide 7

Slide 7 text

7 Basic Terminology
CLUSTER: A collection of one or more nodes (servers) that together hold your data and provide federated indexing and search capabilities across all nodes.

Slide 8

Slide 8 text

8 Basic Terminology
NODE: A single server that is part of your cluster, stores your data, and participates in the cluster's indexing and search capabilities.
Diagram: CLUSTER with NODE 1, NODE 2, NODE 3

Slide 9

Slide 9 text

9 Basic Terminology
INDEX: A collection of documents that have somewhat similar characteristics.
Diagram: CLUSTER with NODE 1, NODE 2, NODE 3 and an INDEX

Slide 10

Slide 10 text

10 Basic Terminology
SHARD: Elasticsearch provides the ability to subdivide your index into multiple pieces called shards.
Diagram: CLUSTER with NODE 1, NODE 2, NODE 3 and an INDEX divided into primary shards P1, P2, P3 and replica shards R1, R2, R3

Slide 12

Slide 12 text

12 Basic Terminology
DOCUMENT (DOC): The basic unit of information that can be indexed in JSON form.
Diagram: CLUSTER with NODE 1, NODE 2, NODE 3 and an INDEX divided into primary shards P1, P2, P3 and replica shards R1, R2, R3, each shard holding DOC entries

Slide 13

Slide 13 text

13 HTTP Interface

Slide 14

Slide 14 text

14 The Elasticsearch API in Numbers
• > 400 API endpoints
• > 2000 data structures
  ‒ 50 query types
  ‒ 70 aggregation types
  ‒ 30 field types

Slide 15

Slide 15 text

15 Language Clients
• .NET
• Java
• JavaScript
• Ruby
• Go
• PHP
• Perl
• Python
• Rust

Slide 16

Slide 16 text

16 Existing Elasticsearch .NET Client (7.x)
NEST: High Level Client
• Methods for every API
• Strongly-typed requests & responses
• Aggregations
• Mappings
• Query DSL
• Fluent syntax
• Helpers
Elasticsearch.Net: Low Level Client
• Dependency free, unopinionated client
• Handles transport
• Client-side load balancing
• Request parameters (query string)
• Serialisation
• API URL resolution

Slide 17

Slide 17 text

17 Architecture of Existing .NET Client
NEST → Elasticsearch.Net → HttpWebRequest or HttpClient → (HTTP) → Elasticsearch Server

Slide 18

Slide 18 text

18 Problems with the Existing Client
• Hand written
  ‒ API is not always consistent
  ‒ A lot of maintenance work (400 endpoints and thousands of types!)
• Legacy internalised JSON serialiser based on Utf8Json
• ~12 years of historical decisions

Slide 19

Slide 19 text

19

Slide 20

Slide 20 text

20 Introducing Elastic.Clients.Elasticsearch
The new .NET client for v8.0
• A new generation of the Elasticsearch client
• Code generated
  ‒ Based on a formal specification of the Elasticsearch API
• Uses the System.Text.Json serializer
• Built on a common Elastic.Transport layer
• Removes some of the legacy of the past to create a cleaner API

Slide 21

Slide 21 text

21 Architecture of New .NET Client
Elastic.Clients.Elasticsearch (and other clients) → Elastic.Transport → HttpWebRequest or HttpClient → (HTTP) → Elasticsearch Server

Slide 22

Slide 22 text

22 DEMOS

Slide 23

Slide 23 text

23 Building the Elasticsearch API specification

Slide 24

Slide 24 text

24 REST API specification → OpenAPI?
OpenAPI is too limited
• Elasticsearch API is complex and not “canonical”
• Would require custom extensions
• Our problem is mostly about data structures, not so much URLs
OpenAPI is complex
• “The Schema Object is a superset of the JSON Schema Specification Draft 2020-12” 😱
• 400 endpoints, 2000 structures… in YAML/JSON 😓

Slide 25

Slide 25 text

25 JSON API Specification → TypeScript!
TypeScript’s type system is built to represent JSON/JS
• Static type checking of the API
• Strong IDE support
• ts-morph: a library to build TS code processors
  ‒ Setup, navigation, and manipulation of the TypeScript AST can be a challenge. This library wraps the TypeScript compiler API so it's simple.

Slide 26

Slide 26 text

27 Example: Search Request
/**
 * @rest_spec_name search
 * @since 0.0.0
 * @stability stable
 */
export interface Request extends RequestBase {
  path_parts: {
    index?: Indices
  }
  query_parameters: {
    allow_no_indices?: Boolean
    ...
    size?: integer
    from?: integer
    sort?: string | string[]
  }
  body: {
    /** @aliases aggs */ // ES uses "aggregations" in serialization
    aggregations?: Dictionary
    collapse?: FieldCollapse
    /**
     * If true, returns detailed information about score computation as part of a hit.
     * @server_default false
     */
    explain?: boolean
    /** @server_default 0 */
    from?: integer
    ...
}
export type IndexName = string
export type Indices = IndexName | IndexName[]
Slide callouts: Meta information, Alias tag, Documentation comment
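
One way to read the three sections of the request definition above: path_parts fill the URL path, query_parameters become the query string, and body is sent as JSON. The sketch below illustrates that reading only. `SearchRequestSketch`, `HttpCall`, and `toHttp` are hypothetical names that do not come from the specification or the client, and only a handful of the properties on the slide are modelled.

```typescript
// Hypothetical, cut-down version of the search request shape shown above.
type IndexName = string;
type Indices = IndexName | IndexName[];

interface SearchRequestSketch {
  path_parts: { index?: Indices };
  query_parameters: { allow_no_indices?: boolean; size?: number; from?: number };
  body: { explain?: boolean; from?: number };
}

interface HttpCall {
  method: string;
  url: string;
  body: string;
}

// Map the three sections of the request onto an HTTP call.
function toHttp(req: SearchRequestSketch): HttpCall {
  const index = req.path_parts.index;
  // Several indices are sent as a comma-separated list in the path.
  const indexPart =
    index === undefined
      ? ""
      : "/" + (Array.isArray(index) ? index : [index]).map((item) => encodeURIComponent(item)).join(",");

  // Only query parameters that were actually set end up in the query string.
  const params: { [key: string]: string | number | boolean | undefined } = req.query_parameters;
  const queryString = Object.keys(params)
    .filter((key) => params[key] !== undefined)
    .map((key) => `${key}=${encodeURIComponent(String(params[key]))}`)
    .join("&");

  return {
    method: "POST",
    url: `${indexPart}/_search${queryString ? "?" + queryString : ""}`,
    body: JSON.stringify(req.body),
  };
}

const call = toHttp({
  path_parts: { index: ["logs", "metrics"] },
  query_parameters: { size: 10 },
  body: { explain: true },
});
```

Here `call.url` is `/logs,metrics/_search?size=10` and `call.body` is `{"explain":true}`. Note that `from` appears in both the query parameters and the body on the slide, so a real client has to decide which one it sends.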

Slide 27

Slide 27 text

28 Example: Search Response
export class Response {
  body: ResponseBody
}
export class ResponseBody {
  took: long
  timed_out: boolean
  _shards: ShardStatistics
  hits: HitsMetadata
  aggregations?: Dictionary
  _clusters?: ClusterStatistics
  fields?: Dictionary
  max_score?: double
  num_reduce_phases?: long
  profile?: Profile
  pit_id?: Id
  _scroll_id?: ScrollId
  suggest?: Dictionary[]>
  terminated_early?: boolean
}
Slide callout: User-provided type

Slide 28

Slide 28 text

29 Example: Search Response
export class Response {
  body: ResponseBody
}
export class ResponseBody {
  took: long
  timed_out: boolean
  _shards: ShardStatistics
  hits: HitsMetadata
  aggregations?: Dictionary
  _clusters?: ClusterStatistics
  fields?: Dictionary
  max_score?: double
  num_reduce_phases?: long
  profile?: Profile
  pit_id?: Id
  _scroll_id?: ScrollId
  suggest?: Dictionary[]>
  terminated_early?: boolean
}
export class HitsMetadata {
  total?: TotalHits | long
  hits: Hit[]
  max_score?: double | null
}
export class HitMetadata {
  _id: Id
  _index: IndexName
  _primary_term: long
  _routing: string
  _seq_no: SequenceNumber
  _source: TDocument
  _version: VersionNumber
}

Slide 29

Slide 29 text

30 Example: Query
/**
 * @variants container
 * @non_exhaustive
 * @doc_id query-dsl
 */
export class QueryContainer {
  bool?: BoolQuery
  boosting?: BoostingQuery
  /** @deprecated 7.3.0 */
  common?: SingleKeyDictionary
  /** @since 7.13.0 */
  combined_fields?: CombinedFieldsQuery
  constant_score?: ConstantScoreQuery
  dis_max?: DisMaxQuery
  distance_feature?: DistanceFeatureQuery
  exists?: ExistsQuery
  function_score?: FunctionScoreQuery
  fuzzy?: SingleKeyDictionary
  geo_bounding_box?: GeoBoundingBoxQuery
  geo_distance?: GeoDistanceQuery
  geo_polygon?: GeoPolygonQuery
  ...
Slide callouts:
• Container variant is used for types that contain all the variants inside the definition
• Properties can be tagged as deprecated since a particular version
• We also track versions where new properties have been added
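
A small sketch of what a container variant means for consumers of the type: every variant is an optional property, and the property that is set identifies the variant. The types below are hypothetical, heavily reduced stand-ins for the spec types on the slide. The helper `variantOf` is not part of the specification, and the rule that exactly one variant is set per container is an assumption made for this illustration.

```typescript
// Hypothetical, heavily reduced stand-ins for the spec types on the slide.
interface ExistsQuerySketch {
  field: string;
}

interface BoolQuerySketch {
  must?: QueryContainerSketch[];
  filter?: QueryContainerSketch[];
}

// Container variant: every variant is an optional property of one type.
interface QueryContainerSketch {
  bool?: BoolQuerySketch;
  exists?: ExistsQuerySketch;
}

// Return the name of the single variant that is set.
// Assumption for this sketch: exactly one variant per container instance.
function variantOf(container: QueryContainerSketch): string {
  const bag: { [key: string]: unknown } = { ...container };
  const populated = Object.keys(bag).filter((key) => bag[key] !== undefined);
  if (populated.length !== 1) {
    throw new Error(`expected exactly one variant, found ${populated.length}`);
  }
  return populated[0];
}

// A bool query whose filter clause is itself a container holding an exists query.
const boolQuery: QueryContainerSketch = {
  bool: { filter: [{ exists: { field: "user.id" } }] },
};
const variantName = variantOf(boolQuery);
```

A code generator can use the same information in the other direction, for example by emitting one factory method or one subtype per variant property.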

Slide 30

Slide 30 text

31 Validating the Specification
• Piggy-back on Elasticsearch integration tests
  ‒ Capture request and response JSON
  ‒ Does it fit in the corresponding TS type?
  ‒ > 5400 validation tests!
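
The idea of checking captured JSON against a type can be sketched as follows. This is not the specification project's validation tooling: it swaps in a hand-written runtime description (`TypeDescription`, a hypothetical name) as a stand-in for the TypeScript types, and it models only three properties of the search response body.

```typescript
// Hypothetical runtime description of an object type: property name -> rule.
type PrimitiveName = "string" | "number" | "boolean";

interface PropertyRule {
  type: PrimitiveName;
  required: boolean;
}

type TypeDescription = { [property: string]: PropertyRule };

// Reduced stand-in for part of the search ResponseBody shown earlier.
const responseBodySketch: TypeDescription = {
  took: { type: "number", required: true },
  timed_out: { type: "boolean", required: true },
  terminated_early: { type: "boolean", required: false },
};

function hasOwn(target: object, key: string): boolean {
  return Object.prototype.hasOwnProperty.call(target, key);
}

// Collect every way a captured JSON payload fails to fit the description.
function validate(json: string, description: TypeDescription): string[] {
  const parsed: unknown = JSON.parse(json);
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    return ["payload is not a JSON object"];
  }
  const payload = parsed as { [key: string]: unknown };
  const errors: string[] = [];

  for (const key of Object.keys(description)) {
    const rule = description[key];
    const value = payload[key];
    if (value === undefined) {
      if (rule.required) errors.push(`missing required property '${key}'`);
    } else if (typeof value !== rule.type) {
      errors.push(`property '${key}' should be ${rule.type}`);
    }
  }
  // Properties the description does not know about point at a gap in the spec.
  for (const key of Object.keys(payload)) {
    if (!hasOwn(description, key)) errors.push(`unknown property '${key}'`);
  }
  return errors;
}

const okErrors = validate('{"took": 3, "timed_out": false}', responseBodySketch);
const badErrors = validate('{"took": "3", "extra": 1}', responseBodySketch);
```

The first payload produces no errors. The second produces three: a wrongly typed `took`, a missing `timed_out`, and an unknown `extra` property, which is the kind of mismatch that would suggest the specification needs updating.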

Slide 31

Slide 31 text

32 Validating the Specification

Slide 32

Slide 32 text

33 Code Generation

Slide 33

Slide 33 text

34 TypeScript to Code
Generating code from the TypeScript AST
• Too low level
• Not constrained enough
Transform TypeScript to a simpler schema
• Tailor-made for Elastic’s specific needs
• Simple unambiguous meta-model

Slide 34

Slide 34 text

35 Code Generation Pipeline
• specification.ts (TypeScript: API request & response bodies) → Spec compiler → schema.json (Endpoints, Request & response bodies + Rich annotations)
• schema.json → .NET Code Generator → .NET Client
• schema.json → More Code Generators → Java, Go, JS, Python, Rust, Ruby, PHP clients
• schema.json → OpenAPI → OpenAPI API Docs
• schema.json → Even more generators
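
To make the pipeline concrete, here is a minimal sketch of a generator that reads a schema entry and emits a C# class. Everything in it is an assumption for illustration: the `TypeSketch` shape is a hypothetical miniature and not the real schema.json format, the primitive mapping is invented, and the output is written as plain text, whereas the actual .NET generator builds a Roslyn AST.

```typescript
// Hypothetical miniature of a schema entry; the real schema.json is far richer.
interface PropertySketch {
  name: string;
  type: "string" | "integer" | "boolean";
  required: boolean;
}

interface TypeSketch {
  name: string;
  properties: PropertySketch[];
}

// Assumed mapping from schema primitives to C# type names.
const csharpTypes: { [schemaType: string]: string } = {
  string: "string",
  integer: "int",
  boolean: "bool",
};

// snake_case -> PascalCase, e.g. "timed_out" -> "TimedOut".
function pascalCase(snake: string): string {
  return snake
    .split("_")
    .filter((part) => part.length > 0)
    .map((part) => part.charAt(0).toUpperCase() + part.slice(1))
    .join("");
}

// Emit a C# class as plain text, one attributed property per schema property.
function emitClass(type: TypeSketch): string {
  const lines = [`public sealed partial class ${type.name}`, "{"];
  for (const property of type.properties) {
    const optional = property.required ? "" : "?";
    lines.push(`    [JsonPropertyName("${property.name}")]`);
    lines.push(`    public ${csharpTypes[property.type]}${optional} ${pascalCase(property.name)} { get; init; }`);
  }
  lines.push("}");
  return lines.join("\n");
}

// Step 1 of the process: deserialise the (hypothetical) schema JSON.
const schemaJson =
  '{"name":"ResponseBodySketch","properties":[{"name":"timed_out","type":"boolean","required":true}]}';
const generated = emitClass(JSON.parse(schemaJson) as TypeSketch);
```

The value of `generated` is a four-line class containing `public bool TimedOut { get; init; }` with a `JsonPropertyName("timed_out")` attribute. Keeping the meta-model this small is what lets each language generator stay simple.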

Slide 35

Slide 35 text

36 .NET Code Generator Process
Deserialise JSON → Build contexts → Mark and enrich contexts → Build Roslyn AST → Write .cs files

Slide 36

Slide 36 text

37 .NET Code Generator: Marking and Enrichment
• Establish naming and namespaces for generated types
• Walk type hierarchy
• Identify relationships
• Mark request types
• Mark containers and variants
• Simplify type aliases to built-in types
• Determine which types require which descriptors
• Mark specialised serialisation needs (Bulk etc.)
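
One of the enrichment steps above, simplifying type aliases to built-in types, can be sketched as a lookup that follows alias chains. The alias table and the list of built-ins below are hypothetical. Only `IndexName = string` appears on an earlier slide, and the other entries are invented to show a chain. `resolveAlias` is not a function from the generator.

```typescript
// Hypothetical alias table in the spirit of `export type IndexName = string`.
const aliases: { [alias: string]: string } = {
  IndexName: "string",
  Id: "string",
  ScrollId: "Id", // invented chain: ScrollId -> Id -> string
  VersionNumber: "long",
};

// Assumed set of names the generator treats as built-in.
const builtIns = ["string", "long", "double", "boolean", "integer"];

function isAlias(typeName: string): boolean {
  return Object.prototype.hasOwnProperty.call(aliases, typeName);
}

// Follow alias chains until a built-in type or a non-alias name is reached.
function resolveAlias(typeName: string): string {
  const seen: string[] = [];
  let current = typeName;
  while (builtIns.indexOf(current) === -1 && isAlias(current)) {
    if (seen.indexOf(current) !== -1) {
      throw new Error(`alias cycle at '${current}'`);
    }
    seen.push(current);
    current = aliases[current];
  }
  return current;
}

const scrollIdType = resolveAlias("ScrollId");
const hitsType = resolveAlias("HitsMetadata");
```

With this table, `scrollIdType` is `string`, while `hitsType` stays `HitsMetadata` because it is not an alias and therefore needs its own generated class.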

Slide 37

Slide 37 text

38 .NET Code Generator: Build Roslyn AST
• Roslyn includes a very rich API
• Not a great deal of documentation
• roslynquoter.azurewebsites.net
https://docs.microsoft.com/en-us/dotnet/csharp/roslyn-sdk/compiler-api-model

Slide 38

Slide 38 text

39 Generator Code

Slide 39

Slide 39 text

40
// ** REQUEST PARAMETERS
var requestParametersClass = ClassDeclaration(request.RequestParametersName)
    .AddModifiers(Token(SyntaxKind.PublicKeyword), Token(SyntaxKind.SealedKeyword))
    .AddBaseListTypes(SimpleBaseType(ParseName($"RequestParameters<{request.RequestParametersName}>")))
    .AddMembers(request.QueryStringParameters.Select(a => a.QueryParameterProperty())
        .Where(p => p is not null).ToArray());
AddClass(requestParametersClass);
var (constructors, descriptorConstructors) = CreateConstructors();
// ** REQUEST CLASS
var requestClass = ClassDeclaration(request.TypeInfo.Name)
    .AddModifiers(Token(SyntaxKind.PublicKeyword), Token(SyntaxKind.PartialKeyword))
    .AddBaseListTypes(SimpleBaseType(ParseName($"PlainRequestBase<{request.RequestParametersName}>")))
    .AddMembers(constructors.ToArray())
    .AddMembers(request.GetCommonRequestProperties())
    .AddMembers(request.GenericArguments.Where(x => x.Name == "TDocument")
        .Select(p => p.GenericPropertySyntax()).Where(p => p is not null).ToArray())
    .AddMembers(request.QueryStringParameters.Select(p => p.QueryStringProperty())
        .Where(p => p is not null).ToArray())
    .AddMembers(request.BodyProperties.Select(p => p.SerializablePropertySyntax())
        .Where(p => p is not null).ToArray());
...

Slide 45

Slide 45 text

46
public static PropertyDeclarationSyntax CreateSerializableProperty(PropertyV2 property, bool selfDeserialisable = false)
{
    if (TryResolveMemberToPropertyType(property.Type, property, out var typeSyntax))
    {
        var propertyDeclaration = PropertyDeclaration(typeSyntax.TypeSyntax, Identifier(property.CodegenName))
            .AddModifiers(Token(SyntaxKind.PublicKeyword));
        propertyDeclaration = propertyDeclaration.AddAttributeLists(
            AttributeList(SingletonSeparatedList(Attribute(IdentifierName("JsonInclude")))),
            AttributeList(SingletonSeparatedList(Attribute(IdentifierName("JsonPropertyName"))
                .AddArgumentListArguments(AttributeArgument(
                    LiteralExpression(SyntaxKind.StringLiteralExpression, Literal(property.JsonName)))))));
        if (property.IsSourceProperty)
            propertyDeclaration = propertyDeclaration.AddAttributeLists(
                AttributeList(SingletonSeparatedList(Attribute(IdentifierName("SourceConverter")))));
        ...

Slide 47

Slide 47 text

48
        ...
        if (property.OwningType.UsedInRequest)
        {
            propertyDeclaration = propertyDeclaration.AddAccessorListAccessors(
                AccessorDeclaration(SyntaxKind.GetAccessorDeclaration)
                    .WithSemicolonToken(Token(SyntaxKind.SemicolonToken)),
                AccessorDeclaration(SyntaxKind.SetAccessorDeclaration)
                    .WithSemicolonToken(Token(SyntaxKind.SemicolonToken)));
        }
        else
        {
            propertyDeclaration = propertyDeclaration.AddAccessorListAccessors(
                AccessorDeclaration(SyntaxKind.GetAccessorDeclaration)
                    .WithSemicolonToken(Token(SyntaxKind.SemicolonToken)),
                AccessorDeclaration(SyntaxKind.InitAccessorDeclaration)
                    .WithSemicolonToken(Token(SyntaxKind.SemicolonToken)));
        }
        return propertyDeclaration;
    }
    return null;
}

Slide 48

Slide 48 text

49 Resources
• github.com/elastic/elasticsearch-net
• elastic.co/guide/en/elasticsearch/client/net-api
• nuget.org/packages/NEST
• nuget.org/packages/Elastic.Clients.Elasticsearch
• github.com/stevejgordon/elasticsearch-examples
• github.com/elastic/elasticsearch-specification
• discuss.elastic.co

Slide 49

Slide 49 text

50 Thank you! @stevejgordon