Graph vs Graph

Rhys Evans
September 06, 2019

Most stories of GraphQL implementations focus on retrofitting a GraphQL layer on top of existing APIs to surface the data's graphiness. But what if you're starting from scratch and you know you'll need a graph from day 1? What role is there for GraphQL when it's not the only graph representation in your stack? I'll tell you how, at the FT, we've combined the neo4j graph database with GraphQL to expose operational information about our business like never before.

Transcript

  1. Graph vs Graph: GraphQL as the API for your graph database. Rhys Evans, @wheresrhys
  2. None
  3. None
  4. None
  5. If your data is a graph, and we’re all thinking in graphs, why is it that only one thin layer actually is a graph?
  6. Rhys Evans, Principal Engineer, Reliability Engineering, Financial Times, @wheresrhys

  7. Why graphs interest us at the FT

  8. Why we use neo4j graph database

  9. Combining GraphQL and neo4j

  10. Principles for easy(ish) graph evolution

  11. None
  12. None
  13. None
  14. Just one system

  15. None
  16. Some credentials have been leaked – could any user data have been compromised?
  17. System, Data Store, Credentials, Data Type
  18. We’ve changed our T&Cs – which affiliates do we ask to update their websites?
  19. Website, Affiliate, Smallprint
  20. Which products cost more to run than they raise in revenue?
  21. Product, Infrastructure, Revenue Stream, Budget Stream
  22. www.ft.com front page is returning a 404 – who can I contact to fix it at 3am?
  23. System, User Journey, Monitoring, Maintainer

  24. “Wir müssen wissen, wir werden wissen.” (“We must know, we will know.”) ― David Hilbert

  25. We come to understand things by constructing a model

  26. Systems, Teams, System Team Join, People, Team Person Join, System Person Join, Budget lines, Modules, Data stores, System Data Read Join, System Module Join, System Data Write Join
  27. “Recursive dependents of systems reading from a DB” = Table x Table x Table x ...
  28. All the Systems, All the System joins, All the Systems, All the System joins, All the Systems, All the System joins ...
  29. O(no)

  30. Graph databases such as neo4j optimise for data which is highly connected
  31. No global lookup by ID. Uses a local pointer to the exact physical location of the data
  32. Record, Linked list of relationships, Related records
  33. “Recursive dependents of systems reading from a DB” = record -> siblings -> cousins -> ...
  34. O(k)
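The contrast between join tables and local pointers can be illustrated with a small in-memory sketch. This is not how neo4j is implemented: plain JavaScript object references stand in for its on-disk pointers, and `makeSystem`, `addDependency`, `recursiveDependents` and the system codes are hypothetical names chosen for the example.

```javascript
// Each record holds direct references to its neighbours (a stand-in for
// neo4j's local pointers); there is no global index or join table to consult.
function makeSystem(code) {
  return { code, dependents: [] };
}

// Record that `dependent` depends on `dependency`.
function addDependency(dependent, dependency) {
  dependency.dependents.push(dependent);
}

// Collect every system that directly or transitively depends on `start`.
// The work done grows with the records actually reached, not with the
// total number of records stored.
function recursiveDependents(start) {
  const seen = new Set();
  const queue = [start];
  while (queue.length > 0) {
    const current = queue.shift();
    for (const dependent of current.dependents) {
      if (!seen.has(dependent)) {
        seen.add(dependent);
        queue.push(dependent);
      }
    }
  }
  return [...seen].map(system => system.code);
}

const cardStore = makeSystem('credit-cards');
const paymentsApi = makeSystem('payments-api');
const checkoutSite = makeSystem('checkout-site');
const newsletter = makeSystem('newsletter'); // never touched by the traversal
addDependency(paymentsApi, cardStore);
addDependency(checkoutSite, paymentsApi);

console.log(recursiveDependents(cardStore)); // [ 'payments-api', 'checkout-site' ]
```

Adding a thousand more unrelated systems like `newsletter` would not change the number of steps taken, which is the property the O(k) slide is pointing at.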

  35. Cypher ≃ ASCII art + SQL

      (This)-[:IS_CONNECTED_TO]->(That)

      MATCH (s:System)-[:DEPENDS_ON*]->(:System)-[:READS_FROM]->(d:Database)
      WHERE d.code = "credit-cards"
      RETURN s.accessLogsUrl
  36. Constructing the graph

      CREATE CONSTRAINT ON (s:System) ASSERT s.code IS UNIQUE

      CREATE (s:System)
      SET s = $properties
      WITH s
      MATCH (s2:System {code: "dependency"})
      MERGE (s)-[r:DEPENDS_ON]->(s2)
      RETURN s, r, s2
  37. Our graph model

  38. Where does GraphQL fit in?

  39. 1. Self-service 2. Everyone a power user 3. Low effort extensibility 4. API and UI for everything
  40. 1. Self-service 2. Everyone a power user 3. Low effort extensibility 4. API and UI for everything
  41. We need an API that can represent graphs and allows users to define the data they need
  42. ?

  43. neo4j-graphql-js: converts GraphQL queries to Cypher; resolves with a single database query
  44. Different semantics

      Cypher:
        (This)-[:RELATED_TO]->(That)

      GraphQL:
        type This { relatedThats: [That] }
        type That { relatedThiss: [This] }
  45. @relation directive

      type System {
        code: String
        sla: SLA
        dependencies: [System] @relation(name: "DEPENDS_ON", direction: "OUT")
        dependents: [System] @relation(name: "DEPENDS_ON", direction: "IN")
      }
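To make the role of the `@relation` directive concrete, here is a minimal sketch of how a relationship name and direction could be turned into a Cypher pattern fragment. It is an illustration only, not the code neo4j-graphql-js uses; `relationToPattern` and its parameters are hypothetical.

```javascript
// Hypothetical helper (not part of neo4j-graphql-js): build the Cypher
// pattern implied by a @relation directive's name and direction.
function relationToPattern({ name, direction }, fromVar, toLabel, toVar) {
  const rel = `[:${name}]`;
  const target = `(${toVar}:${toLabel})`;
  if (direction === 'OUT') return `(${fromVar})-${rel}->${target}`;
  if (direction === 'IN') return `(${fromVar})<-${rel}-${target}`;
  throw new Error(`Unknown direction: ${direction}`);
}

// The two fields on the System type share one edge type and differ only
// in the direction in which the edge is followed.
const dependenciesPattern = relationToPattern(
  { name: 'DEPENDS_ON', direction: 'OUT' }, 's', 'System', 'd'
);
const dependentsPattern = relationToPattern(
  { name: 'DEPENDS_ON', direction: 'IN' }, 's', 'System', 'd'
);

console.log(dependenciesPattern); // (s)-[:DEPENDS_ON]->(d:System)
console.log(dependentsPattern);   // (s)<-[:DEPENDS_ON]-(d:System)
```

This is why a single `DEPENDS_ON` edge in the database can back both the `dependencies` and `dependents` fields in the schema.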
  46. Resolver generation

      import { v1 as neo4j } from 'neo4j-driver';
      import { makeAugmentedSchema } from 'neo4j-graphql-js';
      // graphqlExpress is assumed to come from apollo-server-express 1.x;
      // router and typeDefs are defined elsewhere

      const driver = neo4j.driver(...);

      router.use(
        '/graphql',
        graphqlExpress(() => ({
          schema: makeAugmentedSchema({ typeDefs }),
          context: { driver },
        }))
      );
  47. None
  48. None
  49. None
  50. N + 1 problem goes away: N + 1 → 1; N(M + 1) + 1 → 1; N(M(P + 1) + 1) + 1 → 1
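The arithmetic on this slide can be checked with a short sketch. It assumes a naive resolver setup in which every record at one level triggers one extra query for its children; `naiveQueryCount` and the fan-out numbers are hypothetical and used only for illustration.

```javascript
// Count the queries a naive per-record resolver strategy would issue.
// `fanouts` lists how many records each nesting level returns per parent,
// e.g. [N, M, P]. The trailing + 1 is the root query.
function naiveQueryCount(fanouts) {
  return fanouts.reduceRight((below, n) => n * (below + 1), 0) + 1;
}

console.log(naiveQueryCount([10]));       // N + 1               = 11
console.log(naiveQueryCount([10, 5]));    // N(M + 1) + 1        = 61
console.log(naiveQueryCount([10, 5, 3])); // N(M(P + 1) + 1) + 1 = 211
```

When the whole GraphQL query is translated into a single Cypher statement, each of these counts collapses to 1, however deep the nesting.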
  51. Filters

      {
        Systems(filter: { knownAboutBy_every: { isActive: false } }) {
          code
        }
      }

  52. @cypher directive

      type Team {
        stakeholderTeams: [Team] @cypher(
          statement: "MATCH (this)<-[:DELIVERED_BY]-(:System)<-[:DEPENDS_ON*]-(:System)-[:DELIVERED_BY]->(t:Team) RETURN t"
        )
      }
  53. AI - Graph algorithms

      CALL algo.pageRank('System', 'DEPENDS_ON', {iterations: 20, dampingFactor: 0.85, write: true, writeProperty: "criticality"})

      type System {
        code: String
        criticality: Float
      }
  54. #GRANDstack: GraphQL + React + Apollo + Neo4j Database. https://grandstack.io/

  55. 1. Self-service 2. Everyone a power user 3. Low effort extensibility 4. API and UI for everything
  56. Say we want to add a new type and edge to the graph: System, Hosting Platform, HOSTED_BY
  57. How do we add this to the data layer? System, Hosting Platform, HOSTED_BY
  58. We just create an index

      CREATE CONSTRAINT ON (h:HostingPlatform) ASSERT h.code IS UNIQUE
  59. How do we add this to the API layer? System, Hosting Platform, HOSTED_BY
  60. Add it to the schema

      type HostingPlatform {
        code: String
        hostsSystems: [System] @relation(name: "HOSTED_ON", direction: "IN")
      }

      extend type System {
        hostedOn: [HostingPlatform] @relation(name: "HOSTED_ON", direction: "OUT")
      }
  61. Schema first vs Code first: this is schema first… right?
  62. Why prefer Code First? - DRY - Declarative (via static analysis/coding patterns)
  63. neo4j-graphql-js: - DRY: only write schema - Static analysis of schema generates resolvers
  64. Hot reloading: updating the schema and API without redeploying

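  65. Schema hot reloading

      // Holds the current GraphQL middleware; replaced whenever the schema changes
      let api;

      // Function that (re)builds the GraphQL middleware from a schema
      const constructAPI = schema => {
        api = graphqlExpress(() => ({
          schema: makeAugmentedSchema({ schema }),
        }));
      };

      constructAPI(initialSchema);
      schemaFilePoller.on('change', constructAPI);
      // The route always delegates to whichever middleware is current
      app.post('/graphql', (...args) => api(...args));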
  66. How do we add this to the UI layer? System, Hosting Platform, HOSTED_BY
  67. Our UI is a pretty standard set of JSX components
  68. Each primitive type - String, Boolean etc - has a corresponding component
  69. GraphQL types are rendered using combinations of these primitive components
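The idea of composing a type's UI from per-primitive components can be sketched without a JSX toolchain. In this illustration the components are plain functions returning HTML strings rather than the FT's actual JSX components; `primitiveComponents`, `renderType` and the field list are hypothetical, and HTML escaping is left out for brevity.

```javascript
// Hypothetical primitive components: one renderer per primitive type.
// Real components would be JSX; strings keep the sketch runnable as-is.
const primitiveComponents = {
  String: (name, value) => `<input type="text" name="${name}" value="${value}">`,
  Boolean: (name, value) =>
    `<input type="checkbox" name="${name}"${value ? ' checked' : ''}>`,
};

// Render a record by looking up the component for each field's type.
function renderType(fields, record) {
  return fields
    .map(({ name, type }) => {
      const component = primitiveComponents[type];
      if (!component) throw new Error(`No component for type ${type}`);
      return component(name, record[name]);
    })
    .join('\n');
}

const systemFields = [
  { name: 'code', type: 'String' },
  { name: 'isActive', type: 'Boolean' },
];
const rendered = renderType(systemFields, { code: 'my-system', isActive: true });
console.log(rendered);
// <input type="text" name="code" value="my-system">
// <input type="checkbox" name="isActive" checked>
```

With this arrangement, adding a new type to the schema needs no new UI code as long as all of its fields use primitives that already have a component.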

  70. Relationship editor, Boolean editor, Enum editor
  71. Metadata available in GraphQL: - Type - Property name - Description
  72. None
  73. None
  74. None
  75. also: - Handling inactive records - Required fields - Validation patterns ...
  76. Defining these in a different location to the schema is hard to maintain
  77. Custom yaml schema

      name: System
      description: Any combination of …
      properties:
        code:
          type: String
          required: true
          useInSummary: true
          pattern: ^(?=.{2,64}$)[a-z0-9]+(?:-[a-z0-9]+)*$
          label: Code
          description: The unique id …
  78. Transform to GraphQL SDL

      const types = schema.getTypes().map(defineType);
      const enums = schema.getEnums().map(defineEnum);
      const queries = schema.getTypes().map(defineQueries);
      return [].concat(types, 'type Query {\n', ...queries, '}', enums);
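As a rough illustration of what a `defineType` step might do, the sketch below turns one already-parsed schema object into a GraphQL SDL type definition. The object shape mirrors the YAML slide, but the descriptions, the `isActive` property and the body of `defineType` are assumptions made for this example, not the FT's implementation.

```javascript
// Assumed shape of one schema file after YAML parsing. The descriptions
// and the isActive property are made up for this example.
const systemType = {
  name: 'System',
  description: 'An example type description',
  properties: {
    code: { type: 'String', required: true, description: 'The unique id' },
    isActive: { type: 'Boolean', description: 'Whether the system is live' },
  },
};

// Hypothetical defineType: emit SDL for one type, carrying descriptions
// across and marking required properties as non-null.
function defineType({ name, description, properties }) {
  const fields = Object.entries(properties).map(
    ([fieldName, { type, required, description: fieldDescription }]) =>
      `  "${fieldDescription}"\n  ${fieldName}: ${type}${required ? '!' : ''}`
  );
  return `"${description}"\ntype ${name} {\n${fields.join('\n')}\n}`;
}

const sdl = defineType(systemType);
console.log(sdl);
// "An example type description"
// type System {
//   "The unique id"
//   code: String!
//   "Whether the system is live"
//   isActive: Boolean
// }
```

Keys with no SDL equivalent, such as `useInSummary` or `pattern`, are simply not emitted here; they remain in the source schema for other consumers such as the admin UI.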
  79. Schema files GraphQL API Admin UI API Search index REST API
  80. None
  81. CFECSWYD

  82. Code first... except the code is a schema written in YAML... development
  83. Because what really matters: - DRY - Declarative (or static) - Co-location
  84. GraphQL feature request: Support front matter in field descriptions so we can all stop saying CFECSWYD!
  85. GraphQL + neo4j is the one true path?

  86. Of course not

  87. neo4j is a poor choice for: - Large documents/blobs - Time series - Other things SQL/NoSQL perform well at
  88. neo4j-graphql-js only generates resolvers where your code does not already provide one
  89. Custom resolvers

      import { v1 as neo4j } from 'neo4j-driver';
      import { makeAugmentedSchema } from 'neo4j-graphql-js';

      const driver = neo4j.driver(...);

      router.use(
        '/graphql',
        graphqlExpress(() => ({
          // Custom resolvers are passed alongside the type definitions
          schema: makeAugmentedSchema({ typeDefs, resolvers: … }),
          context: { driver },
        }))
      );
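The "only generates resolvers where your code does not already provide one" behaviour amounts to a fallback merge, which can be sketched in a few lines. This is an illustration of the idea rather than neo4j-graphql-js internals; `generateResolver`, `withGeneratedFallbacks` and the `runbook` field are hypothetical.

```javascript
// Hypothetical stand-in for a resolver generated from the schema.
function generateResolver(typeName, fieldName) {
  return () => `generated:${typeName}.${fieldName}`;
}

// Use the custom resolver when one is supplied, otherwise fall back to
// a generated one.
function withGeneratedFallbacks(fieldsByType, customResolvers) {
  const resolvers = {};
  for (const [typeName, fieldNames] of Object.entries(fieldsByType)) {
    resolvers[typeName] = {};
    for (const fieldName of fieldNames) {
      const custom =
        customResolvers[typeName] && customResolvers[typeName][fieldName];
      resolvers[typeName][fieldName] =
        custom || generateResolver(typeName, fieldName);
    }
  }
  return resolvers;
}

const mergedResolvers = withGeneratedFallbacks(
  { System: ['code', 'runbook'] },
  // e.g. a large document fetched from somewhere other than the graph
  { System: { runbook: () => 'custom:fetched-from-document-store' } }
);

console.log(mergedResolvers.System.code());    // generated:System.code
console.log(mergedResolvers.System.runbook()); // custom:fetched-from-document-store
```

The same pattern lets fields backed by other data sources sit in the same GraphQL type as fields resolved from the graph.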
  90. We use S3 for large documents. Lots of other potential data sources
  91. None
  92. None
  93. Neo4j is O(k) at modelling graph data

  94. #GRANDstack gives easy access with GraphQL

  95. - DRY - Declarative - Co-locate

  96. Cheers @wheresrhys