Graph vs Graph

Rhys Evans
September 06, 2019

Most stories of GraphQL implementations focus on retrofitting a GraphQL layer on top of existing APIs to surface the data's graphiness. But what if you're starting from scratch and you know you'll need a graph from day one? What role is there for GraphQL when it's not the only graph representation in your stack? I'll tell you how, at the FT, we've combined the neo4j graph database with GraphQL to expose operational information about our business like never before.

Transcript

  1. Graph vs Graph
    GraphQL as the API for your graph
    database
    Rhys Evans, @wheresrhys

  5. If your data is a graph, and
    we’re all thinking in graphs,
    why is it that only one thin
    layer actually is a graph?

  6. Rhys Evans
    Principal Engineer
    Reliability Engineering
    Financial Times
    @wheresrhys

  7. Why graphs
    interest us
    at the FT

  8. Why we use
    neo4j graph
    database

  9. Combining
    GraphQL and
    neo4j

  10. Principles for
    easy(ish) graph
    evolution

  14. Just one system

  16. Some credentials have been
    leaked – could any user data
    have been compromised?

  17. System
    Data Store
    Credentials
    Data Type

  18. We’ve changed our T&C’s –
    which affiliates do we ask to
    update their websites?

  19. Website
    Affiliate
    Smallprint

  20. Which products cost more
    to run than they raise in
    revenue?

  21. Product
    Infrastructure
    Revenue Stream
    Budget Stream

  22. www.ft.com front page is
    returning a 404 – who can I
    contact to fix it at 3am?

  23. System
    User Journey
    Monitoring
    Maintainer

  24. “Wir müssen wissen,
    wir werden wissen.”
    (“We must know, we will know.”)
    ― David Hilbert

  25. We come to understand
    things by constructing a
    model

  26. [Diagram: relational model]
    Tables: Systems, Teams, People, Budget lines, Modules, Data stores
    Join tables: System-Team, Team-Person, System-Person, System-Module,
    System-Data (read), System-Data (write)

  27. “Recursive dependents of
    systems reading from a DB”
    =
    Table x Table x Table x ...

  28. All the Systems
    All the System joins
    All the Systems
    All the System joins
    All the Systems
    All the System joins
    ...

  29. O(no)

  30. Graph databases
    such as neo4j
    optimise for data
    which is highly
    connected

  31. No global lookup by ID
    Uses a local pointer to the
    exact physical location of
    the data

  32. Record
    Linked list of
    relationships
    Related
    records

  33. “Recursive dependents of
    systems reading from a DB”
    =
    record -> siblings -> cousins -> ...

  34. O(k)

  35. (This)-[:IS_CONNECTED_TO]->(That)
    MATCH (s:System)-[:DEPENDS_ON*]->(:System)-[:READS_FROM]->(d:Database)
    WHERE d.code = "credit-cards"
    RETURN s.accessLogsUrl
    Cypher ≃ ASCII art + SQL
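
A rough in-memory analogue of the query on this slide may help show what the pattern asks for. It is only a sketch: the sample systems and the helper names (`addSystem`, `dependentsOfReaders`) are made up for illustration, and it says nothing about how neo4j actually executes Cypher. As with `[:DEPENDS_ON*]`, at least one hop is required, so a system that only reads the database directly is not returned.

```javascript
// Hypothetical in-memory graph: code -> { code, dependsOn, readsFrom }.
const systems = new Map();

function addSystem(code, { dependsOn = [], readsFrom = [] } = {}) {
  systems.set(code, { code, dependsOn: new Set(dependsOn), readsFrom: new Set(readsFrom) });
}

// Systems that reach, via one or more DEPENDS_ON hops, a system that READS_FROM dbCode.
function dependentsOfReaders(dbCode) {
  // Reverse index: system code -> codes of the systems that depend on it directly.
  const dependents = new Map();
  for (const { code, dependsOn } of systems.values()) {
    for (const target of dependsOn) {
      if (!dependents.has(target)) dependents.set(target, new Set());
      dependents.get(target).add(code);
    }
  }

  // Start from the systems that read the database and walk the edges backwards.
  const queue = [...systems.values()]
    .filter(system => system.readsFrom.has(dbCode))
    .map(system => system.code);
  const found = new Set();
  while (queue.length > 0) {
    const current = queue.shift();
    for (const dependent of dependents.get(current) || []) {
      if (!found.has(dependent)) {
        found.add(dependent);
        queue.push(dependent);
      }
    }
  }
  return [...found].sort();
}

// Made-up sample data.
addSystem('payments', { readsFrom: ['credit-cards'] });
addSystem('checkout', { dependsOn: ['payments'] });
addSystem('front-page', { dependsOn: ['checkout'] });
addSystem('newsletter');

console.log(dependentsOfReaders('credit-cards')); // [ 'checkout', 'front-page' ]
```

Each step only touches the neighbours of the record in hand, which is the intuition behind the O(k) slide.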

  36. CREATE CONSTRAINT ON (s:System)
    ASSERT s.code IS UNIQUE
    CREATE (s:System)
    SET s = $properties
    WITH s
    MATCH (s2:System {code: "dependency"})
    MERGE (s)-[r:DEPENDS_ON]->(s2)
    RETURN s, r, s2
    Constructing the graph

  37. Our graph model

  38. Where does
    GraphQL fit in?

  39. 1. Self-service
    2. Everyone a power user
    3. Low effort extensibility
    4. API and UI for everything

  40. 1. Self-service
    2. Everyone a power user
    3. Low effort extensibility
    4. API and UI for everything

  41. We need an API that can
    represent graphs and
    allows users to define
    the data they need

  42. ?

  43. neo4j-graphql-js
    - converts GraphQL queries
    to cypher
    - resolves with a single
    database query

  44. Cypher: (This)-[:RELATED_TO]->(That)
    GraphQL:
    type This {
      relatedThats: [That]
    }
    type That {
      relatedThiss: [This]
    }
    Different semantics

  45. @relation directive
    type System {
      code: String
      sla: SLA
      dependencies: [System] @relation(name: "DEPENDS_ON", direction: "OUT")
      dependents: [System] @relation(name: "DEPENDS_ON", direction: "IN")
    }
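
To make the mapping concrete, here is a small sketch of how a `name` and `direction` pair could be turned into a Cypher pattern. The helper `relationToCypherPattern` and its argument shape are hypothetical and are not taken from neo4j-graphql-js; the sketch only illustrates why one relationship type can back two GraphQL fields.

```javascript
// Hypothetical helper, NOT the library's implementation: it only shows the idea
// that the same relationship name yields opposite patterns for "OUT" and "IN".
function relationToCypherPattern({ from, to, name, direction }) {
  if (direction === 'OUT') return `(${from})-[:${name}]->(${to})`;
  if (direction === 'IN') return `(${from})<-[:${name}]-(${to})`;
  throw new Error(`Unknown direction: ${direction}`);
}

const dependencies = relationToCypherPattern({
  from: 'this:System', to: 'other:System', name: 'DEPENDS_ON', direction: 'OUT',
});
const dependents = relationToCypherPattern({
  from: 'this:System', to: 'other:System', name: 'DEPENDS_ON', direction: 'IN',
});

console.log(dependencies); // (this:System)-[:DEPENDS_ON]->(other:System)
console.log(dependents);   // (this:System)<-[:DEPENDS_ON]-(other:System)
```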

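  46. // graphqlExpress is assumed to come from apollo-server-express v1; the slide omits this import
    import { graphqlExpress } from 'apollo-server-express';
    import { v1 as neo4j } from 'neo4j-driver';
    import { makeAugmentedSchema } from 'neo4j-graphql-js';
    const driver = neo4j.driver(...);
    router.use('/graphql',
      graphqlExpress(() => ({
        schema: makeAugmentedSchema({ typeDefs }),
        context: { driver }
      }))
    );
    Resolver generation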

  50. N + 1 problem goes away
    N + 1 → 1
    N(M + 1) + 1 → 1
    N(M(P + 1) + 1) + 1 → 1
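
The arithmetic on this slide can be checked with a few lines of code. The function name `naiveQueryCount` and the sample fan-outs are illustrative assumptions; the formula is the one shown above, where each nesting level issues one query per parent record plus the query that fetched the parents.

```javascript
// fanOuts lists how many records each nesting level returns per parent, e.g. [N, M, P].
function naiveQueryCount(fanOuts) {
  // Work from the innermost level outwards: count = fanOut * (count of inner levels) + 1.
  return fanOuts.reduceRight((inner, fanOut) => fanOut * inner + 1, 1);
}

console.log(naiveQueryCount([10]));       // N + 1 = 11
console.log(naiveQueryCount([10, 5]));    // N(M + 1) + 1 = 61
console.log(naiveQueryCount([10, 5, 3])); // N(M(P + 1) + 1) + 1 = 211
```

With a single generated Cypher query per GraphQL request, all three cases collapse to 1.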

  51. {
      Systems(filter: {
        knownAboutBy_every: { isActive: false }
      }) {
        code
      }
    }
    Filters
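
Read in plain terms, the filter asks for systems where every person who knows about them is inactive. The sketch below expresses that reading over made-up sample data; it is an assumption about the intent of the query, not a description of how the generated Cypher behaves.

```javascript
// Hypothetical sample data; field names follow the query on the slide.
const systems = [
  { code: 'legacy-cms', knownAboutBy: [{ isActive: false }, { isActive: false }] },
  { code: 'checkout', knownAboutBy: [{ isActive: true }, { isActive: false }] },
];

// Keep the systems where every related person is inactive.
const orphanedCodes = systems
  .filter(system => system.knownAboutBy.every(person => person.isActive === false))
  .map(system => system.code);

console.log(orphanedCodes); // [ 'legacy-cms' ]
```

Note that `Array.prototype.every` returns true for an empty list, so in this sketch a system that nobody knows about would also match; the slides do not say whether the real filter does the same.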

  52. type Team {
      stakeholderTeams: [Team] @cypher(
        statement: """
          MATCH (this)<-[:DELIVERED_BY]-(:System)
            <-[:DEPENDS_ON*]-(:System)<-[:DELIVERED_BY]-(t:Team)
          RETURN t
        """
      )
    }
    @cypher directive

  53. CALL algo.pageRank('System', 'DEPENDS_ON', {
      iterations: 20, dampingFactor: 0.85,
      write: true, writeProperty: "criticality"
    })
    type System {
      code: String
      criticality: Float
    }
    AI - Graph algorithms
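
For readers unfamiliar with PageRank, a minimal textbook version over a dependency graph might look like the sketch below. It reuses the `iterations` and `dampingFactor` values from the slide, but it is an independent illustration with made-up data: the scores will not match those written by the neo4j procedure, which uses its own normalisation.

```javascript
// edges are [from, to] pairs meaning "from DEPENDS_ON to".
function pageRank(nodes, edges, { iterations = 20, dampingFactor = 0.85 } = {}) {
  const outDegree = new Map(nodes.map(node => [node, 0]));
  for (const [from] of edges) outDegree.set(from, outDegree.get(from) + 1);

  let rank = new Map(nodes.map(node => [node, 1 / nodes.length]));
  for (let i = 0; i < iterations; i++) {
    // Rank held by nodes with no outgoing edges is shared evenly among all nodes.
    let dangling = 0;
    for (const node of nodes) {
      if (outDegree.get(node) === 0) dangling += rank.get(node);
    }
    const base = (1 - dampingFactor) / nodes.length + (dampingFactor * dangling) / nodes.length;
    const next = new Map(nodes.map(node => [node, base]));
    // Every system passes a share of its rank to each system it depends on.
    for (const [from, to] of edges) {
      next.set(to, next.get(to) + (dampingFactor * rank.get(from)) / outDegree.get(from));
    }
    rank = next;
  }
  return rank;
}

// Made-up example: the front page depends on checkout and payments.
const criticality = pageRank(
  ['front-page', 'checkout', 'payments'],
  [['front-page', 'checkout'], ['checkout', 'payments'], ['front-page', 'payments']]
);
console.log(criticality);
```

In this example `payments` ends up with the highest score, because everything else depends on it directly or indirectly.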

  54. #GRANDstack
    GraphQL + React + Apollo +
    Neo4j Database
    https://grandstack.io/

  55. 1. Self-service
    2. Everyone a power user
    3. Low effort extensibility
    4. API and UI for everything

  56. Say we want to add a new
    type and edge to the graph
    (System)-[:HOSTED_BY]->(HostingPlatform)

  57. How do we add this to the
    data layer?
    (System)-[:HOSTED_BY]->(HostingPlatform)

  58. CREATE CONSTRAINT ON (h:HostingPlatform)
    ASSERT h.code IS UNIQUE
    We just create an index

  59. How do we add this to the
    API layer?
    (System)-[:HOSTED_BY]->(HostingPlatform)

  60. type HostingPlatform {
      code: String
      hostsSystems: [System] @relation(name: "HOSTED_BY", direction: "IN")
    }
    extend type System {
      hostedOn: [HostingPlatform] @relation(name: "HOSTED_BY", direction: "OUT")
    }
    Add it to the schema

  61. Schema first
    vs
    Code first
    This is schema first… right?

  62. Why prefer Code First?
    - DRY
    - Declarative (via static
    analysis/coding patterns)

  63. neo4j-graphql-js:
    - DRY - only write schema
    - Static analysis of schema
    generates resolvers

  64. Hot reloading
    Updating the schema and
    API without redeploying

  65. let api;
    // Rebuilds the GraphQL middleware and swaps it in for the current one
    const constructAPI = schema => {
      api = graphqlExpress(() => ({
        schema: makeAugmentedSchema({schema})
      }));
    };
    constructAPI(initialSchema);
    schemaFilePoller.on('change', constructAPI);
    app.post('/graphql', (...args) => api(...args));
    Schema hot reloading
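
The pattern on this slide can be tried without Express or a database. In the sketch below, `buildHandler` and `createEmitter` are hypothetical stand-ins for the GraphQL middleware factory and the schema file poller. The point it illustrates is that the route is registered once and delegates to whichever handler `api` currently holds.

```javascript
// Hypothetical stand-in for building GraphQL middleware from a schema.
const buildHandler = schema => request => `handled ${request} with schema v${schema.version}`;

// Minimal event emitter standing in for the schema file poller.
function createEmitter() {
  const listeners = [];
  return {
    on: (event, fn) => listeners.push({ event, fn }),
    emit: (event, payload) =>
      listeners.filter(l => l.event === event).forEach(l => l.fn(payload)),
  };
}

let api; // current handler; replaced whenever the schema changes
const constructAPI = schema => {
  api = buildHandler(schema);
};

const schemaFilePoller = createEmitter();
constructAPI({ version: 1 });
schemaFilePoller.on('change', constructAPI);

// Registered once; the wrapper looks up `api` on every call, so swaps take effect immediately.
const route = (...args) => api(...args);

const before = route('query');
schemaFilePoller.emit('change', { version: 2 });
const after = route('query');

console.log(before); // handled query with schema v1
console.log(after);  // handled query with schema v2
```

Passing `api` directly to the router would capture the first handler forever, which is why the slide wraps it in `(...args) => api(...args)`.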

  66. How do we add this to the UI
    layer?
    (System)-[:HOSTED_BY]->(HostingPlatform)

  67. Our UI is a pretty standard
    set of JSX components

  68. Each primitive type - String,
    Boolean etc - has a
    corresponding component

  69. GraphQL types are rendered
    using combinations of these
    primitive components

  70. Relationship editor
    Boolean editor
    Enum editor

  71. Metadata available in
    GraphQL
    - Type
    - Property name
    - Description
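
One way to picture how that metadata could drive the UI is a lookup from field type to editor component. The component names, the `chooseEditor` and `describeForm` helpers, and the sample fields below are all hypothetical; the real components are JSX and are not shown in the slides.

```javascript
// Hypothetical component names, loosely following the editors named in the slides.
const primitiveEditors = new Map([
  ['String', 'TextEditor'],
  ['Boolean', 'BooleanEditor'],
  ['Float', 'NumberEditor'],
]);

// Pick an editor from the field's type: primitives first, then enums, then related records.
function chooseEditor(field, { enums = [], types = [] } = {}) {
  if (primitiveEditors.has(field.type)) return primitiveEditors.get(field.type);
  if (enums.includes(field.type)) return 'EnumEditor';
  if (types.includes(field.type)) return 'RelationshipEditor';
  throw new Error(`No editor for type ${field.type}`);
}

// Turn field metadata (type, property name, description) into a form description.
function describeForm(fields, context) {
  return fields.map(field => ({
    component: chooseEditor(field, context),
    name: field.name,
    label: field.description || field.name,
  }));
}

const form = describeForm(
  [
    { name: 'code', type: 'String', description: 'The unique id' },
    { name: 'isActive', type: 'Boolean' },
    { name: 'dependencies', type: 'System' },
  ],
  { types: ['System'] }
);
console.log(form.map(item => item.component));
// [ 'TextEditor', 'BooleanEditor', 'RelationshipEditor' ]
```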

  75. also
    - Handling inactive records
    - Required fields
    - Validation patterns
    ...

  76. Defining these in a different
    location to the schema is
    hard to maintain

  77. Custom yaml schema
    name: System
    description: Any combination of …
    properties:
      code:
        type: String
        required: true
        useInSummary: true
        pattern: ^(?=.{2,64}$)[a-z0-9]+(?:-[a-z0-9]+)*$
        label: Code
        description: The unique id …
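
The `pattern` shown for `code` can be exercised directly as a JavaScript regular expression. The `isValidCode` helper and the sample values are illustrative; how the FT tooling applies the pattern is not shown in the slides.

```javascript
// The pattern from the slide: 2 to 64 characters, lowercase alphanumeric
// segments separated by single hyphens.
const codePattern = /^(?=.{2,64}$)[a-z0-9]+(?:-[a-z0-9]+)*$/;

const isValidCode = value => typeof value === 'string' && codePattern.test(value);

console.log(isValidCode('credit-cards')); // true
console.log(isValidCode('a'));            // false (shorter than 2 characters)
console.log(isValidCode('Credit-Cards')); // false (uppercase letters)
console.log(isValidCode('trailing-'));    // false (hyphen must be followed by a segment)
```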

  78. const types = schema.getTypes().map(defineType);
    const enums = schema.getEnums().map(defineEnum);
    const queries = schema.getTypes().map(defineQueries);
    return [].concat(
      types,
      'type Query {\n',
      ...queries,
      '}',
      enums,
    );
    Transform to GraphQL SDL
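
The slide does not show what `defineType` does, so the following is a guess at its general shape, assuming the YAML file has already been parsed into a plain object. The sample type, its descriptions and the handling of `required` are all assumptions made for illustration.

```javascript
// Hypothetical parsed schema file (YAML parsing itself is omitted).
const systemType = {
  name: 'System',
  description: 'A system we run',
  properties: {
    code: { type: 'String', required: true, description: 'The unique id' },
    criticality: { type: 'Float' },
  },
};

// Sketch of a defineType: emits GraphQL SDL with descriptions and non-null markers.
// Assumes descriptions contain no double quotes.
function defineType({ name, description, properties }) {
  const fields = Object.entries(properties).map(([fieldName, def]) => {
    const doc = def.description ? `  "${def.description}"\n` : '';
    return `${doc}  ${fieldName}: ${def.type}${def.required ? '!' : ''}`;
  });
  return `"${description}"\ntype ${name} {\n${fields.join('\n')}\n}`;
}

const sdl = defineType(systemType);
console.log(sdl);
```

Extra keys such as `useInSummary` or `pattern` have no SDL equivalent, which is what motivates keeping them in the YAML files and generating the SDL from there.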

  79. Schema
    files
    GraphQL
    API
    Admin UI
    API
    Search
    index
    REST
    API

  81. CFECSWYD

  82. Code first... except the code
    is a schema written in
    YAML... development

  83. Because what really matters
    - DRY
    - Declarative (or static)
    - Co-location

  84. GraphQL feature request:
    Support front matter in field
    descriptions so we can all
    stop saying CFECSWYD!

  85. GraphQL + neo4j is
    the one true path?

  86. Of course not

  87. neo4j a poor choice for:
    - Large documents/blobs
    - Time series
    - Other things SQL/NoSQL
    perform well at

  88. neo4j-graphql-js only
    generates resolvers where
    your code does not already
    provide one

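  89. // graphqlExpress is assumed to come from apollo-server-express v1; the slide omits this import
    import { graphqlExpress } from 'apollo-server-express';
    import { v1 as neo4j } from 'neo4j-driver';
    import { makeAugmentedSchema } from 'neo4j-graphql-js';
    const driver = neo4j.driver(...);
    router.use('/graphql',
      graphqlExpress(() => ({
        // custom resolvers are passed to makeAugmentedSchema alongside typeDefs
        schema: makeAugmentedSchema({ typeDefs, resolvers: … }),
        context: { driver }
      }))
    );
    Custom resolvers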
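
The "only generate what is missing" behaviour can be pictured with a small merge function. This is an illustration of the idea, not the library's code: `withGeneratedResolvers`, the field names and the returned strings are all hypothetical.

```javascript
// Hypothetical illustration: use a custom resolver when one is supplied,
// otherwise fall back to a generated one.
function withGeneratedResolvers(fieldNames, customResolvers, generate) {
  const resolvers = {};
  for (const fieldName of fieldNames) {
    resolvers[fieldName] = customResolvers[fieldName] || generate(fieldName);
  }
  return resolvers;
}

const resolvers = withGeneratedResolvers(
  ['code', 'runbook'],
  { runbook: () => 'fetched from S3' },       // custom: large document stored elsewhere
  fieldName => () => `cypher for ${fieldName}` // generated: would query the graph
);

console.log(resolvers.code());    // cypher for code
console.log(resolvers.runbook()); // fetched from S3
```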

  90. We use S3 for large
    documents
    Lots of other potential data
    sources

  93. Neo4j is O(k)
    at modelling
    graph data

  94. #GRANDstack
    gives easy access
    with GraphQL

  95. - DRY
    - Declarative
    - Co-locate

  96. Cheers
    @wheresrhys
