Graph vs Graph

Rhys Evans
September 06, 2019

Most stories of GraphQL implementations focus on retrofitting a GraphQL layer on top of existing APIs to surface the data's graphiness. But what if you're starting from scratch and you know you'll need a graph from day one? What role is there for GraphQL when it's not the only graph representation in your stack? I'll tell you how, at the FT, we've combined the neo4j graph database with GraphQL to expose operational information about our business like never before.

Transcript

  1. Graph vs Graph
    GraphQL as the API for your graph
    database
    Rhys Evans, @wheresrhys

  2. If your data is a graph, and
    we’re all thinking in graphs,
    why is it that only one thin
    layer actually is a graph?

  3. Rhys Evans
    Principal Engineer
    Reliability Engineering
    Financial Times
    @wheresrhys

  4. Why graphs
    interest us
    at the FT

  5. Why we use
    neo4j graph
    database

  6. Combining
    GraphQL and
    neo4j

  7. Principles for
    easy(ish) graph
    evolution

  8. Just one system

  9. Some credentials have been
    leaked – could any user data
    have been compromised?

  10. System
    Data Store
    Credentials
    Data Type

  11. We’ve changed our T&C’s –
    which affiliates do we ask to
    update their websites?

  12. Website
    Affiliate
    Smallprint

  13. Which products cost more
    to run than they raise in
    revenue?

  14. Product
    Infrastructure
    Revenue Stream
    Budget Stream

  15. www.ft.com front page is
    returning a 404 – who can I
    contact to fix it at 3am?

  16. System
    User Journey
    Monitoring
    Maintainer

  17. “Wir müssen wissen,
    wir werden wissen.”
    (“We must know, we will know.”)
    ― David Hilbert

  18. We come to understand
    things by constructing a
    model

  19. Systems
    Teams
    System Team Join
    People
    Team Person Join
    System Person Join
    Budget lines
    Modules
    Data stores
    System Data Read Join
    System Module Join
    System Data Write Join

  20. “Recursive dependents of
    systems reading from a DB”
    =
    Table x Table x Table x ...

  21. All the Systems
    All the System joins
    All the Systems
    All the System joins
    All the Systems
    All the System joins
    ...

  22. Graph databases
    such as neo4j
    optimise for data
    which is highly
    connected

  23. No global lookup by ID
    Uses a local pointer to the
    exact physical location of
    the data

  24. Record
    Linked list of relationships
    Related records

  25. “Recursive dependents of
    systems reading from a DB”
    =
    record -> siblings -> cousins -> ...
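
The pointer-following idea on this slide can be sketched in plain JavaScript. This is only an illustration under stated assumptions, not how neo4j stores data: it assumes a tiny in-memory graph in which each record holds direct references to the systems that depend on it, and the names `makeSystem`, `dependsOn` and `recursiveDependents` are hypothetical.

```javascript
// Hypothetical in-memory records: each system keeps direct references
// to the systems that depend on it, so no global lookup by ID is needed.
function makeSystem(code) {
  return { code, dependents: [] };
}

// Record that `dependent` depends on `dependency`.
function dependsOn(dependent, dependency) {
  dependency.dependents.push(dependent);
}

// Walk record -> dependents -> their dependents -> ... by following references.
function recursiveDependents(start) {
  const seen = new Set();
  const queue = [...start.dependents];
  while (queue.length > 0) {
    const system = queue.shift();
    if (seen.has(system)) continue; // guard against dependency cycles
    seen.add(system);
    queue.push(...system.dependents);
  }
  return [...seen].map(system => system.code);
}

const db = makeSystem('credit-cards-reader');
const payments = makeSystem('payments');
const checkout = makeSystem('checkout');
dependsOn(payments, db);
dependsOn(checkout, payments);

console.log(recursiveDependents(db));
```

Each hop costs a reference dereference rather than another join over whole tables, which is the point the slide makes about highly connected data.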

  26. (This)-[:IS_CONNECTED_TO]->(That)
    MATCH (s:System)-[:DEPENDS_ON*]->(:System)-[:READS_FROM]->(d:Database)
    WHERE d.code = "credit-cards"
    RETURN s.accessLogsUrl
    Cypher ≃ ASCII art + SQL

  27. CREATE CONSTRAINT ON (s:System)
    ASSERT s.code IS UNIQUE
    // separate statement: create a system and link it to an existing one
    CREATE (s:System)
    SET s = $properties
    WITH s
    MATCH (s2:System {code: "dependency"})
    MERGE (s)-[r:DEPENDS_ON]->(s2)
    RETURN s, r, s2
    Constructing the graph

  28. Our graph model

  29. Where does
    GraphQL fit in?

  30. 1. Self-service
    2. Everyone a power user
    3. Low effort extensibility
    4. API and UI for everything

  31. 1. Self-service
    2. Everyone a power user
    3. Low effort extensibility
    4. API and UI for everything

  32. We need an API that can
    represent graphs and
    allows users to define
    the data they need

  33. neo4j-graphql-js
    - converts GraphQL queries
    to cypher
    - resolves with a single
    database query

  34. Cypher: (This)-[:RELATED_TO]->(That)
    GraphQL:
    type This {
      relatedThats: [That]
    }
    type That {
      relatedThiss: [This]
    }
    Different semantics

  35. @relation directive
    type System {
      code: String
      sla: SLA
      dependencies: [System] @relation(name: "DEPENDS_ON", direction: "OUT")
      dependents: [System] @relation(name: "DEPENDS_ON", direction: "IN")
    }
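
To make the mapping between the two notations concrete, here is a small sketch of how the arguments of an `@relation` directive correspond to a Cypher pattern. The helper `relationPattern` and the placeholder variable names `this` and `related` are hypothetical; this is not the neo4j-graphql-js implementation, only an illustration of what `name` and `direction` express.

```javascript
// Hypothetical helper: turn the arguments of an @relation directive into
// the Cypher pattern they describe. Not the neo4j-graphql-js implementation.
function relationPattern({ name, direction }, from = 'this', to = 'related') {
  if (direction === 'OUT') return `(${from})-[:${name}]->(${to})`;
  if (direction === 'IN') return `(${from})<-[:${name}]-(${to})`;
  throw new Error(`Unsupported direction: ${direction}`);
}

// `dependencies` follows DEPENDS_ON outwards, `dependents` follows it inwards.
console.log(relationPattern({ name: 'DEPENDS_ON', direction: 'OUT' }));
console.log(relationPattern({ name: 'DEPENDS_ON', direction: 'IN' }));
```

The same relationship type serves both fields; only the direction of traversal differs.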

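  36. import { v1 as neo4j } from 'neo4j-driver';
    import { makeAugmentedSchema } from 'neo4j-graphql-js';
    // assumes graphqlExpress (apollo-server-express 1.x), router and typeDefs are in scope
    const driver = neo4j.driver(...);
    router.use('/graphql',
      graphqlExpress(() => ({
        // makeAugmentedSchema generates the resolvers from the type definitions
        schema: makeAugmentedSchema({ typeDefs }),
        // the generated resolvers read the driver from the context
        context: { driver }
      }))
    );
    Resolver generation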

  37. N + 1 problem goes away
    N + 1 → 1
    N(M + 1) + 1 → 1
    N(M(P + 1) + 1) + 1 → 1
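
The left-hand sides on this slide can be checked with a few lines of arithmetic. The sketch below assumes a naive resolver setup in which every nested list is fetched record by record; `naiveQueryCount` is a hypothetical name, and the fan-out numbers are made up for illustration.

```javascript
// Queries issued when every nested list is resolved record by record.
// fanOuts = [N, M, P, ...], the number of records at each nesting level.
function naiveQueryCount(fanOuts) {
  // Work from the innermost level outwards: each level costs one query
  // to fetch its list, plus the cost of resolving every record in it.
  return fanOuts.reduceRight((costPerRecord, n) => n * costPerRecord + 1, 1);
}

console.log(naiveQueryCount([10]));       // N + 1
console.log(naiveQueryCount([10, 5]));    // N(M + 1) + 1
console.log(naiveQueryCount([10, 5, 3])); // N(M(P + 1) + 1) + 1
```

With a single generated Cypher query the count stays at 1 regardless of how deeply the GraphQL query nests.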

  38. {
      Systems(filter: {
        knownAboutBy_every: {isActive: false}
      }) {
        code
      }
    }
    Filters

  39. type Team {
      stakeholderTeams: [Team] @cypher(
        statement: "MATCH (this)<-[:DELIVERED_BY]-(:System)<-[:DEPENDS_ON*]-(:System)-[:DELIVERED_BY]->(t:Team) RETURN t"
      )
    }
    @cypher directive

  40. CALL algo.pageRank('System', 'DEPENDS_ON',
      {iterations: 20, dampingFactor: 0.85, write: true, writeProperty: "criticality"})
    type System {
      code: String
      criticality: Float
    }
    AI - Graph algorithms
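
For readers unfamiliar with PageRank, the following is a minimal power-iteration sketch of the idea behind a `criticality` score. It is an illustration only: the `algo.pageRank` procedure has its own implementation and scaling, and this simplified version lets rank leak from nodes with no outgoing edges. The function name `pageRank` and the example system codes are hypothetical.

```javascript
// Minimal PageRank by power iteration (simplified, illustrative only).
// edges: array of [from, to] pairs, e.g. ['a', 'b'] for (a)-[:DEPENDS_ON]->(b)
function pageRank(edges, { iterations = 20, dampingFactor = 0.85 } = {}) {
  const nodes = [...new Set(edges.flat())];
  const outDegree = new Map(nodes.map(node => [node, 0]));
  for (const [from] of edges) outDegree.set(from, outDegree.get(from) + 1);

  let rank = new Map(nodes.map(node => [node, 1 / nodes.length]));
  for (let i = 0; i < iterations; i++) {
    // every node starts each round with the "random jump" share
    const next = new Map(nodes.map(node => [node, (1 - dampingFactor) / nodes.length]));
    for (const [from, to] of edges) {
      // a node passes its rank on in equal parts along its outgoing edges
      const share = rank.get(from) / outDegree.get(from);
      next.set(to, next.get(to) + dampingFactor * share);
    }
    rank = next;
  }
  return rank;
}

const criticality = pageRank([
  ['checkout', 'payments'],
  ['basket', 'payments'],
  ['payments', 'credit-cards-db'],
]);
console.log([...criticality.entries()]);
```

Systems that many others depend on, directly or transitively, end up with higher scores than systems nothing depends on.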

  41. #GRANDstack
    GraphQL + React + Apollo +
    Neo4j Database
    https://grandstack.io/

  42. 1. Self-service
    2. Everyone a power user
    3. Low effort extensibility
    4. API and UI for everything

  43. Say we want to add a new
    type and edge to the graph
    (System)-[:HOSTED_BY]->(HostingPlatform)

  44. How do we add this to the
    data layer?
    (System)-[:HOSTED_BY]->(HostingPlatform)

  45. CREATE CONSTRAINT ON (h:HostingPlatform)
    ASSERT h.code IS UNIQUE
    We just create an index

  46. How do we add this to the
    API layer?
    (System)-[:HOSTED_BY]->(HostingPlatform)

  47. type HostingPlatform {
      code: String
      hostsSystems: [System] @relation(name: "HOSTED_BY", direction: "IN")
    }
    extend type System {
      hostedOn: [HostingPlatform] @relation(name: "HOSTED_BY", direction: "OUT")
    }
    Add it to the schema

  48. Schema first
    vs
    Code first
    This is schema first… right?

  49. Why prefer Code First?
    ‑ DRY
    ‑ Declarative (via static
    analysis/coding patterns)

  50. neo4j-graphql-js:
    - DRY - only write schema
    - Static analysis of schema
    generates resolvers

  51. Hot reloading
    Updating the schema and
    API without redeploying

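  52. // holds the current GraphQL middleware; replaced whenever the schema changes
    let api;
    // function that (re)builds the GraphQL middleware from schema type definitions
    const constructAPI = typeDefs => {
      api = graphqlExpress(() => ({
        schema: makeAugmentedSchema({ typeDefs }),
        context: { driver }
      }));
    };
    constructAPI(initialSchema);
    schemaFilePoller.on('change', constructAPI);
    // delegate through a wrapper so every request uses the latest middleware
    app.post('/graphql', (...args) => api(...args));
    Schema hot reloading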
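
The trick that makes this work is registering a stable wrapper with the router and swapping what it delegates to. The sketch below isolates that pattern without Express or GraphQL; `createHotSwappable` is a hypothetical helper and the string-returning handlers merely stand in for middleware built from two schema versions.

```javascript
// Hypothetical helper: wrap a handler so it can be replaced at runtime.
function createHotSwappable(initialHandler) {
  let current = initialHandler;
  return {
    // Stable function to register with a router; always calls the latest handler.
    handler: (...args) => current(...args),
    // Call this when a new schema arrives, e.g. from a 'change' event.
    swap(nextHandler) {
      current = nextHandler;
    },
  };
}

// Stand-ins for GraphQL middleware built from two schema versions.
const api = createHotSwappable(() => 'schema v1');
console.log(api.handler()); // served by the initial handler
api.swap(() => 'schema v2');
console.log(api.handler()); // same function reference, new behaviour
```

Because the router only ever sees `api.handler`, no route needs to be re-registered and the process does not need restarting when the schema changes.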

  53. How do we add this to the UI
    layer?
    (System)-[:HOSTED_BY]->(HostingPlatform)

  54. Our UI is a pretty standard
    set of JSX components

  55. Each primitive type - String,
    Boolean etc - has a
    corresponding component

  56. GraphQL types are rendered
    using combinations of these
    primitive components
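
One way to picture this composition is a lookup from the kind of each property to an editor component. The sketch below is an assumption about the general shape, not the FT's UI code: the registry `editors`, the component names, the `kind` field and `editorsFor` are all hypothetical.

```javascript
// Hypothetical registry: one editor component name per kind of field.
const editors = {
  String: 'TextEditor',
  Boolean: 'BooleanEditor',
  Enum: 'EnumEditor',
  Relationship: 'RelationshipEditor',
};

// Pick an editor for each property of a type; unknown kinds fall back to text.
function editorsFor(properties) {
  return Object.entries(properties).map(([name, { kind }]) => ({
    name,
    editor: editors[kind] || editors.String,
  }));
}

console.log(editorsFor({
  code: { kind: 'String' },
  isActive: { kind: 'Boolean' },
  dependencies: { kind: 'Relationship' },
}));
```

A form for any type is then just the list of editors for its properties, so a new type needs no new UI code.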

  57. Relationship editor
    Boolean editor
    Enum editor

  58. Metadata available in
    GraphQL
    - Type
    - Property name
    - Description

  59. also
    - Handling inactive records
    - Required fields
    - Validation patterns
    ...

  60. Defining these in a different
    location to the schema is
    hard to maintain

  61. Custom yaml schema
    name: System
    description: Any combination of …
    properties:
      code:
        type: String
        required: true
        useInSummary: true
        pattern: ^(?=.{2,64}$)[a-z0-9]+(?:-[a-z0-9]+)*$
        label: Code
        description: The unique id …
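
The `pattern` value is an ordinary regular expression, so its effect is easy to try out. The snippet below uses the pattern exactly as written on the slide; the validator name `isValidCode` and the sample values are hypothetical.

```javascript
// The `pattern` from the slide, used as a JavaScript regular expression.
const codePattern = /^(?=.{2,64}$)[a-z0-9]+(?:-[a-z0-9]+)*$/;

// Hypothetical validator for a `code` value.
function isValidCode(value) {
  return typeof value === 'string' && codePattern.test(value);
}

console.log(isValidCode('credit-cards')); // lowercase words joined by hyphens
console.log(isValidCode('a'));            // too short: the lookahead requires 2 to 64 characters
console.log(isValidCode('Credit_Cards')); // uppercase letters and underscores are rejected
```

Keeping the pattern next to the property definition means the same rule can be applied wherever the schema is consumed.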

  62. const types = schema.getTypes().map(defineType);
    const enums = schema.getEnums().map(defineEnum);
    const queries = schema.getTypes().map(defineQueries);
    // SDL fragments in order: types, the Query type, then enums
    return [].concat(
      types,
      'type Query {\n',
      ...queries,
      '}',
      enums,
    );
    Transform to GraphQL SDL
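
The slide does not show what a function like `defineType` does, so the following is a guess at its general shape rather than the FT's code. It assumes the YAML has already been parsed into a plain object, that `required: true` maps to a non-null field, and that descriptions contain no double quotes; the sample type and its descriptions are invented for the example.

```javascript
// Hypothetical: a type definition as it might look after parsing the YAML.
const systemType = {
  name: 'System',
  description: 'A system record',
  properties: {
    code: { type: 'String', required: true, description: 'The unique id' },
    isActive: { type: 'Boolean', description: 'Whether the record is active' },
  },
};

// Render one type as GraphQL SDL, carrying descriptions across and
// marking required properties as non-null (an assumed mapping).
function defineType({ name, description, properties }) {
  const fields = Object.entries(properties).map(
    ([fieldName, { type, required, description: fieldDescription }]) =>
      `  "${fieldDescription}"\n  ${fieldName}: ${type}${required ? '!' : ''}`
  );
  return `"${description}"\ntype ${name} {\n${fields.join('\n')}\n}`;
}

console.log(defineType(systemType));
```

Properties such as `pattern` or `useInSummary` have no SDL equivalent, which is why the YAML stays the source of truth and the SDL is generated from it.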

  63. Schema files
    GraphQL API
    Admin UI
    API
    Search index
    REST API

  64. Code first... except the code
    is a schema written in
    YAML... development

  65. Because what really matters
    ‑ DRY
    ‑ Declarative (or static)
    ‑ Co-location

  66. GraphQL feature request:
    Support front matter in field
    descriptions so we can all
    stop saying CFECSWYD!

  67. GraphQL + neo4j is
    the one true path?

  68. Of course not

  69. neo4j a poor choice for:
    - Large documents/blobs
    - Time series
    - Other things SQL/NoSQL
    perform well at

  70. neo4j-graphql-js only
    generates resolvers where
    your code does not already
    provide one

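  71. import { v1 as neo4j } from 'neo4j-driver';
    import { makeAugmentedSchema } from 'neo4j-graphql-js';
    // assumes graphqlExpress (apollo-server-express 1.x), router and typeDefs are in scope
    const driver = neo4j.driver(...);
    router.use('/graphql',
      graphqlExpress(() => ({
        // custom resolvers are passed to makeAugmentedSchema with the type definitions
        schema: makeAugmentedSchema({ typeDefs, resolvers: … }),
        context: { driver }
      }))
    );
    Custom resolvers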
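
The "generate only where none is provided" rule can be pictured as a merge in which custom resolvers win. This is a sketch of that rule, not how neo4j-graphql-js does it internally; `mergeResolvers`, the `runbook` field and the returned strings are hypothetical.

```javascript
// Hypothetical merge: keep a generated resolver only where no custom one exists.
function mergeResolvers(generated, custom) {
  const merged = {};
  const typeNames = new Set([...Object.keys(generated), ...Object.keys(custom)]);
  for (const typeName of typeNames) {
    // spread order matters: custom fields overwrite generated ones
    merged[typeName] = { ...generated[typeName], ...custom[typeName] };
  }
  return merged;
}

const generated = {
  System: { dependencies: () => 'from neo4j', runbook: () => 'from neo4j' },
};
const custom = {
  // e.g. a large document fetched from somewhere other than the graph
  System: { runbook: () => 'from another store' },
};

const resolvers = mergeResolvers(generated, custom);
console.log(resolvers.System.dependencies());
console.log(resolvers.System.runbook());
```

This is what lets a single GraphQL API serve graph data from neo4j alongside fields backed by other data sources.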

  72. We use S3 for large
    documents
    Lots of other potential data
    sources

  73. Neo4j is O(k)
    at modelling
    graph data

  74. #GRANDstack
    gives easy access
    with GraphQL

  75. - DRY
    - Declarative
    - Co-locate

  76. Cheers
    @wheresrhys
