A Field Guide to the Financial Times

The FT was a microservices pioneer, and our teams had a lot of freedom to pick the tools & processes they wanted. 5 years on, many people have moved on and those innovative projects are now legacy code. I’ll tell you about our journey, using neo4j & GraphQL, towards keeping track of it all.

Rhys Evans

March 26, 2019

Transcript

  1. A field guide to the Financial Times
    Rhys Evans, Principal Engineer, Financial Times
    @wheresrhys
  2. Who I am
    • Worked in tech 10+ years
    • Gradually moved into tooling
    • Co-lead the FT’s Reliability Engineering team
    • Lifelong birdwatcher
  3. What is a field guide
    From Wikipedia: A book designed to help the reader identify wildlife (plants or animals) or other objects of natural occurrence (e.g. minerals).
  4. • Why the FT needs a field guide
    • Organising our guide with neo4j and GraphQL
    • Filling in the details
  5. Why the FT needs a field guide

  6. @wheresrhys Insert non dramatic screenshot

  7. @wheresrhys

  8. None
  9. @wheresrhys

  10. @wheresrhys

  11. @wheresrhys

  12. “A tool dating from before the trees that built the ark. Unowned, unknown, and worth £250k of business. One day it fell over. We found docs dated 1999... which helped”
    Greg Cope, Tech Director, FT
  13. Starting about 5 years ago, the range of tech we have to support exploded
  14. Previously
    • Centralised decision making
    • Monolithic architectures
    • Data centres
    • Infrequent releases
  15. Move slow and achieve little

  16. Microservices
    FT were early adopters of microservices architecture
    Lots of independently deployed services, easier to
    • Pick the right tool for the job
    • Release and iterate
    • Replace and decommission
  17. Liberalisation
    “[...] follow the mechanics of free-market economy. Teams are allowed and encouraged to pick the best value tools for the job at hand”
    Matt Chadburn http://matt.chadburn.co.uk/notes/teams-as-services.html
  18. OUT                  IN
    Data Centre          Your favourite cloud
    ‘The FT Platform’    Pick your own SaaS
    Java, Java, Java     I hear Rust’s good...
    Ivory tower          What works
  19. “The upside of this is teams, left to their own devices, and trusted to make responsible decisions will choose what is best for themselves and the business in the long-term.”
    Matt Chadburn http://matt.chadburn.co.uk/notes/teams-as-services.html
  20. Build stuff and disappear

  21. Legacy is sooner than you think
    • All images appearing on our websites relied on 1 person... who left
    • A vanity url service built by a feature team that disbanded shortly after
    • Part of our membership platform built in a niche language
    • And many, many more
  22. 5 years is a long time in tech
    Long enough for
    • Shiny new things to become legacy
    • Budgets and business priorities to move on
    • People to leave
  23. In summary
    • Have to keep lots of tech ticking over
    • Generating more new stuff than ever before to keep track of
    • Liberalising the tech department leads to ownership & maintenance problems
    Need a field guide to help us navigate the space
  24. Unowned & unknown

  25. Owned & known

  26. Organising our guide with neo4j and GraphQL

  27. 3 priorities to improve reliability
    • Reaffirm who owns the various bits of FT tech
    • Improve information about what is actually running and why
    • Determine what state it’s in at any given time
  28. Who is our audience?
    Operations team
    • Active 24/7
    • Broad knowledge of our tech platforms
    • Need to know which approaches can be applied to incident X
    • If nothing works, who to call
  29. Not the first attempt
    CMDB versions 1 - 3 were:
    • Too inert - Enter once and forget about it
    • Too brittle - Chains of responsibility easily lost
    • Too discrete - Hard to make important connections
  30. Who can help me with system X?
    • The natural question to ask when addressing a problem
    • Links between people and things dotted all over our previous CMDBs
    • Intuitive but brittle
  31. Relational databases constrain
    • Hard to connect data, so we get overly simplified models of reality
    • Several degrees of separation are modelled as a systemOwner field
    • Simple, but inaccurate and hard to maintain
  32. Graph databases liberate
    • Designed to model complex relationships
    • No need to simplify and abstract away details that actually matter
    • If person X is a stakeholder via 4 degrees of separation, represent them as such
  33. A graph restatement of the problem
    ‘How can I ensure systems are assigned to the right people’
    → ‘How can I ensure systems are connected somehow to the right people’
  34. @wheresrhys System ? ? ? ? ? ? ? ?

  35. Model the stable stuff first

  36. When systems are created we:
    • Pick a unique, human readable code
    • Kill infrastructure not tagged with it
    • In our graph, the System record must be connected to a Team
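A rough sketch of how those rules could be checked, assuming systems and infrastructure are available as simple records. The field names (`code`, `team`, `systemCode`, `id`), the sample data and the function `audit` are assumptions made for illustration; the talk does not describe how the FT implements these checks.

```python
def audit(systems, infrastructure):
    """Return (resource ids not tagged with a known system code,
    system codes that have no team attached)."""
    known_codes = {system["code"] for system in systems}
    untagged = [
        resource["id"]
        for resource in infrastructure
        if resource.get("systemCode") not in known_codes
    ]
    teamless = [system["code"] for system in systems if not system.get("team")]
    return untagged, teamless


# Hypothetical sample data.
SYSTEMS = [
    {"code": "vanity-urls", "team": "feature-x"},
    {"code": "image-service", "team": None},
]
INFRASTRUCTURE = [
    {"id": "lambda-1", "systemCode": "vanity-urls"},
    {"id": "bucket-7"},  # no tag at all
    {"id": "vm-3", "systemCode": "unknown-thing"},  # tag matches no system
]

if __name__ == "__main__":
    to_kill, needs_team = audit(SYSTEMS, INFRASTRUCTURE)
    print("Candidates to kill:", to_kill)
    print("Systems missing a team:", needs_team)
```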
  37. None
  38. Our tech connected to
    • Stable, manageable subdivisions of the organisation
    • Tech director who is ultimately responsible
    On top of this stable foundation we can add the more ephemeral things
  39. None
  40. @wheresrhys BIZ-OPS MAN

  41. Data warehouse free
    • Self-service
    • No such thing as a power user
    • Extensible
    • API first, but UI a close second
  42. Some poor API options
    REST API
    • OK when fetching a single record type
    • Painful to traverse
    ‘Canned query’ endpoints
    • Less generic
    • Limited by our imagination
  43. GraphQL to the rescue
    “GraphQL is a query language for APIs [...] gives clients the power to ask for exactly what they need [...] not just the properties of one resource but also smoothly follows references between them”
  44. neo4j-graphql-js
    • GraphQL normally talks to multiple APIs and combines the results
    • neo4j-graphql-js converts GraphQL queries to cypher, and talks to neo4j directly
  45. @wheresrhys

  46. GraphQL big wins
    • User friendly: Single, grokable query to get unlimited connected info
    • Future proof: Mirrors the neo4j graph as its complexity grows
    • More efficient: Fewer API calls and fewer and faster DB calls
  47. Pitfalls of GraphQL
    • Hungry users: Allows unwitting construction of very expensive queries
    • Caching: Not obvious what caching behaviour to implement
    • To write or not to write: Not persuaded to move away from REST yet
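One common guard against the "hungry users" problem is to reject queries that nest too deeply. The sketch below is a deliberately naive version of that idea: it counts brace nesting in the query text and ignores braces inside string literals or comments, so it is not a real GraphQL parser. The query, its field names (`System`, `deliveredBy`, `techLeads`) and the limit are hypothetical; the talk does not say whether the FT uses this approach.

```python
def query_depth(query: str) -> int:
    """Return the deepest level of `{ ... }` nesting in a GraphQL query string."""
    depth = 0
    deepest = 0
    for char in query:
        if char == "{":
            depth += 1
            deepest = max(deepest, depth)
        elif char == "}":
            depth -= 1
    return deepest


def exceeds_depth_limit(query: str, max_depth: int) -> bool:
    """True when the query nests more deeply than max_depth allows."""
    return query_depth(query) > max_depth


# Hypothetical query: field names are illustrative, not the Biz Ops schema.
QUERY = """
{
  System(code: "vanity-urls") {
    deliveredBy {
      techLeads {
        name
      }
    }
  }
}
"""

if __name__ == "__main__":
    print(query_depth(QUERY), exceeds_depth_limit(QUERY, max_depth=3))
```

A production service would more likely analyse the parsed query (depth or cost) in the GraphQL server itself; the string scan here only shows the shape of the check.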
  48. @wheresrhys An extensible UI

  49. None
  50. @wheresrhys

  51. #GRANDstack GraphQL + React + Apollo + Neo4j Database https://grandstack.io/

  52. In summary
    • Some confidence that Biz Ops won’t degrade into a data graveyard
    • Unlimited access to data for any person or machine
    But is the data actually any good?
  53. Filling in the details

  54. Not the first attempt
    CMDB versions 1 - 3 were
    • Too inert - Enter once and forget about it
    • Too brittle - Chains of responsibility easily lost
    • Too discrete - Hard to make important connections
  55. Don’t rely on good behaviour
    • Automate
    • More carrot, less stick
    • Gamify
    • UX
  56. Automate
    • Machines don’t forget to update information
    • Restrict write access for certain records/types to privileged clients
      ◦ people-api → Writes details of FT staff
      ◦ github-importer → Writes details of repositories
      ◦ …
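Restricting writes by record type can be sketched as a simple allow-list lookup. The client names `people-api` and `github-importer` come from the slide; the record type names (`Person`, `Repository`), the allow-list structure and the function `can_write` are assumptions for illustration only.

```python
# Record types whose writes are reserved for specific automated clients.
# Type names are hypothetical; client ids are the ones named on the slide.
WRITE_ALLOW_LIST = {
    "Person": {"people-api"},
    "Repository": {"github-importer"},
}


def can_write(client_id: str, record_type: str) -> bool:
    """Allow the write unless the type is restricted and the client is not listed."""
    allowed_clients = WRITE_ALLOW_LIST.get(record_type)
    if allowed_clients is None:
        return True  # unrestricted type: any client may write
    return client_id in allowed_clients


if __name__ == "__main__":
    print(can_write("people-api", "Person"))       # privileged client
    print(can_write("github-importer", "Person"))  # wrong client for this type
    print(can_write("some-user", "System"))        # unrestricted type
```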
  57. @wheresrhys More carrot, less stick

  58. Gamify
    Teams respond well to seeing how they compare, and how they can improve
  59. @wheresrhys UX

  60. @wheresrhys

  61. Not just visual design
    • Understand your users
    • Uncover sources of friction
    • Learn about their existing/ideal workflow
    • Don’t expect them to come to you
    • “Good design is invisible”
  62. Example: runbook authorship
    • System source code changes in GitHub,
    • But runbook authorship in Biz Ops
    • Bound to get out of step
    • What if they happened concurrently?
  63. Example: runbook authorship
    • Runbooks written in RUNBOOK.md with front matter metadata
    • Content pulled into Biz Ops when production code release detected
    • GitHub PR integrations to follow
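As a sketch of what reading a RUNBOOK.md with front matter might involve, the function below splits a Markdown file into a metadata dictionary and a body. It handles only flat `key: value` lines between `---` delimiters, not full YAML, and the metadata keys in the sample (`systemCode`, `primaryURL`) are hypothetical rather than the FT's actual runbook fields.

```python
def split_front_matter(text: str):
    """Return (metadata, body) for a document that may start with front matter.

    Only flat `key: value` lines are understood. If the opening or closing
    `---` delimiter is missing, the whole text is returned as the body.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    metadata = {}
    for index, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            body = "\n".join(lines[index + 1:])
            return metadata, body
        if ":" in line:
            key, value = line.split(":", 1)  # split on the first colon only
            metadata[key.strip()] = value.strip()
    return {}, text  # never found the closing delimiter


# Hypothetical runbook content.
RUNBOOK = """---
systemCode: vanity-urls
primaryURL: https://example.com
---
# Vanity URLs

What to do when it falls over.
"""

if __name__ == "__main__":
    metadata, body = split_front_matter(RUNBOOK)
    print(metadata)
    print(body)
```

An importer along these lines could run when a production release is detected and write the parsed fields into the system's record.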
  64. Beyond operational info
    • Underpinning how we handle GDPR requests
    • Quicker triaging of security incidents
    • Integrating with leavers process
    More benefits → more incentives to improve data
  65. What have we learned today?

  66. Legacy code comes to us all

  67. Documented legacy is good legacy

  68. Graphs enable more powerful modelling

  69. Using #GRANDstack is like being the film version of Mark Zuckerberg

  70. Your data won’t update itself

  71. UX and other feedback loops can keep it fresh
  72. Thank you
    The team: Geoff Thorpe, Laura Carvajal, Charlie Briggs, Katie Koschland, Simon Legg, Maggie Allen, Courtney Osborn, Kat Downes, Sentayhu Mekoonnali, David Balfour
    Images from: https://www.audubon.org/birds-of-america/
    @wheresrhys
    www.ft.com/dev/null