A Field Guide to the Financial Times

A field guide to the Financial Times
Rhys Evans, Principal Engineer, Financial Times
@wheresrhys

@wheresrhys Who I am
• Worked in tech 10+ years
• Gradually moved into tooling
• Co-lead the FT’s Reliability Engineering team
• Lifelong birdwatcher

@wheresrhys What is a field guide?
From Wikipedia: A book designed to help the reader identify wildlife (plants or animals) or other objects of natural occurrence (e.g. minerals).

• Why the FT needs a field guide
• Organising our guide with neo4j and GraphQL
• Filling in the details

Why the FT needs a field guide

@wheresrhys Insert non dramatic screenshot


@wheresrhys “A tool dating from before the trees that built the ark. Unowned, unknown, and worth £250k of business. One day it fell over. We found docs dated 1999... which helped”
Greg Cope, Tech Director, FT

@wheresrhys Starting about 5 years ago, the range of tech we have to support exploded

@wheresrhys Previously
• Centralised decision making
• Monolithic architectures
• Data centres
• Infrequent releases

Move slow and achieve little

@wheresrhys Microservices
The FT was an early adopter of microservices architecture
Lots of independently deployed services make it easier to
• Pick the right tool for the job
• Release and iterate
• Replace and decommission

@wheresrhys Liberalisation
“[...] follow the mechanics of free-market economy. Teams are allowed and encouraged to pick the best value tools for the job at hand”
Matt Chadburn http://matt.chadburn.co.uk/notes/teams-as-services.html

@wheresrhys
OUT | IN
Data Centre | Your favourite cloud
‘The FT Platform’ | Pick your own SaaS
Java, Java, Java | I hear Rust’s good...
Ivory tower | What works

@wheresrhys “The upside of this is teams, left to their own devices, and trusted to make responsible decisions will choose what is best for themselves and the business in the long-term.”
Matt Chadburn http://matt.chadburn.co.uk/notes/teams-as-services.html

Build stuff and disappear

@wheresrhys Legacy is sooner than you think
• All images appearing on our websites relied on 1 person... who left
• A vanity url service built by a feature team that disbanded shortly after
• Part of our membership platform built in a niche language
• And many, many more

@wheresrhys 5 years is a long time in tech
Long enough for
• Shiny new things to become legacy
• Budgets and business priorities to move on
• People to leave

@wheresrhys In summary
• Have to keep lots of tech ticking over
• Generating more new stuff than ever before to keep track of
• Liberalising the tech department leads to ownership & maintenance problems
Need a field guide to help us navigate the space

Unowned & unknown

Owned & known

Organising our guide with neo4j and GraphQL

@wheresrhys 3 priorities to improve reliability
• Reaffirm who owns the various bits of FT tech
• Improve information about what is actually running and why
• Determine what state it’s in at any given time

@wheresrhys Who is our audience?
Operations team
• Active 24/7
• Broad knowledge of our tech platforms
• Need to know which approaches can be applied to incident X
• If nothing works, who to call

@wheresrhys Not the first attempt
CMDB versions 1 - 3 were:
• Too inert - Enter once and forget about it
• Too brittle - Chains of responsibility easily lost
• Too discrete - Hard to make important connections

@wheresrhys Who can help me with system X?
• The natural question to ask when addressing a problem
• Links between people and things dotted all over our previous CMDBs
• Intuitive but brittle

@wheresrhys Relational databases constrain
• Hard to connect data, so we get overly simplified models of reality
• Several degrees of separation are modelled as a single systemOwner field
• Simple, but inaccurate and hard to maintain

@wheresrhys Graph databases liberate
• Designed to model complex relationships
• No need to simplify and abstract away details that actually matter
• If person X is a stakeholder via 4 degrees of separation, represent them as such

@wheresrhys A graph restatement of the problem
‘How can I ensure systems are assigned to the right people’
→ ‘How can I ensure systems are connected somehow to the right people’
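
The graph restatement above can be made concrete with a small traversal. This is a minimal sketch using an in-memory dictionary rather than neo4j; every node name and edge in it is invented for illustration and does not come from the FT's data. It shows how people "connected somehow" to a system can be found along with their degrees of separation, without a single owner field.

```python
from collections import deque

# Hypothetical graph: every node name and edge is invented for illustration.
EDGES = {
    "System:vanity-urls": ["Team:growth"],
    "Team:growth": ["Group:customer-products", "Person:alice"],
    "Group:customer-products": ["Person:bob"],
    "Person:alice": [],
    "Person:bob": [],
}


def people_connected_to(start, edges):
    """Breadth-first search returning {person: degrees of separation}."""
    distances = {start: 0}
    queue = deque([start])
    people = {}
    while queue:
        node = queue.popleft()
        for neighbour in edges.get(node, []):
            if neighbour in distances:
                continue  # already reached by a path at least as short
            distances[neighbour] = distances[node] + 1
            if neighbour.startswith("Person:"):
                people[neighbour] = distances[neighbour]
            queue.append(neighbour)
    return people


connected = people_connected_to("System:vanity-urls", EDGES)
```

In a real graph database the same question would be a path query; the point here is only that nobody has to be recorded directly on the system for the connection to be discoverable.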

@wheresrhys [Diagram: a System node surrounded by unknown (?) nodes]

Model the stable stuff first

@wheresrhys When systems are created we:
• Pick a unique, human readable code
• Kill infrastructure not tagged with it
• In our graph, the System record must be connected to a Team
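
The three system-creation rules on this slide could be checked mechanically. The sketch below is an assumption-heavy illustration, not the FT's tooling: the tag name `systemCode`, the code format (lowercase words joined by hyphens) and the record shapes are all hypothetical.

```python
import re

# Assumed format for a system code: lowercase words joined by hyphens.
SYSTEM_CODE = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")


def untagged_infrastructure(resources, known_codes, tag="systemCode"):
    """Return ids of resources whose tag is missing or names an unknown system."""
    return [
        resource["id"]
        for resource in resources
        if resource.get("tags", {}).get(tag) not in known_codes
    ]


def validate_system(record):
    """Return a list of problems with a proposed System record."""
    errors = []
    if not SYSTEM_CODE.match(record.get("code", "")):
        errors.append("code must be a human readable, hyphenated identifier")
    if not record.get("team"):
        errors.append("system must be connected to a team")
    return errors


# Hypothetical inventory used only to exercise the checks.
resources = [
    {"id": "i-1", "tags": {"systemCode": "biz-ops-api"}},
    {"id": "i-2", "tags": {}},
    {"id": "i-3"},
]
candidates_for_removal = untagged_infrastructure(resources, {"biz-ops-api"})
```

Uniqueness of the code is not checked here; in practice that would be enforced by the datastore.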

@wheresrhys Our tech connected to
• Stable, manageable subdivisions of the organisation
• Tech director who is ultimately responsible
On top of this stable foundation we can add the more ephemeral things

@wheresrhys BIZ-OPS MAN

@wheresrhys Data warehouse free
• Self-service
• No such thing as a power user
• Extensible
• API first, but UI a close second

@wheresrhys Some poor API options
REST API
• OK when fetching a single record type
• Painful to traverse
‘Canned query’ endpoints
• Less generic
• Limited by our imagination

@wheresrhys GraphQL to the rescue
“GraphQL is a query language for APIs [...] gives clients the power to ask for exactly what they need [...] not just the properties of one resource but also smoothly follows references between them”
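
To illustrate the idea in the quote (ask for exactly the fields you need and follow references in the same request), here is a toy resolver over in-memory records. It is not GraphQL and not how any GraphQL server is implemented; the record ids, field names and values are all hypothetical.

```python
# Toy data: record ids, fields and values are invented for illustration.
RECORDS = {
    "System:biz-ops-api": {
        "code": "biz-ops-api",
        "serviceTier": "bronze",
        "deliveredBy": "Team:reliability-engineering",
    },
    "Team:reliability-engineering": {
        "name": "Reliability Engineering",
        "techDirector": "Person:t-director",
    },
    "Person:t-director": {"name": "T. Director"},
}


def select(record_id, selection, records=RECORDS):
    """Return only the requested fields, following references when asked to.

    `selection` maps a field name to None (plain property) or to a nested
    selection (the field holds the id of another record to descend into).
    """
    record = records[record_id]
    result = {}
    for field, sub_selection in selection.items():
        value = record[field]
        if sub_selection is None:
            result[field] = value
        else:
            result[field] = select(value, sub_selection, records)
    return result


answer = select(
    "System:biz-ops-api",
    {"code": None, "deliveredBy": {"name": None, "techDirector": {"name": None}}},
)
```

One nested request replaces the three REST round trips (system, then team, then person) that the previous slide describes as painful to traverse.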

@wheresrhys neo4j-graphql-js
• GraphQL normally talks to multiple APIs and combines the results
• neo4j-graphql-js converts GraphQL queries to Cypher, and talks to neo4j directly


@wheresrhys GraphQL big wins
• User friendly: Single, grokable query to get unlimited connected info
• Future proof: Mirrors the neo4j graph as its complexity grows
• More efficient: Fewer API calls and fewer and faster DB calls

@wheresrhys Pitfalls of GraphQL
• Hungry users: Allows unwitting construction of very expensive queries
• Caching: Not obvious what caching behaviour to implement
• To write or not to write: Not persuaded to move away from REST yet
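
One common mitigation for the "hungry users" pitfall is to reject queries that nest too deeply before running them. The slides do not say whether Biz Ops does this; the sketch below only illustrates the technique, representing a query as a nested dictionary of selected fields (an assumed stand-in for a parsed query) with a hypothetical limit of 3 levels.

```python
def selection_depth(selection):
    """Depth of a nested selection; a flat set of fields has depth 1."""
    if not selection:
        return 0
    nested = (
        selection_depth(sub)
        for sub in selection.values()
        if isinstance(sub, dict)
    )
    return 1 + max(nested, default=0)


def check_depth(selection, max_depth=3):
    """Raise ValueError when a selection nests deeper than max_depth."""
    depth = selection_depth(selection)
    if depth > max_depth:
        raise ValueError(f"query depth {depth} exceeds limit {max_depth}")
    return depth


# Hypothetical query: system -> team -> tech director.
deep_query = {
    "code": None,
    "deliveredBy": {"name": None, "techDirector": {"name": None}},
}
```

Depth is a crude proxy for cost; a wide but shallow query can still be expensive, so a limit like this complements rather than replaces timeouts and pagination.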

@wheresrhys An extensible UI


#GRANDstack GraphQL + React + Apollo + Neo4j Database https://grandstack.io/

@wheresrhys In summary
• Some confidence that Biz Ops won’t degrade into a data graveyard
• Unlimited access to data for any person or machine
But is the data actually any good?

Filling in the details

@wheresrhys Not the first attempt
CMDB versions 1 - 3 were
• Too inert - Enter once and forget about it
• Too brittle - Chains of responsibility easily lost
• Too discrete - Hard to make important connections

@wheresrhys Don’t rely on good behaviour
• Automate
• More carrot, less stick
• Gamify
• UX

@wheresrhys Automate
• Machines don’t forget to update information
• Restrict write access for certain records/types to privileged clients
  ◦ people-api → Writes details of FT staff
  ◦ github-importer → Writes details of repositories
  ◦ …
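
Restricting writes by record type can be as simple as an allowlist lookup. In this sketch the client names `people-api` and `github-importer` come from the slide, while the record type names, the `some-ui` client and the "open unless listed" default are assumptions made for illustration.

```python
# Client names come from the slide; the record type names are assumptions.
WRITE_ALLOWLIST = {
    "Person": {"people-api"},
    "Repository": {"github-importer"},
}


def can_write(client_id, record_type, allowlist=WRITE_ALLOWLIST):
    """Return True when the client may write records of the given type.

    Types with no allowlist entry are writable by any client, which is an
    assumed default rather than something stated in the talk.
    """
    allowed_clients = allowlist.get(record_type)
    return allowed_clients is None or client_id in allowed_clients
```

For example, `can_write("some-ui", "Person")` is False, so staff details can only change when the automated importer updates them.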

@wheresrhys More carrot, less stick

@wheresrhys Gamify
Teams respond well to seeing how they compare, and how they can improve

@wheresrhys UX


@wheresrhys Not just visual design
• Understand your users
• Uncover sources of friction
• Learn about their existing/ideal workflow
• Don’t expect them to come to you
• “Good design is invisible”

@wheresrhys Example: runbook authorship
• System source code changes in GitHub
• But runbook authorship in Biz Ops
• Bound to get out of step
• What if they happened concurrently?

@wheresrhys Example: runbook authorship
• Runbooks written in RUNBOOK.md with front matter metadata
• Content pulled into Biz Ops when production code release detected
• GitHub PR integrations to follow
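
Splitting a RUNBOOK.md into front matter and body needs very little code. The sketch below assumes the common convention of a metadata block delimited by `---` lines with simple `key: value` pairs; the keys `systemCode` and `serviceTier` and the runbook text are hypothetical, and the FT's actual format may differ.

```python
def split_front_matter(text):
    """Split a document into (metadata dict, body) using '---' delimiters."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # no front matter at all
    end = next(
        (i for i in range(1, len(lines)) if lines[i].strip() == "---"),
        None,
    )
    if end is None:
        return {}, text  # no closing delimiter: treat everything as body
    metadata = {}
    for line in lines[1:end]:
        key, separator, value = line.partition(":")
        if separator:
            metadata[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:]).strip()
    return metadata, body


# Hypothetical runbook used only to exercise the parser.
RUNBOOK_MD = """---
systemCode: biz-ops-api
serviceTier: bronze
---
# Biz Ops API

Restart the service from the dashboard.
"""

metadata, body = split_front_matter(RUNBOOK_MD)
```

Only flat string values are handled here; nested or list-valued metadata would call for a real YAML parser.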

@wheresrhys Beyond operational info
• Underpinning how we handle GDPR requests
• Quicker triaging of security incidents
• Integrating with leavers process
More benefits → more incentives to improve data

What have we learned today?

Legacy code comes to us all

Documented legacy is good legacy

Graphs enable more powerful modelling

Using #GRANDstack is like being the film version of Mark Zuckerberg

Your data won’t update itself

UX and other feedback loops can keep it fresh

Thank you
The team: Geoff Thorpe, Laura Carvajal, Charlie Briggs, Katie Koschland, Simon Legg, Maggie Allen, Courtney Osborn, Kat Downes, Sentayhu Mekoonnali, David Balfour
Images from: https://www.audubon.org/birds-of-america/
@wheresrhys www.ft.com/dev/null
