If you can't measure it, you can't improve it. But measuring load time is easy, right? "Load time" is an outdated concept as single-page app experiences take over the web. We need better metrics that measure what our users are really feeling.
internet * They did this test when I lived there, in 2009 * The Prime Minister said: “If [the opposition] had their way Australians would be left using carrier pigeons for the future rather than accessing an internationally competitive broadband network” * So, they tested it. They flew a carrier pigeon carrying a 700 MB USB drive from the central west to Sydney, about 100 km or 60 miles. https://www.itnews.com.au/news/australian-internet-fails-pigeon-test-159232
CPU speeds have been going up for years! * Sweet, problem solved, right? https://www.statista.com/statistics/616210/average-internet-connection-speed-in-the-us/
javascript * Median “onload” time is 5.5 seconds on desktop, or 15.6 seconds on mobile * https://httparchive.org/reports/state-of-javascript?start=earliest&end=latest&view=list
1.1% increase in demand * The “induced demand” effect: for every 1 percent increase in highway capacity, traffic increases 0.29 to 1.1 percent in the long term (about five years out), and up to 0.68 percent in the short term (one or two years) https://trrjournalonline.trb.org/doi/abs/10.3141/2653-02?journalCode=trr
for every 1 second of improvement in load time. Every 100ms improvement also resulted in up to a 1% increase in revenue. * It’s a UX problem * https://wpostats.com/tags/revenue/
whose mobile sites load in 5 seconds earn up to 2x more mobile ad revenue than sites loading in 19 seconds. * It’s a UX problem * https://wpostats.com/tags/revenue/
and Siberia. * Actually more traffic from countries with poor connectivity! * Pages in these countries would normally have taken two minutes to load! * Australia isn't far off * It actually affects whether people can use the site at all * https://wpostats.com/2015/11/11/youtube-feather.html
Task switch * Research has shown that there are three important response-time limits * <0.1s Instant – the system is reacting immediately * <1s Flow – I’m in a state of flow and my concentration is not broken between tasks * <10s Task switch – I’m out of here * https://www.nngroup.com/articles/response-times-3-important-limits/
Task switch ☹ * Between 1s and 10s there’s a whole lot of emotion too * Users get progressively more frustrated and want to switch to another task in this window
Task switch ☹ Median desktop 5.5s, median mobile 15.6s * Between 1s and 10s there’s a whole lot of emotion too * Users get progressively more frustrated and want to switch to another task in this window
I’m preaching to the choir here, if you’ve come to this talk. * Do you agree with these statistics that I’ve mentioned so far? * Have you already heard it before? * Do you really care?
just fix it!” This is really tempting. You probably already have an idea of what changes you need to make: “Yeah, we need to get our bundle size down, and we need to speed up this particular endpoint.” However! Don’t let the takeaway from this talk be “I should go back and improve my app’s performance in this specific way”
This is a popular saying. Meaning: changing (improving) something requires knowing what you want to change, and how you can tell if it worked. It’s useful because it tells you whether you’re doing the right things for your goals. However, I think it’s missing something
meaningfully measure I’m going to drop the word “meaningful” in there. It’s important that you’re measuring the right things. Measuring something is easy, but measuring the right thing is hard. You can’t make meaningful impacts without careful thought about what you’re measuring! You can’t change the right things if you’re not measuring the right things
your users Ultimately, this is what it’s about. Have a user-centric metric. As we said before, performance is a UX problem, so we need to measure what users are really experiencing and put some thought into that
js Fetch CSS Parse js Parse CSS Layout / paint * How does the computer see what matters to our users? * I’m going to go over the metrics themselves later; the slide deck is available for reference * The way this timeline fits together matters, so we can work out which parts are meaningful for the user
consider a page to be * But, remember, we’re thinking about this in a meaningful context. What is meaningful to our users? * There’s more to what they might consider a “page” * With SPAs and PWAs, we need to be even more nuanced about what’s going on here
page app state • An app state • Whatever your users think it is So, what’s a “page”? * We need to keep in mind what our users would think of as a page, and have measurements around that experience
time — it’s an experience that no one metric can fully capture. There are multiple moments during the load experience that can affect whether a user perceives it as “fast” or “slow” – https://w3c.github.io/paint-timing/
is no one event, or one easy answer to say when a page has loaded. It might even require knowledge of the future! Loading is a spectrum. Metrics compress this spectrum into a single number. We need to be careful about how we choose this number
focus on the wrong things, neglecting real issues * You’ll change random numbers, but users will remain dissatisfied! * Bad news is good for your health, if it’s news you need to hear
for users * We need to figure out what our users are really experiencing with our metrics * There are a number of ways that we can ensure we’re measuring this for real users
Sanity check Reflects real experiences * If it is run on real devices, that’s a good sign * We need to know what real users’ networks are like * Do a sanity check. When is your metric being triggered? Is that state actually what you think it is?
in the metric relates to a small improvement in the UX. * An example of something that isn’t this: total bundle size. If we’re code splitting and lazy loading, reducing the total may do nothing the user can feel
cheating! * Think of the people looking at this metric as a greedy optimisation algorithm * If they can cheat, they will * The shortest path to improving the metric should be the one that improves the user experience * You shouldn’t be rewarded for rendering a loading spinner very quickly, if that doesn’t result in a correspondingly good UX
machine’s job * As Agent Smith would say, “Never send a human to do a machine’s job” * Some metrics you can easily get in a repeatable way through monitoring * Others (like auditing tools) are tempting to run as one-offs * Spend the time building automatic tooling to report them. This is critical * If you rely on humans, this isn’t going to work * Historical data is your friend in arguments
metrics ⚖ Metric choices ⚒ Ways to measure Protect performance With the knowledge of what makes a good or bad metric, let’s look at some, and think about them for our use cases
First meaningful paint * You can guess my favourite (the one that has “meaningful” in the name) * First paint is when the browser first renders anything other than white * First contentful paint is when the browser renders any content (e.g. a spinner) * First meaningful paint is when the browser renders something that is meaningful to the user (e.g. data) * The first two are easy to measure (see the sketch below), but the last one is the real pot of gold, and harder to measure in a consistent way
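A minimal sketch of collecting the first two, assuming a browser that supports the Paint Timing API (Chrome did at the time); there is no built-in entry for first meaningful paint, so you have to define that moment yourself:

```js
// Collect first-paint and first-contentful-paint entries as they happen.
const paintObserver = new PerformanceObserver((list) => {
  for (const entry of list.getEntries()) {
    // entry.name is 'first-paint' or 'first-contentful-paint'
    console.log(`${entry.name}: ${Math.round(entry.startTime)}ms`);
    // In a real app you'd report this to analytics rather than log it.
  }
});
paintObserver.observe({ entryTypes: ['paint'] });

// No browser gives you 'first-meaningful-paint' for free; one option is a
// User Timing mark when your real data has actually rendered:
// performance.mark('first-meaningful-paint');
```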
page loaded over time * Speed index looks at what % of the page is visually complete over time, retroactively * We then score it based on how much of the page was delivered, and how soon * This maps really well to perceived load time * Optimising for this number will genuinely help load time
UX ✅ Reflects progressive loading ⚠ Hard to measure on real devices ⚠ Hard to understand (unitless) Integrate the % of the page loaded over time * It’s one of the best metrics for reflecting progressive loading
for initial load ✅ Easy to understand ⚠ Doesn’t deal with post initial load ⚠ Hard to measure in real browsers ⚠ Your users probably scroll Time until all content above the fold is rendered * http://abovethefold.fyi/ *
js downloads ✅ Easy to measure ✅ Easy to understand ⚠ No async data requests ⚠ May not be meaningful * This one is super common, available everywhere, and easy to implement (see the sketch below) * Risky though
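For reference, a minimal sketch of why this is the “easy, available everywhere” option, using the older Navigation Timing fields that work in essentially every browser:

```js
// Report how long DOMContentLoaded and the load event took, relative to
// navigation start. Neither number knows about data fetched asynchronously
// after load, which is the slide's caveat.
window.addEventListener('load', () => {
  const t = performance.timing; // Navigation Timing (Level 1)
  const domContentLoaded = t.domContentLoadedEventEnd - t.navigationStart;
  const load = t.loadEventStart - t.navigationStart;
  console.log(`DOMContentLoaded: ${domContentLoaded}ms, load: ${load}ms`);
});
```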
arrive in the browser? ✅ Easy to measure ✅ Measures backend problems ⚠ May not be meaningful * The Navigation Timing API can help you measure this (see the sketch below) * It mightn’t mean anything if the first byte isn’t meaningful itself
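A minimal sketch of pulling time to first byte out of the Navigation Timing API, preferring the Level 2 entry where available and falling back to the older fields:

```js
// Time to first byte in ms since navigation start.
function timeToFirstByte() {
  const [nav] = performance.getEntriesByType('navigation');
  if (nav) {
    return nav.responseStart; // Navigation Timing Level 2
  }
  const t = performance.timing; // older fallback
  return t.responseStart - t.navigationStart;
}

console.log(`TTFB: ${Math.round(timeToFirstByte())}ms`);
```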
until buttons work ✅ Measures user interactivity ✅ Suits highly interactive apps ⚠ Needs a polyfill ⚠ Less meaningful after page load * Works by detecting CPU idle time and picking a point at which the buttons on your page will probably work (see the sketch below)
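One polyfill option at the time was Google's tti-polyfill package; this is a hedged sketch of roughly how it is used (an assumption on my part that this is the polyfill meant here; check its README, it also expects a small snippet in the document head that buffers long-task entries before your bundle loads):

```js
import ttiPolyfill from 'tti-polyfill';

// Resolves with the time-to-interactive estimate in ms, or null if it
// couldn't be determined for this page load.
ttiPolyfill.getFirstConsistentlyInteractive().then((tti) => {
  if (tti !== null) {
    console.log(`Time to interactive: ${Math.round(tti)}ms`);
  }
});
```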
how long did that take? ✅ Reflects actual user pain ⚠ Depends on user input (focus on the 90th percentile) * First input delay reflects actual user problems (see the sketch below) * It will naturally depend on when the user interacts
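A minimal sketch of collecting first input delay where the browser supports the Event Timing API's 'first-input' entries; older browsers needed a polyfill to get the same number:

```js
// The delay is the gap between when the user first interacted and when the
// browser could actually start handling that event.
new PerformanceObserver((list) => {
  const [entry] = list.getEntries();
  if (entry) {
    const delay = entry.processingStart - entry.startTime;
    console.log(`First input delay: ${Math.round(delay)}ms`);
    // Aggregate at the 90th percentile (or higher) across users, as the
    // slide suggests, since it depends on when each user happens to click.
  }
}).observe({ type: 'first-input', buffered: true });
```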
js ⏳DOMContentLoaded ⏳First byte Fetch CSS ⏳Navigation timing API Parse js Parse CSS Layout / paint ⏳First contentful paint * Remember this timeline from before? Now we can put some events on it to measure the various bits!
handler Push state Load data Rerender ⏳Page reloaded * You might reload data on a button click * This is quite a different experience from the initial load, but a load nonetheless! (see the sketch below)
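A minimal sketch of treating an in-app transition as its own “load” with the User Timing API; the route, fetchBoardData and renderBoard names are hypothetical stand-ins for your own app code:

```js
async function openBoard(boardId) {
  performance.mark('board-load-start');

  history.pushState({ boardId }, '', `/board/${boardId}`);
  const data = await fetchBoardData(boardId); // hypothetical data fetch
  renderBoard(data);                          // hypothetical rerender

  performance.mark('board-load-end');
  performance.measure('board-load', 'board-load-start', 'board-load-end');
  const [measure] = performance.getEntriesByName('board-load');
  console.log(`Board "page" loaded in ${Math.round(measure.duration)}ms`);
}
```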
size First input delay ✨ First meaningful paint * If you don’t have much time, these three strike a good balance between ease of adoption and general meaningfulness. * But if you do have time, you should investigate first meaningful paint properly.
users * One of the best tools is right under your nose * You can measure just about all of the previous metrics in a debugging context * You can simulate real devices, but ultimately it isn’t real * It’s hard to make a business case from the devtools alone
we use to measure component load speeds and first meaningful paint, built into React * It also marks things using the User Timing API, and integrates with tracking tools * This is still being refined, but open to PRs and comments, please! * A rough sketch of the underlying idea is below
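This is not that library's actual API, just a hedged sketch of the underlying idea: mark when a component of interest has mounted, so the timing shows up in devtools and can be forwarded to tracking tools (BoardPage is a hypothetical component):

```js
import React from 'react';

class BoardPage extends React.Component {
  componentDidMount() {
    // performance.now() is ms since navigation start, so this is a crude
    // "component load" time for the initial render of this page.
    performance.mark('BoardPage:mounted');
    console.log(`BoardPage mounted at ${Math.round(performance.now())}ms`);
  }

  render() {
    return <div>{/* board contents */}</div>;
  }
}
```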
users * AKA the “Audits” tab in Google Chrome * Newish integration into Chrome * Does a good job of simulating real devices (e.g. network speeds and device speeds)
actual users * Remember, never send a human to do a machine’s job * We need a repeatable measurement to make an impact * Lighthouse CI can tell you at PR time – before any regressions land (see the sketch below)
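A hedged sketch of running Lighthouse from Node so the same audit can run on every PR; check the current Lighthouse docs, since the exact result shape has changed between versions:

```js
const chromeLauncher = require('chrome-launcher');
const lighthouse = require('lighthouse');

async function audit(url) {
  const chrome = await chromeLauncher.launch({ chromeFlags: ['--headless'] });
  const results = await lighthouse(url, { port: chrome.port });
  await chrome.kill();
  // In recent versions the report lives on results.lhr; the performance
  // category score is between 0 and 1.
  return results.lhr.categories.performance.score;
}

audit('https://example.com').then((score) => {
  if (score < 0.9) process.exit(1); // hypothetical threshold for the team
});
```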
users * Super-detailed performance analysis, including hard-to-measure metrics like speed index * Can analyse the video recording, measuring speed index and above-the-fold rendering easily * Simulates real users’ connections with rate limiting * You can choose locations, making it more realistic * Has an API so you can make it repeatable
Great, you’ve chosen a meaningful metric and you have the tools to measure it. Now what? * You need to fix performance issues * This is the easy bit! Plenty of info (including later today) about how to do that
everyone to agree on a limit 3. Bring it up during planning 4. Figure out how to stay in budget Performance budgets * Getting people to agree might be easier than you think. People will agree to a meaningful metric. E.g. “Our time to interactive budget will be <= 1 second” or “<= 1 MB” (assuming the page weight is linked to a better UX; a sketch of enforcing this in CI follows below) — https://speedcurve.com/blog/performance-budgets-in-action/
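A minimal sketch of enforcing the page-weight half of such a budget in CI, with hypothetical file paths and limits; the point is simply that the build fails when the agreed number is exceeded:

```js
const fs = require('fs');

const BUDGET_BYTES = 1 * 1024 * 1024; // the "<= 1 MB" budget from the example
const size = fs.statSync('dist/main.js').size; // hypothetical bundle path

if (size > BUDGET_BYTES) {
  console.error(`Bundle is ${size} bytes, over the ${BUDGET_BYTES} byte budget`);
  process.exit(1); // fail the PR build
}
console.log(`Bundle is ${size} bytes, within budget`);
```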
Task switch … 4s … Load = DOMContentLoaded * With the wrong metric, you could be reporting something very different from what real users experience. * Apdex = 1 while users silently suffer
Confluence * 0.5 + Trello * 0.3 (Note: not real numbers) This is useful for us * Each product defines their own Apdex * Share: T values, expectations, etc. * Customise: what a “meaningful” state or interaction means for your users (see the sketch below)
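A hedged sketch of the standard Apdex calculation plus the kind of weighted roll-up the slide hints at; the counts, product names and weights are purely illustrative, matching the slide's “not real numbers” caveat:

```js
// Apdex = (satisfied + tolerating / 2) / total samples.
function apdex({ satisfied, tolerating, total }) {
  return (satisfied + tolerating / 2) / total;
}

const confluenceApdex = apdex({ satisfied: 800, tolerating: 150, total: 1000 });
const trelloApdex = apdex({ satisfied: 900, tolerating: 80, total: 1000 });

// Weighted combination across products, as in "Confluence * 0.5 + Trello * 0.3".
const combined = confluenceApdex * 0.5 + trelloApdex * 0.3;
console.log({ confluenceApdex, trelloApdex, combined });
```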
internet is a vehicle for free information for all * For me, dollars aren’t inspiring. But the free internet, the open exchange of information, is literally impossible for some people unless the internet is fast.