Slide 1

Slide 1 text

The Surprising Path to a Faster NYTimes.com
Eitan Konigsburg
Frontend Software Architect
@eitanmk

Slide 2

Slide 2 text

Technology company?
(Photo: Tony Cenicola/The New York Times)

Slide 3

Slide 3 text

Performance Golden Rule
“80-90% of the end-user response time is spent on the frontend. Start there.” - Steve Souders

Slide 4

Slide 4 text

It wasn’t always this way. Before the WPO Enlightenment, where was the performance focus?

Slide 5

Slide 5 text

The Backend.

Slide 6

Slide 6 text

NYTimes.com went online in 1996

Slide 7

Slide 7 text

Back then...
● Static pages
  ○ Less processing server-side meant faster responses
  ○ Very limited caching
  ○ Needed a lot of servers to scale
  ○ Handled traffic spikes caused by news events

Slide 8

Slide 8 text

Back then...
● Server-side dynamism via a proprietary language called “Context”
  ○ HTML macros
  ○ Compiled to execute quickly
  ○ Execution delegated to special server software
  ○ Context server maintained “always on” database connections
  ○ It was fast

Slide 9

Slide 9 text

Back then...
● Ultimately switched to PHP
  ○ Could never match Context for live execution speed
  ○ Relegated to server-side templating
  ○ Context remained to provide dynamic features

Slide 10

Slide 10 text

We created a monster

Slide 11

Slide 11 text

We created a monster
● We now have over 1 million static pages on disk
  ○ The only way to change them is to republish them
  ○ Would take ~90 days to republish all of them
  ○ Not feasible to do this for every code change
  ○ So we didn’t republish at all

Slide 12

Slide 12 text

It gets worse...

Slide 13

Slide 13 text

Permutations
● New variations of page markup were created with each deployment
  ○ Older pages were not updated
  ○ Extrapolate over 10 years of development
  ○ Even the smallest code tweak created a lot of technical debt
● We have an unknown number of permutations
  ○ Maintenance nightmare!

Slide 14

Slide 14 text

WPO? We tried…
● Our unknown number of page permutations made optimizing the frontend almost impossible
  ○ Combined files, but couldn’t remove references to the originals from older pages
  ○ Download sizes increased as we had to support older permutations
  ○ Needed low cache TTLs to ensure files got updated in a timely manner
● Made things worse, not better

Slide 15

Slide 15 text

Surprise #1
A lot of static pages are a barrier to WPO

Slide 16

Slide 16 text

Isn’t that obvious?
● Not mentioned often in performance discussions
  ○ Who would have 1M static pages?
● Didn’t become a barrier right away
  ○ Not every page we have is intended to live forever
  ○ Recent articles receive more traffic than older ones
● No practical alternative available
  ○ Couldn’t avoid the static page problem

Slide 17

Slide 17 text

Shift + Refresh

Slide 18

Slide 18 text

2013: A Rare Opportunity
● Given a product pause to fix major infrastructural debt
  ○ This doesn’t happen very often (if at all)
● Nothing was safe:
  ○ Infrastructure
  ○ Server configuration
  ○ Code organization
  ○ Development process
  ○ Build process
  ○ Deployment process

Slide 19

Slide 19 text

Going dynamic
● Slowly migrating all content into databases
● Ongoing process
  ○ Some things will be dynamic sooner than others
● Use Varnish cache to hide backend slowness
  ○ We get great TTFB times with Varnish
  ○ Simple caching strategy
  ○ All per-user customizations pushed client-side

Slide 20

Slide 20 text

Surprise #2
Performance increase mandated as part of redesign

Slide 21

Slide 21 text

Supported from the top
● First project to have a performance goal as part of its definition of success
● Why all of a sudden?
  ○ Performance became a factor in SEO
  ○ NYT has become an e-commerce site since the last redesign
● Conservative goal
  ○ A certain percentage faster than the old site
  ○ Attainable, and wouldn’t put shipping the product at risk

Slide 22

Slide 22 text

Where to begin?

Slide 23

Slide 23 text

Conflicting advice
● Basic rules sometimes contradict each other
  ○ Reduce the number of HTTP requests, but not all the way, so you can take advantage of parallel downloads
  ○ Combine files, but with HTTP/2 arriving soon, doing so will be less than optimal
  ○ Set long cache TTLs, but remove unused code
● “Tools, not rules”
  ○ Create the product, then inspect, then optimize
  ○ Need to have a product first

Slide 24

Slide 24 text

Fundamental overhaul
● RequireJS for non-blocking and asynchronous loading
  ○ One blocking script needed in the head
● Modern build system to concatenate and minify files
  ○ Automatic sprite atlas creation
● New caching strategy
  ○ Timestamp in the URL
  ○ Inserted dynamically by the backend application
  ○ New file downloaded after code push
  ○ Far-future cache TTL
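The caching strategy on this slide can be sketched roughly as follows. This is a minimal illustration, not the NYTimes.com implementation: the `/static/<timestamp>/` URL layout, the `versionedAsset` helper, and the one-year TTL are all assumptions made for the example.

```typescript
// Hypothetical sketch: the URL layout and names are assumptions, not NYT code.
interface VersionedAsset {
  url: string;
  headers: Record<string, string>;
}

// Assumed "far-future" TTL of one year, expressed in seconds.
const FAR_FUTURE_SECONDS = 365 * 24 * 60 * 60;

// The backend inserts the timestamp of the latest code push into the URL.
// A new push changes the URL, so browsers download the new file even though
// the previous URL is still cached with a very long TTL.
function versionedAsset(path: string, pushTimestamp: number): VersionedAsset {
  const cleanPath = path.replace(/^\/+/, "");
  return {
    url: `/static/${pushTimestamp}/${cleanPath}`,
    headers: {
      "Cache-Control": `public, max-age=${FAR_FUTURE_SECONDS}`,
    },
  };
}
```

Because the timestamp only changes on a code push, every page rendered between pushes points at the same cacheable URL, which is what makes the long TTL safe.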

Slide 25

Slide 25 text

Lazy loaded ads
● Worthwhile investment
● Before the redesign, ads were part of the markup and would block rendering
  ○ Often the slowest part of our page load
● Injected ads into iframes, which worked surprisingly well
● Minor issue: differences in browsers regarding how to track history in embedded frames
● Blogged about this: http://nyti.ms/1qgoJ0b
● Allowed us to hit DOMReady really quickly

Slide 26

Slide 26 text

Surprise #3
Sometimes you have to slow down to seem faster

Slide 27

Slide 27 text

Don’t move the things
● Want to avoid moving content around based on the ads
● Preferable to show an ad above the fold, if present
● If no ad, want to use the space for other content
● Bonus: use NYT web fonts to make the headline look like the newspaper
  ○ Browsers differ on how to handle text while a font downloads - FOUT
● If we were fully asynchronous, the content could shift around during load, which looks and feels slow

Slide 28

Slide 28 text

Variant #1

Slide 29

Slide 29 text

Variant #2

Slide 30

Slide 30 text

Initial solution
1. Request the ads synchronously
2. Inspect the response to see if there is an ad to draw
3. Add a class to the HTML to determine which layout to show
4. Continue loading the page
● Requires an intentionally (!) blocking script to achieve this
● While it slowed down the page, things didn’t shift around
● The perception mattered more than the numbers

Slide 31

Slide 31 text

Unexpected Surprise
Discovered another solution

Slide 32

Slide 32 text

Even better solution
● Some of the time, make room for an ad
  ○ Otherwise, use full width for content
● Allows for both goals:
  ○ Prevent content shifting
  ○ Maximize ad placement
● Ad request can be asynchronous again!
● Drawback: sometimes space is made for an ad that is never served
  ○ Tweak the numbers to minimize this
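One way to read "some of the time" is as a random draw against a tunable probability, made before the ad response is known. The slides do not say how the decision is actually made, so the sketch below is an assumption throughout: `chooseLayout`, `expectedEmptySlotRate`, the layout names, and the independence assumption in the estimate are all hypothetical.

```typescript
// Hypothetical sketch: the slides do not describe the real decision logic.
type Layout = "with-ad-slot" | "full-width";

// Decide the layout up front, without waiting for the ad response, so the
// ad request can stay asynchronous and the content never shifts afterwards.
// `random` is injectable to keep the function deterministic in tests.
function chooseLayout(
  reserveProbability: number,
  random: () => number = Math.random
): Layout {
  return random() < reserveProbability ? "with-ad-slot" : "full-width";
}

// Rough estimate of the drawback on the slide: the share of page views that
// reserve space for an ad that is never served. Assumes the layout decision
// is independent of whether an ad is available (fillRate is the assumed
// fraction of requests that return an ad).
function expectedEmptySlotRate(reserveProbability: number, fillRate: number): number {
  return reserveProbability * (1 - fillRate);
}
```

Under these assumptions, "tweaking the numbers" means adjusting `reserveProbability` until the empty-slot rate is acceptable without giving up too many ad placements.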

Slide 33

Slide 33 text

Profit

Slide 34

Slide 34 text

New possibilities
Things we’re exploring

Slide 35

Slide 35 text

Measure and improve
● Smoother animations
  ○ Scroll event listeners causing jank
  ○ Exploring patterns involving requestAnimationFrame
  ○ Promoting some of our fixed position elements to the GPU
  ○ Optimizing code paths to achieve better FPS
  ○ Challenge: requires refactoring large portions of the codebase
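A common pattern for the scroll-listener jank mentioned above is to coalesce events so the expensive work runs at most once per frame. The sketch below is one such pattern, not necessarily the one the team adopted; `frameThrottled` and `FrameScheduler` are hypothetical names, and the scheduler is passed in as a parameter so the helper does not depend on browser globals.

```typescript
// Hypothetical sketch of a requestAnimationFrame-based throttle.
type FrameScheduler = (callback: () => void) => unknown;

// Wraps `handler` so that any number of calls between two frames result in a
// single invocation, using the most recent value that was passed in.
function frameThrottled<T>(
  handler: (latest: T) => void,
  schedule: FrameScheduler
): (value: T) => void {
  let pending = false;
  let latest: T;
  return (value: T) => {
    latest = value; // always remember the newest value
    if (pending) return; // a frame callback is already queued
    pending = true;
    schedule(() => {
      pending = false;
      handler(latest);
    });
  };
}

// Assumed browser usage (updateStickyHeader is a made-up function):
//   const onScroll = frameThrottled<number>(
//     (y) => updateStickyHeader(y),
//     (cb) => requestAnimationFrame(cb)
//   );
//   window.addEventListener("scroll", () => onScroll(window.scrollY));
```

The scroll listener itself stays cheap because it only records a value; layout reads and writes happen inside the frame callback.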

Slide 36

Slide 36 text

Measure and improve
● Memory management
  ○ We have to load a lot of content
  ○ Approaching memory limits on some mobile devices
  ○ There are a lot of candidates for things retaining too much memory
  ○ Challenge: tools are complex but getting better
  ○ Challenge: sometimes the memory issue is difficult to track down

Slide 37

Slide 37 text

Measure and improve ● Above-the-fold, critical path optimizations ○ Inlining of critical CSS ○ Lazy loading of other resources ○ Keeping 3rd-party code out of ○ Challenge: editorial desire to put nice interactive material above the fold ○ Challenge: A/B testing support

Slide 38

Slide 38 text

Other things we’re following ● HTTPS ○ Interested in using HTTPS by default ○ Huge hurdles to overcome to make this a reality ○ All that static content is still holding us back ● ○ Will be a huge help to our responsive site ● Web Components ○ Following development on this closely ○ Interested in seeing whether this can be used for ads ● ServiceWorker (pretty please???) ○ Better offline access

Slide 39

Slide 39 text

Developer portal: http://developers.nytimes.com/
Open Blog: http://open.blogs.nytimes.com/
Twitter: @nytdevs
Other NYTDevs slides: https://speakerdeck.com/nytdevs/

Slide 40

Slide 40 text

Thanks!
@eitanmk