Think like a bot,
Rank like a boss:
How Googlebot renders
Jamie Alberico // Not a Robot
SLIDESHARE.NET/JAMIEALBERICO
@JAMMER_VOLTS
Slide 2
Slide 2 text
Jamie Alberico
My name means Usurper Elf
King.
I’m a Technical SEO, Search
Advocate, & Wood Elf Druid.
Oh yeah, and I’m Not a
Robot.
#brightonSEO @Jammer_Volts
Slide 3
Slide 3 text
Masters of unlocking magic in everyday
objects, Technical SEOs are extremely
resourceful.
They see magic as a complex system
waiting to be decoded and controlled.
Proficiencies (recommended)
Chrome Developer Tools, Lighthouse,
Google Search Console, web crawlers
Technical SEOs
Class Details
Slide 4
Slide 4 text
Our Technical
SEO Quest
To protect site visibility by delivering
our content to Google’s index.
To do this, we must pass
through a powerful construct.
Slide 5
Slide 5 text
When Googlebot retrieves your pages,
it runs your code and assesses your
content to understand the layout or structure of
your site.
What is Rendering?
Slide 6
Slide 6 text
All information Google collects during the
rendering process is then used to rank the quality
and value of your site content against other sites
and what people are searching for with Google
Search.
How Google Search Works, Search Console Help Center
Rendering’s role in Rank
Slide 7
Slide 7 text
Initial HTML
(1st wave of indexing)
Rendered HTML
(2nd Wave of indexing)
Rendering
Slide 8
Slide 8 text
If Google cannot render the pages on
your site, it becomes more difficult to
understand your web content because we
are missing key visual layout information
about your web pages.
As a result, the visibility of your site
content in Google Search can suffer.
Rendering Risks
Slide 9
Slide 9 text
Until 2018, we thought our quest looked
like this
Crawl
Index
Rank
Slide 10
Slide 10 text
Now, we know that Rendering is part of the process
and that Google has two waves of indexing.
Crawl Index
Render
Rank
First Wave
Second Wave
Slide 11
Slide 11 text
If Google can’t render content, we fail our quest
Crawl Index
Slide 12
Slide 12 text
Google’s Web
Rendering Service
(WRS)
Insight Check
Slide 13
Slide 13 text
Google Web Rendering Service
Large Construct (legendary), lawful neutral
Languages HTML, CSS, JavaScript, Images
Skills Perception +12, Dexterity +10
Senses Robots.txt, Robots directives
Slide 14
Slide 14 text
Takes action using threads
Each request is made by a thread. A thread is a single
connection. It moves sequentially through each action,
one at a time, until its task is complete.
Features & Traits Actions Equipment
Slide 15
Slide 15 text
SEOs call this Crawl Budget
“Simply put, [the crawl rate limit] represents the number of
simultaneous parallel connections Googlebot may use to
crawl the site, as well as the time it has to wait between
the fetches.”
What Crawl Budget Means for Googlebot, Google Webmaster Central Blog
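The two levers in the quote (parallel connections and the wait between fetches) can be sketched roughly as below. This is only an illustration, not how Googlebot is implemented: `polite_crawl`, `max_connections`, and `delay_seconds` are made-up names, and the "fetch" is a sleep rather than a real network call.

```python
import asyncio


async def polite_crawl(urls, max_connections=2, delay_seconds=0.01):
    """Simulate fetching URLs with a cap on parallel connections and a pause between fetches."""
    semaphore = asyncio.Semaphore(max_connections)
    in_flight = 0
    peak = 0
    fetched = []

    async def fetch(url):
        nonlocal in_flight, peak
        async with semaphore:                    # at most max_connections run at once
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(delay_seconds)   # stand-in for the network round trip
            fetched.append(url)
            in_flight -= 1
            await asyncio.sleep(delay_seconds)   # wait before this connection is reused

    await asyncio.gather(*(fetch(url) for url in urls))
    return fetched, peak


fetched, peak = asyncio.run(polite_crawl([f"/page-{i}" for i in range(6)]))
print(len(fetched), "pages fetched; peak parallel connections:", peak)
```

Raising `max_connections` or lowering `delay_seconds` lets more pages through in the same time, which is the trade-off the quote describes.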
Slide 16
Slide 16 text
Stateless
● Does not retain state across page loads
● Local Storage and Session Storage data are cleared
across page loads
● HTTP Cookies are cleared across page loads
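A toy model of what statelessness means for a page that depends on an earlier visit. Everything here is hypothetical (`StatelessVisitor`, the two page functions, the storage keys); it only illustrates the bullets above, with plain dicts standing in for browser storage and cookies.

```python
class StatelessVisitor:
    """Hypothetical visitor that keeps nothing between page loads."""

    def load(self, page_script):
        # Fresh, empty stores for every load: nothing carries over.
        local_storage, session_storage, cookies = {}, {}, {}
        return page_script(local_storage, session_storage, cookies)


def landing_page(local_storage, session_storage, cookies):
    # The first page tries to remember a choice for later pages.
    local_storage["region"] = "uk"
    cookies["consent"] = "yes"
    return "landing content"


def product_page(local_storage, session_storage, cookies):
    # This page only shows content if an earlier page stored the region.
    if "region" in local_storage and "consent" in cookies:
        return "product content"
    return "empty shell"


visitor = StatelessVisitor()
first = visitor.load(landing_page)
second = visitor.load(product_page)
print(first, "/", second)
```

The second load gets the empty shell, because the data written during the first load is gone by the time the next page runs.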
Slide 17
Slide 17 text
Obedient
Obeys the HTML/HTML5 standard
Literal
“Googlebot, go to the apothecary and buy a
healing potion. If they have shields, buy 2.”
Googlebot comes back with 2 potions.
Slide 18
Slide 18 text
Politeness is priority 0
Crawling is its main priority while making sure it doesn't
degrade the experience of users visiting the site. We call
this the "crawl rate limit," which limits the maximum
fetching rate for a given site.
What Crawl Budget Means for Googlebot, Google Webmaster Central Blog
Slide 19
Slide 19 text
Multi-thread
Googlebot can execute more than one request at a time
if demand and server stability allow.
Slide 20
Slide 20 text
Request URI
Googlebot sends a request for content at a uniform
resource identifier (URI).
Googlebot can discover a URL
via link or submission
Slide 21
Slide 21 text
Read HTTP response and headers
Q. Does the thing I asked for exist?
A. HTTP Status Codes
Q. Anything I should know before looking at this?
A. Cache-Control, and Directives
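The two questions above can be sketched as a small function that reads a status code and a few headers before touching the body. `read_response` is a hypothetical helper, and the choice of headers (`Location`, `Cache-Control`, `X-Robots-Tag`) is an assumption about which ones matter most here, not a list taken from Google.

```python
from http import HTTPStatus


def read_response(status_code, headers):
    """Summarise what a status code and a few headers say before the body is read."""
    # Header names are case-insensitive, so normalise them first.
    normalised = {name.lower(): value for name, value in headers.items()}
    status = HTTPStatus(status_code)  # raises ValueError for unknown codes
    robots = [
        directive.strip().lower()
        for directive in normalised.get("x-robots-tag", "").split(",")
        if directive.strip()
    ]
    return {
        "ok": 200 <= status.value < 300,
        "status": f"{status.value} {status.phrase}",
        "redirect_to": normalised.get("location") if 300 <= status.value < 400 else None,
        "cache_control": normalised.get("cache-control"),
        "robots_directives": robots,
    }


summary = read_response(200, {"Cache-Control": "max-age=3600", "X-Robots-Tag": "noindex, nofollow"})
print(summary)
```

A 200 with `noindex` in the headers answers "yes, it exists" and "please don't index it" in the same breath, before any HTML is parsed.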
Slide 22
Slide 22 text
Parse
Download the response from the
server
Slide 23
Slide 23 text
Identify Resources
Googlebot identifies resources
needed to complete the request.
It feeds identified resources into
the crawling queue.
Use Network tab to see how many
resources a page calls
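Outside the Network tab, a rough count of the resources a page references can be pulled from its HTML. This sketch uses Python's standard `html.parser`; `ResourceFinder` and the sample markup are invented for illustration, and it only looks at scripts, stylesheets, and images in the static HTML (not resources requested later by JavaScript).

```python
from html.parser import HTMLParser


class ResourceFinder(HTMLParser):
    """Collects external scripts, stylesheets, and images referenced in an HTML string."""

    def __init__(self):
        super().__init__()
        self.resources = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("script", "img") and attrs.get("src"):
            self.resources.append((tag, attrs["src"]))
        elif tag == "link" and attrs.get("rel") == "stylesheet" and attrs.get("href"):
            self.resources.append(("stylesheet", attrs["href"]))


SAMPLE_HTML = """
<html><head>
<link rel="stylesheet" href="/css/site.css">
<script src="/js/app.js"></script>
</head><body>
<img src="/img/hero.jpg" alt="">
<script>console.log("inline, nothing to fetch")</script>
</body></html>
"""

finder = ResourceFinder()
finder.feed(SAMPLE_HTML)
for kind, url in finder.resources:
    print(kind, url)
```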
Slide 24
Slide 24 text
Cache
If the requested website implements a cache, a copy of
the data is made or requested
Slide 25
Slide 25 text
WRS, web rendering service
Googlebot queues pages for both crawling and rendering. It is not
immediately obvious when a page is waiting for crawling and when it is
waiting for rendering.
WRS is the name used to represent the collective elements involved in
Google’s rendering service. Many details are not publicly available.
Slide 26
Slide 26 text
Web Rendering Service (WRS)
Blink Browser Engine
V8 JavaScript Engine
Ignition TurboFan Liftoff Display backend
Google Magic
Chromium Headless Browser
Slide 27
Slide 27 text
WRS process
1. A URL is pulled from the crawl queue
2. Googlebot requests the URL and downloads the initial HTML
3. The Initial HTML is passed to the processing stage which extracts links
4. Links go back on the crawl queue
5. Once resources are crawled, the page queues for rendering
Slide 28
Slide 28 text
WRS process
6. When resources become available, the request moves from the render
queue to the renderer
7. Renderer passes the rendered HTML back to processing
8. Processing indexes the content
9. Extracts links from the rendered HTML to put them into the crawl
queue
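The nine steps can be walked through with two queues. This is a simplified model, not Google's implementation: the `SITE` dictionary is a made-up site, and the rule that crawling always goes before rendering is an assumption of the sketch. The point is that a link which only exists in rendered HTML is discovered a full cycle later.

```python
from collections import deque

# Hypothetical site: URL -> (links in initial HTML, links only present after rendering)
SITE = {
    "/": (["/about"], ["/products"]),
    "/about": ([], []),
    "/products": ([], ["/products/potion"]),
    "/products/potion": ([], []),
}


def crawl_and_render(start):
    crawl_queue, render_queue = deque([start]), deque()
    seen, log = {start}, []

    def enqueue(links):
        for link in links:
            if link not in seen:
                seen.add(link)
                crawl_queue.append(link)

    while crawl_queue or render_queue:
        if crawl_queue:
            url = crawl_queue.popleft()        # step 1: pull a URL from the crawl queue
            initial_links, _ = SITE[url]       # step 2: download the initial HTML
            log.append(("crawled", url))
            enqueue(initial_links)             # steps 3-4: extract links, queue them
            render_queue.append(url)           # step 5: the page waits for rendering
        else:
            url = render_queue.popleft()       # step 6: the renderer picks the page up
            _, rendered_links = SITE[url]      # step 7: rendered HTML returns to processing
            log.append(("rendered", url))      # step 8: content is indexed
            enqueue(rendered_links)            # step 9: rendered links join the crawl queue
    return log


log = crawl_and_render("/")
for event, url in log:
    print(event, url)
```

In this model `/products` is only crawled after `/` has been rendered, because its link never appears in the initial HTML.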
Slide 29
Slide 29 text
Chromium, headless browser
● Headless means that there is no GUI (visual representation)
● Used to load web pages and extract metadata
● reading from and writing to the DOM
● observing network events
● capturing screenshots
● inspecting worker scripts
● recording Chrome Traces
Slide 30
Slide 30 text
Blink, browser engine
● Allows for querying and manipulating the rendering
engine settings (ex: mobile vs. desktop)
● Blink loves service workers. Blink may create multiple
worker threads to run Web Workers, Service Workers,
and Worklets
Slide 31
Slide 31 text
Blink, browser engine
Blink, through its embedded JavaScript engine, relies on 2 major elements:
Memory heap: stores the result of script execution
(Memory Heap results are added to DOM.)
Call stack: queue of sequential next steps
(Each entry in the call stack is called a Stack Frame.)
Slide 32
Slide 32 text
Blink, browser engine
Local storage and Session storage are key-value stores
that hold string data in the browser (JS objects must be
serialized first, for example as JSON)
These keys are a weak point in your rendering offense
against a stateless Googlebot.
Slide 33
Slide 33 text
V8, JavaScript engine
JavaScript is a single-threaded process and each entry or
execution step is a stack frame.
Googlebot can opt to run simultaneous parallel
connections.
Slide 34
Slide 34 text
V8, JavaScript engine
Each thread runs through a process of:
1. Loading
2. Parsing
3. Compiling
4. Executing
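The same four stages exist in most language runtimes. As an analogy only (this is Python's own pipeline, not V8's, and `run_script` is a made-up helper), the stages can be made visible like this:

```python
import ast


def run_script(source_text):
    # 1. Loading: here the source is already in memory as a string.
    # 2. Parsing: turn the text into an abstract syntax tree.
    tree = ast.parse(source_text, filename="<sketch>")
    # 3. Compiling: turn the tree into bytecode.
    code = compile(tree, filename="<sketch>", mode="exec")
    # 4. Executing: run the bytecode in a fresh namespace.
    namespace = {}
    exec(code, namespace)
    return namespace


result = run_script("potions = 1 + 1")
print(result["potions"])
```

Every script a page ships has to pass through all four stages before its output can reach the DOM, which is why script weight shows up as rendering cost.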
Slide 35
Slide 35 text
V8, JavaScript engine
● open-source JavaScript engine and WebAssembly
engine
● developed by Google & The Chromium Project
● Used in Node.js, Google Chrome, and Chromium web
browsers
Slide 36
Slide 36 text
V8’s components
● Ignition, a fast low-level register-based JavaScript
interpreter written using the backend of TurboFan
● TurboFan, one of V8’s optimizing compilers
● Liftoff, a new baseline compiler for WebAssembly
Slide 37
Slide 37 text
Optimized Rendering
Roll with Advantage
Slide 38
Slide 38 text
Parse content critical to
user intent in initial HTML
Slide 39
Slide 39 text
Crawl
Index
Render
HTML
DOM
1st wave of
indexing
2nd wave of
indexing
Slide 40
Slide 40 text
Critical = why the user came
Slide 41
Slide 41 text
Define it for your site, by template
Slide 42
Slide 42 text
Use clean, consistent signals
Googlebot won’t see past a noindex directive in the initial HTML
to find an index directive placed in the DOM.
Duplicate content without a canonical in the initial HTML is
crawl waste until rendering.
Inconsistent title tags and descriptions can result from
overwriting the initial HTML with rendered HTML.
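One way to check for mixed signals is to read the same three tags from the initial HTML and the rendered HTML and list the differences. A minimal sketch using the standard `html.parser`; `SignalReader`, `compare_signals`, the sample markup, and the example.com URL are all hypothetical, and getting the rendered HTML (for example from a headless browser) is outside the sketch.

```python
from html.parser import HTMLParser


class SignalReader(HTMLParser):
    """Collects the title, meta robots, and canonical from an HTML string."""

    def __init__(self):
        super().__init__()
        self.signals = {"title": "", "robots": None, "canonical": None}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            self.signals["robots"] = (attrs.get("content") or "").lower()
        elif tag == "link" and (attrs.get("rel") or "").lower() == "canonical":
            self.signals["canonical"] = attrs.get("href")

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.signals["title"] += data.strip()


def read_signals(html):
    reader = SignalReader()
    reader.feed(html)
    return reader.signals


def compare_signals(initial_html, rendered_html):
    """Return only the signals that differ, as (initial, rendered) pairs."""
    before, after = read_signals(initial_html), read_signals(rendered_html)
    return {key: (before[key], after[key]) for key in before if before[key] != after[key]}


INITIAL = '<html><head><title>Potions</title><meta name="robots" content="noindex"></head><body></body></html>'
RENDERED = (
    '<html><head><title>Healing Potions | Apothecary</title>'
    '<meta name="robots" content="index, follow">'
    '<link rel="canonical" href="https://example.com/potions"></head><body></body></html>'
)

differences = compare_signals(INITIAL, RENDERED)
for signal, (before, after) in differences.items():
    print(f"{signal}: initial={before!r} rendered={after!r}")
```

An empty result means both versions send the same signals; anything listed is a candidate for moving into the initial HTML.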
Slide 43
Slide 43 text
Focus rendering efforts with nofollow
If a resource is not necessary or beneficial to the
construction of the page, add a nofollow directive to it.
Slide 44
Slide 44 text
Mobile vs Desktop Rendering
Layout matters for both.
If you want to rank for
position zero, remember that
the content must be exposed
on initial mobile load.
Slide 45
Slide 45 text
Choose the rendering strategy that’s
right for your business and stack.
You don’t have to be 100% client-side, 100% server-side, or
100% both (dynamic).
Load what matters when it matters.
Rendering & Performance
[Diagram labels: DOM; HTML; Style Sheets; HTML Parser; CSS Parser; DOM Tree; Style Rules; Render Tree; Attachment; Layout; Painting; Display; TTFB; TTI]
Slide 50
Slide 50 text
Slide 51
Slide 51 text
More page resources require more
rendering resources
Each resource must be fetched independently before the
page can be accurately rendered.
This is a major part of the issue with client-side rendering.
More client-side calls mean more blind spots for you.
Slide 52
Slide 52 text
Excessive scripts
run the risk of
hitting thread/rest
thresholds.
This is most often
observed as “Other
error”.
Slide 53
Slide 53 text
Call Stacks have a maximum size
While the Call Stack has functions to execute, the browser
can’t actually do anything else — it’s getting blocked.
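The size limit is easy to hit with runaway recursion. The sketch below shows it in Python as an analogy (it is Python's call stack, not V8's, and `descend` and `stack_overflows` are made-up names); in JavaScript the equivalent failure surfaces as a RangeError reporting that the maximum call stack size was exceeded.

```python
def descend(depth=0):
    # Each call adds a stack frame; nothing ever returns, so frames pile up.
    return descend(depth + 1)


def stack_overflows():
    try:
        descend()
    except RecursionError:
        # The runtime refuses to grow the stack any further.
        return True
    return False


print("hit the call stack limit:", stack_overflows())
```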
Slide 54
Slide 54 text
Session and Local web storage limits
5MB per object, and 50MB per system
If your CSR resources are too large, you risk hitting the upper
limit. Elements in queue once the limit is reached may not be
considered by Googlebot.
Slide 55
Slide 55 text
Load scripts & images without blocking
Asynchronous calls are supported with async attributes
Lazy load images in Chrome with native attributes
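A quick audit for both points: external scripts that carry neither `async` nor `defer`, and images without `loading="lazy"`. This is a rough sketch with invented names (`BlockingAudit`, the sample markup); it ignores cases such as `type="module"` scripts, and an above-the-fold image flagged here may be perfectly fine to load eagerly.

```python
from html.parser import HTMLParser


class BlockingAudit(HTMLParser):
    """Flags external scripts without async/defer and images without loading="lazy"."""

    def __init__(self):
        super().__init__()
        self.blocking_scripts = []
        self.eager_images = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "script" and attrs.get("src"):
            # Boolean attributes such as async arrive with a value of None,
            # so test for the key rather than the value.
            if "async" not in attrs and "defer" not in attrs:
                self.blocking_scripts.append(attrs["src"])
        elif tag == "img" and attrs.get("src"):
            if (attrs.get("loading") or "").lower() != "lazy":
                self.eager_images.append(attrs["src"])


SAMPLE_HTML = """
<script src="/js/analytics.js" async></script>
<script src="/js/app.js"></script>
<img src="/img/hero.jpg" alt="">
<img src="/img/footer.jpg" alt="" loading="lazy">
"""

audit = BlockingAudit()
audit.feed(SAMPLE_HTML)
print("scripts without async/defer:", audit.blocking_scripts)
print("images without loading=lazy:", audit.eager_images)
```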
Slide 56
Slide 56 text
Broken Structured Data Markup
Slide 57
Slide 57 text
Don’t trust document.write( )
Dynamic code (such as script elements containing
document.write() calls) can add extra tokens, so the parsing
process actually modifies the input.
Slide 58
Slide 58 text
Render Testing
Check for traps
Slide 59
Slide 59 text
Test local or firewalled with tunneling
SimpleHTTPServer (http.server in Python 3) is a simple HTTP
request handler for QA
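A minimal sketch of serving a local build with Python 3's `http.server`, assuming the files to test sit in the current directory. `serve_directory` and `check_homepage` are made-up helper names; the server binds to 127.0.0.1 only, so nothing is public until a tunnel is pointed at its port.

```python
import threading
import urllib.request
from functools import partial
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


def serve_directory(directory=".", port=0):
    """Start a local static file server in a background thread and return it."""
    handler = partial(SimpleHTTPRequestHandler, directory=directory)
    server = ThreadingHTTPServer(("127.0.0.1", port), handler)  # port 0 = any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def check_homepage(server):
    """Request the root URL from the local server and return the HTTP status."""
    host, port = server.server_address
    with urllib.request.urlopen(f"http://{host}:{port}/") as response:
        return response.status


server = serve_directory(".")
print("serving on port", server.server_address[1], "status:", check_homepage(server))
server.shutdown()
server.server_close()
```

For a quick manual check, `python -m http.server 8000` from the command line does the same job on a fixed port.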
Slide 60
Slide 60 text
Test local or firewalled with tunneling
ngrok exposes that page on a publicly accessible URL
Slide 61
Slide 61 text
Slide 62
Slide 62 text
Slide 63
Slide 63 text
Slide 64
Slide 64 text
Make Allies
Slide 65
Slide 65 text
| ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄|
DON'T BE AFRAID
TO LEARN IN PUBLIC
|___________|
(\__/) ||
(•ㅅ•) ||
/ づ
Slide 66
Slide 66 text
Resources
● Get started with Chrome Developer
Tools
● HTML/HTML5 Parsing Standards
● Debugging your pages
● SimpleHTTPServer
● Ngrok
● Fix Search-related JavaScript
problems
● TurboFan overview
● Liftoff overview
● Tame the Bots Portals
● Blink Rendering, life of a pixel
● The Rendering Critical Path
● JavaScript Sites in Search Working Group