(Almost) Everything You Need To Know About Crawling, Indexing, and Especially Rendering in Google

Slides from my talk at Friends of Search 2022 in Amsterdam and Brussels where I spoke about crawling, indexing, and rendering in Google's search ecosystem.

Barry Adams

June 14, 2022

Transcript

  1. @badams #FOS22 (Almost) Everything You Need To Know About Crawling, Indexing, and Rendering in Google. Barry Adams, June 2022
  2. What does Google do?
  3. Google Processes: Crawler, Indexer, Ranker
  4. Crawler, Indexer, Ranker: 1. Crawler (Googlebot)
  5. Crawling: Discovery
  6. Crawling: Queue Management (URL Deduplication)
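
Slide 6 names URL deduplication as part of crawl queue management. A minimal sketch of one common way to do this is to normalise each URL and keep only the first URL per normalised form. The rules below (lowercase host, drop default ports and fragments, sort query parameters, ignore `utm_` parameters) and the names `normalise_url` and `deduplicate` are assumptions for illustration. They are not a description of what Googlebot actually does.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical list of tracking-parameter prefixes to ignore; adjust per site.
TRACKING_PREFIXES = ("utm_",)
DEFAULT_PORTS = {"http": 80, "https": 443}


def normalise_url(url: str) -> str:
    """Return a normalised form of `url` to use as a duplicate-detection key."""
    parts = urlsplit(url.strip())
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    # Keep the port only when it is not the default for the scheme.
    if parts.port and parts.port != DEFAULT_PORTS.get(scheme):
        host = f"{host}:{parts.port}"
    # Drop tracking parameters and sort the rest so parameter order is irrelevant.
    query = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.lower().startswith(TRACKING_PREFIXES)
    )
    path = parts.path or "/"
    # The fragment is never sent to the server, so it is discarded.
    return urlunsplit((scheme, host, path, urlencode(query), ""))


def deduplicate(urls):
    """Yield each URL whose normalised form has not been seen before."""
    seen = set()
    for url in urls:
        key = normalise_url(url)
        if key not in seen:
            seen.add(key)
            yield url
```

With these rules, `https://example.com/a?x=1&y=2` and `https://EXAMPLE.com/a?y=2&x=1#frag` collapse into one queue entry.
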

  7. Crawling: Queue Management (Prioritisation & Scheduling)
  8. Crawling: Fetch & Parse
  9. Crawl Politeness
  10. Optimise Crawling
    • Server Response Time
  11. GSC Crawl Stats
  12. Page Resource Load
  13. Googlebot & AdsBot
  14. Optimise Crawling
    • Serve correct HTTP status codes:
      ➢ 200 OK
      ➢ 301 / 302 Redirects
      ➢ 304 Not Modified
      ➢ 401 / 403 Permission Issues
      ➢ 404 / 410 Not Found/Gone
      ➢ 5xx Error
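
The status-code groups on slide 14 can be used to audit a crawl log. The sketch below assumes the log is available as `(url, status)` pairs. The function names `classify_status` and `summarise` are made up for this example, and codes not listed on the slide fall into an "Other" group.

```python
from collections import Counter


def classify_status(status: int) -> str:
    """Map an HTTP status code to the groups named on slide 14."""
    if status == 200:
        return "200 OK"
    if status in (301, 302):
        return "301 / 302 Redirects"
    if status == 304:
        return "304 Not Modified"
    if status in (401, 403):
        return "401 / 403 Permission Issues"
    if status in (404, 410):
        return "404 / 410 Not Found/Gone"
    if 500 <= status <= 599:
        return "5xx Error"
    return "Other"  # anything the slide does not mention


def summarise(crawl_log):
    """Count responses per group from an iterable of (url, status) pairs."""
    return dict(Counter(classify_status(status) for _url, status in crawl_log))
```

A large share of redirects, permission issues, or 5xx errors in the summary is a hint that crawl requests are being spent on URLs that return no indexable content.
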
  15. Optimise Crawling
    • ALL resources consume crawl budget:
      ➢ Not just HTML pages
      ➢ Reduce HTTP requests per page
  16. Optimise Crawling
    • ALL resources consume crawl budget:
      ➢ Not just HTML pages
      ➢ Reduce HTTP requests per page
    • AdsBot can consume crawl budget:
      ➢ Double-check your Google Ads campaigns
  17. Optimise Crawling
    • ALL resources consume crawl budget:
      ➢ Not just HTML pages
      ➢ Reduce HTTP requests per page
    • AdsBot can consume crawl budget:
      ➢ Double-check your Google Ads campaigns
    • Link equity (PageRank) impacts crawl budget:
      ➢ More link equity = more crawl budget
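
To act on "reduce HTTP requests per page", it helps to know how many requests a page's HTML asks for. The following is a rough sketch that counts scripts, stylesheets, images, and iframes referenced in an HTML string. It only sees resources declared in the markup, so requests triggered later by JavaScript or CSS (fonts, API calls) are not counted. The class and function names are assumptions for this example.

```python
from collections import Counter
from html.parser import HTMLParser


class ResourceCounter(HTMLParser):
    """Count tags that usually trigger an additional HTTP request."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("script", "img", "iframe") and attrs.get("src"):
            self.counts[tag] += 1
        elif tag == "link" and attrs.get("href"):
            # Only stylesheet links are counted; icons, canonicals etc. are ignored.
            rel = (attrs.get("rel") or "").lower().split()
            if "stylesheet" in rel:
                self.counts["stylesheet"] += 1


def count_resources(html: str) -> Counter:
    """Return a Counter of resource references found in `html`."""
    parser = ResourceCounter()
    parser.feed(html)
    parser.close()
    return parser.counts
```

Running `count_resources` over a sample of templates gives a quick per-template request count to compare before and after optimisation.
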
  18. Crawler, Indexer, Ranker: 2. Indexer
  19. Two Stages* of Indexing
    Diagram labels: Crawler, Indexer, Ranker, 1, 2
    *At least: indexing is a collection of interconnected processes
  20. Indexing: HTML Lexer & Tokenizer
  21. Indexing: Selection
  22. Indexing: HTML Source
  23. Indexing: Rendering
  24. Indexing: Index Integrity (Deduplication & Canonicalisation)
  25. Rendering
  26. Evergreen Chrome

  27. What happens during Rendering in your Browser?
    Diagram labels: HTML, CSS, HTML Parser, CSS Parser, DOM Tree, CSSOM, Render Tree, Layout, Painting, Display
  28. JavaScript
    Diagram labels: as slide 27, plus JavaScript (×1)
  29. JavaScript…
    Diagram labels: as slide 27, plus JavaScript (×2)
  30. JavaScript…
    Diagram labels: as slide 27, plus JavaScript (×3)
  31. JavaScript…
    Diagram labels: as slide 27, plus JavaScript (×4)
  32. Google’s Rendering as part of Indexing
    Diagram labels: as slide 27, plus JavaScript (×3)
  33. Google does not perform actions
  34. Why Rendering?
  35. Raw HTML:
  36. Rendered DOM:
  37. Rendering allows Google to…
    • … load all meta data, content, and links on a webpage
    • … understand the page’s layout and content hierarchy
    • … evaluate the usability and quality of the webpage
  38. Rendering Issues
  39. Possible Rendering Issues in GSC
  40. Rendering Issues
    • Inaccessible Resources:
      ➢ Make sure all page resources can be crawled
  41. Rendering Issues
    • JavaScript inserts invalid HTML in the <head>:
      ➢ <body> tags in the <head> break Google’s processing of meta tags
  42. Rendering Issues
    • JavaScript inserts invalid HTML in the <head>:
      ➢ <body> tags in the <head> break Google’s processing of meta tags
  43. https://developers.google.com/search/docs/advanced/guidelines/valid-html
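
The invalid-`<head>` issue from slides 41 and 42 can be checked for in a saved copy of the rendered DOM. The sketch below flags any element inside `<head>` that is not one of the metadata elements the HTML standard allows there. The allow-list is an assumption to verify against the page referenced on slide 43, and `find_invalid_head_tags` is a name made up for this example.

```python
from html.parser import HTMLParser

# Metadata elements the HTML standard permits inside <head> (assumed allow-list).
VALID_HEAD_TAGS = {"title", "meta", "link", "script", "style", "base", "noscript", "template"}


class HeadChecker(HTMLParser):
    """Record every start tag inside <head> that is not a metadata element."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.invalid = []

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif self.in_head and tag not in VALID_HEAD_TAGS:
            self.invalid.append(tag)

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def find_invalid_head_tags(html: str) -> list:
    """Return the names of body-type tags found inside <head>, in document order."""
    checker = HeadChecker()
    checker.feed(html)
    checker.close()
    return checker.invalid
```

Any tag this returns is worth tracing back to the script that inserted it, because meta tags placed after it in the `<head>` may not be processed.
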

  44. Rendering Issues
    • HTML vs Render mismatch:
      ➢ Different content in raw HTML vs fully rendered page
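
An HTML vs render mismatch can be spotted by extracting the same signals from both versions of a page and comparing them. The sketch below assumes you already have two strings: the raw HTML as fetched, and the rendered DOM saved from a browser or rendering crawler. Obtaining the rendered version is outside this example. The choice of signals (title, robots meta, canonical, link targets) and the names `extract` and `diff_signals` are assumptions.

```python
from html.parser import HTMLParser


class SeoSignals(HTMLParser):
    """Collect the title, robots meta, canonical, and link targets of a page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.robots = None
        self.canonical = None
        self.links = set()
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            self.robots = attrs.get("content")
        elif tag == "link" and "canonical" in (attrs.get("rel") or "").lower().split():
            self.canonical = attrs.get("href")
        elif tag == "a" and attrs.get("href"):
            self.links.add(attrs["href"])

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract(html: str) -> dict:
    """Parse `html` and return the collected signals as a dict."""
    parser = SeoSignals()
    parser.feed(html)
    parser.close()
    return {
        "title": parser.title.strip(),
        "robots": parser.robots,
        "canonical": parser.canonical,
        "links": parser.links,
    }


def diff_signals(raw_html: str, rendered_html: str) -> dict:
    """Return {signal: (raw value, rendered value)} for every signal that differs."""
    raw, rendered = extract(raw_html), extract(rendered_html)
    return {key: (raw[key], rendered[key]) for key in raw if raw[key] != rendered[key]}
```

An empty result means the compared signals match. A changed robots or canonical value between raw and rendered is usually the first thing to investigate.
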
  45. https://chrome.google.com/webstore/detail/view-rendered-source/ejgngohbdedoabanmclafpkoogegdpob
  46. SEO Crawlers Can Also Render
  47. Google Tools *ALWAYS* Render
  48. Optimise Rendering
    • Don’t rely on Google’s rendering:
      ➢ Use SSR & CDN caching
    • Minimise page weight:
      ➢ Fewer page resources = better use of crawl budget, faster load speed & CWV, less chance of rendering issues
    • Optimise your HTML source:
      ➢ Think about where <script> tags exist and what they do when their code is executed
  49. Optimise Indexing
    • Optimise your page layouts:
      ➢ Prominent content & links are more valuable for users & Google
    • Improve internal linking:
      ➢ More PageRank = higher chance of indexing
    • Improve your content:
      ➢ Google has no obligation to index all your pages
      ➢ Make it worth Google’s while…
  50. Bypassing Rendering* with Edge SEO (*sort of)
  51. Edge SEO
    Diagram labels: Your Webserver, Cloud CDNs, Users
  52. Edge SEO
    Diagram labels: Your Webserver, Cloud CDNs, Googlebot
  53. Edge SEO
    Diagram labels: Your Webserver, Cloud CDNs, Googlebot, Change your webpages here
  54. Edge SEO
    • CDNs store cached versions of your webpages:
      ➢ Global coverage with edge nodes worldwide
      ➢ Usually also results in faster crawling and better CWV
    • You manipulate your CDN cached pages:
      ➢ Cloud Workers enable a range of functionality
    • Googlebot crawls & indexes the changed CDN-cached pages:
      ➢ Your ‘original’ website remains unchanged
      ➢ Google only sees the changed CDN webpages
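
The idea on slide 54, changing the CDN-cached page while the origin stays untouched, comes down to a function that takes the cached HTML and returns the HTML to serve. Real edge workers usually run JavaScript on the CDN's own platform, so the Python below only sketches the transformation logic and does not show any vendor's API. The override table, the path `/pricing`, and the name `transform_at_edge` are all hypothetical.

```python
import html
import re

# Matches the first <title>...</title> element, across line breaks.
TITLE_RE = re.compile(r"<title\b[^>]*>.*?</title>", re.IGNORECASE | re.DOTALL)

# Hypothetical per-path overrides; in practice these would live in the worker's config.
TITLE_OVERRIDES = {"/pricing": "Pricing & Plans | Example Shop"}


def transform_at_edge(path: str, cached_html: str) -> str:
    """Return the HTML the edge node would serve for `path`."""
    new_title = TITLE_OVERRIDES.get(path)
    if new_title is None:
        return cached_html  # no override: pass the cached page through unchanged
    replacement = f"<title>{html.escape(new_title)}</title>"
    # A function replacement avoids backslash and group interpretation in the new title.
    return TITLE_RE.sub(lambda _match: replacement, cached_html, count=1)
```

Because the change is applied to the response rather than to the CMS, removing the override restores the original page on the next request.
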
  55. Why Edge SEO?
    • Faster deployment:
      ➢ Bypass your developers’ lengthy queues
      ➢ ‘Ask forgiveness, not permission’
      ➢ No reliance on client-side JavaScript
    • No CMS constraints:
      ➢ Change pages directly regardless of your CMS capabilities
    • Testing:
      ➢ Perform narrow tests on specific site sections
      ➢ A/B testing for SEO
  56. SEO A/B Split Testing
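
SEO split tests are typically run by splitting pages, not visitors, into a control group and a variant group, then applying the change only to the variant pages. A minimal sketch of a deterministic split, assuming pages are identified by URL and that hashing the URL is an acceptable way to assign groups. The experiment name `title-test` and the function names are made up for this example.

```python
import hashlib


def assign_bucket(url: str, experiment: str = "title-test") -> str:
    """Deterministically assign a page URL to 'control' or 'variant'."""
    # Including the experiment name gives each experiment its own split.
    digest = hashlib.sha256(f"{experiment}:{url}".encode("utf-8")).digest()
    return "variant" if digest[0] % 2 else "control"


def split_pages(urls, experiment: str = "title-test") -> dict:
    """Group URLs into control and variant lists for one experiment."""
    groups = {"control": [], "variant": []}
    for url in urls:
        groups[assign_bucket(url, experiment)].append(url)
    return groups
```

Hashing keeps the assignment stable across requests without storing state, so the same page always receives the same treatment for the duration of a test.
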

  57. SEO Split Testing Case Studies: https://www.searchpilot.com/resources/newsletter/
  58. Barry Adams
    ➢ Doing SEO since 1998
    ➢ Specialist in Technical SEO & News SEO
    ➢ Newsletter: SEOforGoogleNews.com
  59. Thank You. barry@polemicdigital.com, @badams