News SEO is the cutting edge of all SEO

Slides from my talk at YoastCon 2023 where I showed some of the lessons I've learned about Google's inner workings from years of working with news publishers.

Barry Adams

May 11, 2023

  1. #YoastCon Advance Warning
    “The whole problem with the world is that fools and fanatics are always so certain of themselves and wiser people so full of doubts.” - Bertrand Russell
  2. Three ‘layers’ of Googlebot?
    [Diagram labels: Crawl Queue (×3), Crawling (×3), Processing, Index, Render Queue, Rendering]
  3. Realtime Crawler
    • Crawls VIPs
      ➢ Very Important Pages; webpages that have a high change frequency and/or are seen as highly authoritative
        News website homepages & key section pages
    • Main purpose = discovery of valuable new content
      ➢ i.e., news articles
    • Rarely re-crawls newly discovered URLs
      ➢ Unless they’re new VIPs
  4. Regular Crawler
    • Google’s main crawler;
      ➢ Does most of the hard work
      ➢ Probably the crawler that fetches page resources
  5. Legacy Content Crawler
    • Crawls VUPs
      ➢ Very Unimportant Pages; URLs that have very little link value and/or are very rarely updated
      ➢ Re-crawls URLs that serve 4XX errors
  6. User-agent: Googlebot-News
    Disallow: /
    Robots.txt = Crawl Management … or is it?
    Robots.txt disallow for index management?!
  7. What can non-news sites learn from this?
    1. Turn key pages into VIPs;
      Make them more valuable by;
      - Improving link value
      - Increasing change frequency
    2. Use robots.txt disallow rules to manage indexing & ranking;
      For example, block Googlebot-Image to prevent product images from showing in Image search
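The second point on slide 7 can be checked locally before a rule goes live. The following is a minimal sketch using Python's standard `urllib.robotparser`; the `/images/products/` path and the example.com URLs are assumptions for illustration and do not come from the talk. `robotparser` is a generic parser, so Google's own matching may differ in edge cases.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the path below is an assumption for illustration.
ROBOTS_TXT = """\
User-agent: Googlebot-Image
Disallow: /images/products/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given user agent may fetch the URL under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


image_url = "https://www.example.com/images/products/shoe.jpg"
page_url = "https://www.example.com/products/shoe"

# The image crawler is blocked from the product image directory only.
print(is_allowed(ROBOTS_TXT, "Googlebot-Image", image_url))  # False
print(is_allowed(ROBOTS_TXT, "Googlebot-Image", page_url))   # True
# The main crawler has no matching rule group, so it stays allowed.
print(is_allowed(ROBOTS_TXT, "Googlebot", image_url))        # True
```

Scoping the rule to one user agent is what turns a crawl directive into an index-management tool here: the product page remains crawlable, while the images are kept away from the image crawler.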
  8. Indexing and Rendering
    Rendering takes time, and news doesn’t have time.
    Indexing is initially with raw HTML only.
    [Diagram labels: Crawl Queue, Crawling, Processing, Index (×3), Render Queue, Rendering]
  9. Rendering isn’t the only shortcut…
    Google wants publishers to noindex syndicated content.
    Because Google sucks at identifying duplicate content.
    At least, it can’t de-duplicate quickly.
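One way a publisher's templates might apply the noindex advice on slide 9 is sketched below. The `is_syndicated` field and the `robots_meta_tag` helper are hypothetical names, not from the talk or from any particular CMS; only the `<meta name="robots" content="noindex">` tag itself is a standard directive.

```python
from typing import Optional


def robots_meta_tag(article: dict) -> Optional[str]:
    """Return a noindex robots meta tag for syndicated articles, else None.

    'is_syndicated' is a hypothetical CMS field used for illustration.
    """
    if article.get("is_syndicated", False):
        return '<meta name="robots" content="noindex">'
    # Original content needs no tag: indexing is the default behaviour.
    return None


wire_story = {"headline": "Agency report", "is_syndicated": True}
own_story = {"headline": "Exclusive investigation", "is_syndicated": False}

print(robots_meta_tag(wire_story))  # <meta name="robots" content="noindex">
print(robots_meta_tag(own_story))   # None
```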
  10. Indexing is a multi-layered set of processes
    [Diagram labels: Crawl Queue, Crawling, Processing (×4), Index, Render Queue, Rendering]
  11. What about the Index itself?
    [Diagram labels: Crawl Queue, Crawling, Processing, Index (×3), Render Queue, Rendering]
  12. Three Layers of Index Storage
    1. RAM storage
      ➢ Pages that need to be served quickly and frequently
        Includes news articles but also popular content
    2. SSD storage
      ➢ Pages that are regularly served in SERPs but aren’t super popular
    3. HDD storage
      ➢ Pages that are rarely (if ever) served in SERPs
  13. What can non-news sites learn from this?
    1. Make indexing easy for Googlebot;
      Put all your critical content in the HTML source
      Don’t rely on rendering to load valuable content
    2. There’s no such thing as a duplicate content penalty;
      However, duplicate content on a single site means the site is competing with itself… and that’s stupid.
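The first point on slide 13 can be tested without a browser: if a phrase is absent from the raw HTML, it only exists after rendering. The sketch below uses Python's standard `html.parser` on a hard-coded sample page; the sample markup, the `loadArticleBody` script call, and the helper names are all assumptions for illustration.

```python
from html.parser import HTMLParser


class VisibleTextExtractor(HTMLParser):
    """Collects text nodes from raw HTML, skipping script and style blocks."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def missing_from_source(raw_html, critical_phrases):
    """Return the phrases that do not appear in the unrendered HTML text."""
    extractor = VisibleTextExtractor()
    extractor.feed(raw_html)
    extractor.close()
    text = " ".join(extractor.chunks).lower()
    return [p for p in critical_phrases if p.lower() not in text]


# Hypothetical page: the headline is in the source, the body is loaded by script.
RAW_HTML = """
<html><head><title>Budget 2023: key points</title></head>
<body>
  <h1>Budget 2023: key points</h1>
  <div id="article-body"></div>
  <script>loadArticleBody("article-body");</script>
</body></html>
"""

print(missing_from_source(RAW_HTML, ["Budget 2023", "income tax changes"]))
# ['income tax changes']
```

In practice the raw HTML would come from a plain HTTP fetch of the live URL rather than a string literal; any phrase this check reports as missing depends on rendering to reach the index.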
  14. SERP Features
    • Many SERP features are volatile;
      ➢ None more than Top Stories & other news boxes
    • Top Stories are triggered when two conditions are met;
      ➢ Sudden increase in search volume
      ➢ Sudden increase in publishing volume
  15. Who gets the Top Stories Top Spot?
    • Topical Authority
    • Authorship
    • E-E-A-T
    Expressions of the Knowledge Graph
  16. What can non-news sites learn from this?
    1. Understand the intent behind the keywords you’re targeting;
      Don’t try to rank content that doesn’t match the intent
      If there are SERP features, try to get into those
    2. Improve your Knowledge Graph presence;
      Category pages = topic hubs
      Use internal linking to your advantage
      Schema.org markup helps Google connect the dots
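As an illustration of the Schema.org point on slide 16, the sketch below builds a JSON-LD block that ties an article to its author and section. `NewsArticle`, `Person`, `headline`, `datePublished`, `articleSection`, `author`, and `mainEntityOfPage` are existing Schema.org terms; the helper name, the choice of properties, and all sample values are assumptions, not recommendations from the talk.

```python
import json

SCRIPT_OPEN = '<script type="application/ld+json">'
SCRIPT_CLOSE = "</script>"


def news_article_jsonld(headline, url, date_published, author_name, author_url, section):
    """Build a JSON-LD script tag describing a news article (hypothetical helper)."""
    data = {
        "@context": "https://schema.org",
        "@type": "NewsArticle",
        "headline": headline,
        "mainEntityOfPage": url,
        "datePublished": date_published,
        "articleSection": section,
        "author": {"@type": "Person", "name": author_name, "url": author_url},
    }
    # Escape "</" so a value can never close the script element early;
    # "\/" is a valid JSON escape for "/".
    payload = json.dumps(data, indent=2).replace("</", "<\\/")
    return SCRIPT_OPEN + payload + SCRIPT_CLOSE


# Sample values are placeholders for illustration only.
tag = news_article_jsonld(
    headline="Budget 2023: key points",
    url="https://www.example.com/politics/budget-2023-key-points",
    date_published="2023-05-11T09:00:00+00:00",
    author_name="Jane Doe",
    author_url="https://www.example.com/authors/jane-doe",
    section="Politics",
)
print(tag)
```

Linking the author to a stable author page and the article to a named section gives the topic hubs and authorship signals mentioned on the slides an explicit, machine-readable form.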
  17. News websites…
    … are crawled the most
    … are crawled the fastest
    … are indexed the quickest
    … are ranked according to the latest signals
    … are ranked based on the best interpretation of intent
  18. From News SEO you can learn…
    … how Google crawls websites
    … how Google indexes content
    … how Google evaluates quality and authority
    … how SERP features impact on ranking and traffic
    … and much, much more.