Slide 1

Slide 1 text

#YoastCon
News is at the cutting edge of SEO
Barry Adams
May 2023

Slide 2

Slide 2 text

Advance Warning
“The whole problem with the world is that fools and fanatics are always so certain of themselves and wiser people so full of doubts.”
- Bertrand Russell

Slide 3

Slide 3 text


Slide 4

Slide 4 text

How does Google work?

Slide 5

Slide 5 text

Web Search Engines: Crawling, Indexing, Ranking

Slide 6

Slide 6 text


Slide 7

Slide 7 text

Google’s model
(diagram: Crawl Queue, Crawling, Processing, Render Queue, Rendering, Index)

Slide 8

Slide 8 text

1. Crawling

Slide 9

Slide 9 text


Slide 10

Slide 10 text

Three ‘layers’ of Googlebot?
(diagram: the pipeline with three Crawl Queue and three Crawling stages, plus Processing, Render Queue, Rendering, and Index)

Slide 11

Slide 11 text

Three ‘layers’ of Googlebot
1. Realtime crawler
2. Regular crawler
3. Legacy content crawler

Slide 12

Slide 12 text

Realtime Crawler
• Crawls VIPs
  ➢ Very Important Pages: webpages that have a high change frequency and/or are seen as highly authoritative
    News website homepages & key section pages
• Main purpose = discovery of valuable new content
  ➢ i.e., news articles
• Rarely re-crawls newly discovered URLs
  ➢ Unless they’re new VIPs

Slide 13

Slide 13 text

Regular Crawler
• Google’s main crawler
  ➢ Does most of the hard work
  ➢ Probably the crawler that fetches page resources

Slide 14

Slide 14 text

Legacy Content Crawler
• Crawls VUPs
  ➢ Very Unimportant Pages: URLs that have very little link value and/or are very rarely updated
  ➢ Re-crawls URLs that serve 4XX errors

Slide 15

Slide 15 text

Robots.txt = Crawl Management … or is it?
User-agent: Googlebot-News
Disallow: /

Slide 16

Slide 16 text

Robots.txt = Crawl Management … or is it?
User-agent: Googlebot-News
Disallow: /
Robots.txt disallow for index management?!

Slide 17

Slide 17 text

What can non-news sites learn from this?
1. Turn key pages into VIPs by making them more valuable:
   - Improve link value
   - Increase change frequency
2. Use robots.txt disallow rules to manage indexing & ranking. For example, block Googlebot-Image to prevent product images from showing in Image search.
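
The disallow rules on the last few slides can be checked locally before they go live. The sketch below uses Python's standard-library urllib.robotparser, which does simple user-agent token matching and is only an approximation of how Google's crawlers read robots.txt. The example.com URLs and the /product-images/ path are made-up placeholders, not from the talk.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the Googlebot-News rule is from the slides,
# the /product-images/ path is an assumed example.
ROBOTS_TXT = """\
User-agent: Googlebot-News
Disallow: /

User-agent: Googlebot-Image
Disallow: /product-images/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# Placeholder URLs used only for this illustration.
ARTICLE = "https://example.com/news/some-article"
IMAGE = "https://example.com/product-images/shoe.jpg"

for agent in ("Googlebot", "Googlebot-News", "Googlebot-Image"):
    print(agent, parser.can_fetch(agent, ARTICLE), parser.can_fetch(agent, IMAGE))
```

In this sketch the regular Googlebot token is still allowed everywhere, while Googlebot-News and Googlebot-Image are kept out, which is the sense in which a disallow rule can act on where content shows up rather than on whether the site gets crawled at all.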

Slide 18

Slide 18 text

2. Indexing

Slide 19

Slide 19 text

Indexing and Rendering
(diagram: Crawl Queue, Crawling, Processing, Render Queue, Rendering, Index)

Slide 20

Slide 20 text

Indexing and Rendering
(diagram: Render Queue, Rendering, Crawl Queue, Crawling, Processing, Index)

Slide 21

Slide 21 text

Indexing and Rendering
Rendering takes time, and news doesn’t have time.
Indexing is initially with raw HTML only.
(diagram: Crawl Queue, Crawling, Processing, Render Queue, Rendering, Index)

Slide 22

Slide 22 text

Rendering isn’t the only shortcut…
Google wants publishers to noindex syndicated content, because Google sucks at identifying duplicate content.
At least, it can’t de-duplicate quickly.

Slide 23

Slide 23 text

Indexing is a multi-layered set of processes
(diagram: Render Queue, Rendering, Crawl Queue, Crawling, Index, and several Processing stages)

Slide 24

Slide 24 text

What about the Index itself?
(diagram: Render Queue, Rendering, Crawl Queue, Crawling, Processing, Index)

Slide 25

Slide 25 text

Three Crawlers… Three Indices?
(diagram: Realtime crawler, Regular crawler, Legacy content crawler; RAM storage, SSD storage, HDD storage)

Slide 26

Slide 26 text

Three Layers of Index Storage
1. RAM storage
   ➢ Pages that need to be served quickly and frequently
     Includes news articles but also popular content
2. SSD storage
   ➢ Pages that are regularly served in SERPs but aren’t super popular
3. HDD storage
   ➢ Pages that are rarely (if ever) served in SERPs

Slide 27

Slide 27 text

It’s probably more complicated
(diagram: Realtime crawler, Regular crawler, Legacy content crawler; RAM storage, SSD storage, HDD storage)

Slide 28

Slide 28 text

What can non-news sites learn from this?
1. Make indexing easy for Googlebot:
   - Put all your critical content in the HTML source
   - Don’t rely on rendering to load valuable content
2. There’s no such thing as a duplicate content penalty. However, duplicate content on a single site means the site is competing with itself… and that’s stupid.
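
One way to act on the first point is to check whether key phrases already appear in the raw HTML source, before any JavaScript runs. The sketch below is a minimal illustration using only Python's standard library; the helper names (VisibleTextParser, missing_phrases) and the two sample pages are assumptions made up for this example, and it says nothing about how Google itself processes pages.

```python
from html.parser import HTMLParser


class VisibleTextParser(HTMLParser):
    """Collects text that sits outside <script> and <style> elements."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def text_in_raw_html(html: str) -> str:
    """Return the text present in the unrendered HTML source."""
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def missing_phrases(html: str, phrases: list[str]) -> list[str]:
    """List the phrases that do NOT appear in the raw HTML text."""
    text = text_in_raw_html(html).lower()
    return [p for p in phrases if p.lower() not in text]


# Made-up sample pages: one with content in the source, one filled in by script.
SERVER_RENDERED = "<html><body><h1>Election results</h1><p>Full article body.</p></body></html>"
CLIENT_RENDERED = '<html><body><div id="app"></div><script>render("Election results")</script></body></html>'

print(missing_phrases(SERVER_RENDERED, ["Election results"]))
print(missing_phrases(CLIENT_RENDERED, ["Election results"]))
```

A phrase reported as missing for a page is a hint that the content only exists after rendering, which is the situation the slide advises against.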

Slide 29

Slide 29 text

3. Ranking

Slide 30

Slide 30 text

Search Intent – first BERT, then MUM

Slide 31

Slide 31 text


Slide 32

Slide 32 text

However…

Slide 33

Slide 33 text

Most tools are out of date

Slide 34

Slide 34 text

Google Trends is better, but lacks numbers

Slide 35

Slide 35 text

Very few tools are (near) real-time

Slide 36

Slide 36 text

And even fewer accurately report on SERP features

Slide 37

Slide 37 text


Slide 38

Slide 38 text

SERP Features
• Many SERP features are volatile
  ➢ None more than Top Stories & other news boxes
• Top Stories are triggered when two conditions are met:
  ➢ Sudden increase in search volume
  ➢ Sudden increase in publishing volume
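
The two trigger conditions on this slide can be written down as a simple check. This is only an illustration of the slide's logic, not Google's actual mechanism: the function names, the 3x threshold, and the sample numbers are all assumptions chosen for the example.

```python
from statistics import mean


def is_spike(series: list[float], factor: float = 3.0) -> bool:
    """True if the latest value exceeds `factor` times the mean of earlier values.

    The threshold is an arbitrary assumption for illustration.
    """
    *baseline, latest = series
    if not baseline:
        return False
    base = mean(baseline)
    if base <= 0:
        return latest > 0
    return latest > factor * base


def top_stories_likely(search_volume: list[float],
                       publishing_volume: list[float],
                       factor: float = 3.0) -> bool:
    """Both conditions from the slide must hold at the same time."""
    return is_spike(search_volume, factor) and is_spike(publishing_volume, factor)


# Made-up hourly counts: searches for a topic, and articles published about it.
searches = [100, 110, 90, 500]
articles_spiking = [2, 3, 2, 12]
articles_flat = [2, 3, 2, 3]

print(top_stories_likely(searches, articles_spiking))
print(top_stories_likely(searches, articles_flat))
```

With these sample numbers, the search spike alone is not enough: the check only passes when publishing volume jumps as well.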

Slide 39

Slide 39 text

SERP Features and CTR

Slide 40

Slide 40 text

Who gets the Top Stories Top Spot?
• Topical Authority
• Authorship
• E-E-A-T
Expressions of the Knowledge Graph

Slide 41

Slide 41 text

Knowledge Graph
(diagram of entities: Arnold Schwarzenegger, Bodybuilding, Predator (1987), Governor of California, Maria Shriver, Ronnie Coleman, JFK)

Slide 42

Slide 42 text

Knowledge Graph
(diagram of entities: Arnold Schwarzenegger, Bodybuilding, …, …, …, Predator (1987))

Slide 43

Slide 43 text

Internal Linking to Topic Hubs

Slide 44

Slide 44 text

Knowledge Graph
(diagram of entities: Arnold Schwarzenegger, Predator (1987), …, …, …, Bodybuilding)

Slide 45

Slide 45 text


Slide 46

Slide 46 text

What can non-news sites learn from this?
1. Understand the intent behind the keywords you’re targeting:
   - Don’t try to rank content that doesn’t match the intent
   - If there are SERP features, try to get into those
2. Improve your Knowledge Graph presence:
   - Category pages = topic hubs
   - Use internal linking to your advantage
   - markup helps Google connect the dots

Slide 47

Slide 47 text

So why is news at the cutting edge of SEO?

Slide 48

Slide 48 text

News websites…
… are crawled the most
… are crawled the fastest
… are indexed the quickest
… are ranked according to the latest signals
… are ranked based on the best interpretation of intent

Slide 49

Slide 49 text

From News SEO you can learn…
… how Google crawls websites
… how Google indexes content
… how Google evaluates quality and authority
… how SERP features impact on ranking and traffic
… and much, much more.

Slide 50

Slide 50 text


Slide 51

Slide 51 text

Thank You
@badams
/in/barryadams/