The Most Common Technical SEO Issues for News Publishers
Slides from my talk at the 2025 News Reach conference in Dortmund, where I spoke about the most common technical SEO issues affecting news and publishing websites.
focused on the technical aspects of a website that enable crawling, indexing, and ranking of webpages in search results. Everything to do with SEO that’s not content or marketing/brand.
  ➢ Old URLs should redirect to new URLs to preserve inbound links
• But internal links should always point directly to final destination URLs:
  ➢ Minimise crawl waste on redirected URLs
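The point about internal links can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the redirect map, the example paths, and the function names `resolve_final_url` and `rewrite_internal_links` are hypothetical, and a real site would read its redirects from the CMS or server configuration.

```python
def resolve_final_url(url, redirects, max_hops=10):
    """Follow `redirects` (old URL -> new URL) until a URL has no further redirect."""
    seen = {url}
    for _ in range(max_hops + 1):
        target = redirects.get(url)
        if target is None:
            return url  # no redirect from here: this is the final destination
        if target in seen:
            raise ValueError(f"redirect loop detected at {target}")
        seen.add(target)
        url = target
    raise ValueError(f"more than {max_hops} redirect hops")


def rewrite_internal_links(links, redirects):
    """Return the links with every redirected URL replaced by its final destination."""
    return [resolve_final_url(link, redirects) for link in links]


# Hypothetical redirect map containing a two-hop chain.
REDIRECTS = {
    "/old-story": "/news/old-story",
    "/news/old-story": "/news/2025/old-story",
}
LINKS = ["/old-story", "/news/2025/old-story", "/about"]

FINAL_LINKS = rewrite_internal_links(LINKS, REDIRECTS)
print(FINAL_LINKS)
```

The redirects themselves stay in place for inbound links from other sites; only the site's own links are rewritten, so crawlers following internal links never hit a redirect.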
Google crawls the alternative URLs, only to then be informed that it’s not allowed to index them:
  ➢ Still consumes crawl effort
• Use canonicalisation wisely
be in the HTML response
• Google will eventually render the page and see JS-loaded content:
  ➢ However, this can be a delayed process
  ➢ The article is relatively old by then
  ➢ Google strongly prefers fresh articles in Top Stories & Google News
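One rough way to test whether article content is present in the HTML response itself, without any JavaScript being run, is to extract the text from the raw markup and look for phrases you expect to find. The sketch below uses only Python's standard `html.parser`. The sample markup, the phrases, and the helper names are made up for illustration.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects text from raw HTML, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def visible_text(html):
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def content_in_raw_html(html, expected_phrases):
    """Map each expected phrase to whether it appears in the unrendered HTML text."""
    text = visible_text(html)
    return {phrase: phrase in text for phrase in expected_phrases}


# Hypothetical responses: one server-rendered, one that relies on client-side JS.
SERVER_RENDERED = (
    "<html><body><article><h1>Council approves budget</h1>"
    "<p>The council voted on Tuesday.</p></article></body></html>"
)
JS_RENDERED = (
    '<html><body><div id="app"></div>'
    '<script>render({"headline": "Council approves budget"})</script></body></html>'
)
PHRASES = ["Council approves budget", "voted on Tuesday"]

SERVER_RESULT = content_in_raw_html(SERVER_RENDERED, PHRASES)
JS_RESULT = content_in_raw_html(JS_RENDERED, PHRASES)
print(SERVER_RESULT)
print(JS_RESULT)
```

In the second sample the headline only exists inside a script, so the check reports it as missing from the raw HTML. Google would only see that content after rendering the page.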
Infinite scroll
  ➢ Client-side JavaScript
  ➢ Paginated series canonicalised to the 1st page:
    <link rel="canonical" href="https://www.website.com/category/" />
"Load More Articles" button: Google doesn’t perform actions (no clicking or scrolling)
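A common alternative to the patterns above is to give each paginated page its own self-referencing canonical and a plain `<a href>` link to the next page, which a crawler can follow without clicking or scrolling. The following is a minimal sketch of that idea, not a recommendation taken from the slides. The `?page=N` URL scheme and the `pagination_tags` helper are assumptions.

```python
def pagination_tags(base_url, page, last_page):
    """Build a self-referencing canonical and a crawlable next-page link.

    Assumes (hypothetically) that page 1 lives at `base_url` and later pages
    at `base_url?page=N`.
    """
    if not 1 <= page <= last_page:
        raise ValueError("page out of range")

    def page_url(n):
        return base_url if n == 1 else f"{base_url}?page={n}"

    next_link = None
    if page < last_page:
        # A plain anchor: no JavaScript and no user action needed to discover it.
        next_link = f'<a href="{page_url(page + 1)}">Next page</a>'

    return {
        "canonical": f'<link rel="canonical" href="{page_url(page)}" />',
        "next_link": next_link,
    }


PAGE_2 = pagination_tags("https://www.website.com/category/", 2, 5)
LAST_PAGE = pagination_tags("https://www.website.com/category/", 5, 5)
print(PAGE_2["canonical"])
print(PAGE_2["next_link"])
```

With this layout, page 2 declares itself as canonical instead of pointing back at page 1, so the articles listed on deeper pages stay discoverable.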
category / tag page collects articles around a specific topic:
  ➢ More articles = more topic authority
  ➢ You want Google to see a decent body of journalism on a specific topic, to show that you are an expert on that topic
lacks semantic tags
• Use the most appropriate HTML tag for every element:
  ➢ This helps Google identify the important aspects of the page
  ➢ Also critical for accessibility
is the wrong size & aspect ratio:
  ➢ Ideally, the structured data (SD) defines 3 images:
    - 16:9 (1200 x 675)
    - 4:3 (1200 x 900)
    - 1:1 (1200 x 1200)
• Too many images:
  ➢ In my experience, too many images defined in the Article SD confuse Google
  ➢ Keep it to three – no more, no less
  ➢ If you have just one image, make it the 16:9 version
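The image guidance above can be checked mechanically before the structured data is published. This sketch validates the aspect ratios and builds a small JSON-LD object. The image URLs, the headline, and the function names are hypothetical, and `NewsArticle` is used here as one of the schema.org Article types.

```python
import json
from fractions import Fraction

# The three aspect ratios listed above.
RECOMMENDED_RATIOS = {
    Fraction(16, 9): "16:9",
    Fraction(4, 3): "4:3",
    Fraction(1, 1): "1:1",
}


def aspect_label(width, height):
    """Return '16:9', '4:3' or '1:1' for an exact match, otherwise None."""
    return RECOMMENDED_RATIOS.get(Fraction(width, height))


def article_json_ld(headline, images):
    """Build Article JSON-LD from a list of (url, width, height) tuples."""
    if not 1 <= len(images) <= 3:
        raise ValueError("define between one and three images")
    for url, width, height in images:
        if aspect_label(width, height) is None:
            raise ValueError(f"{url} is not 16:9, 4:3 or 1:1")
    data = {
        "@context": "https://schema.org",
        "@type": "NewsArticle",
        "headline": headline,
        "image": [url for url, _, _ in images],
    }
    return json.dumps(data, indent=2)


# Hypothetical image URLs using the dimensions listed above.
IMAGES = [
    ("https://www.website.com/img/story-16x9.jpg", 1200, 675),
    ("https://www.website.com/img/story-4x3.jpg", 1200, 900),
    ("https://www.website.com/img/story-1x1.jpg", 1200, 1200),
]
JSON_LD = article_json_ld("Council approves budget", IMAGES)
print(JSON_LD)
```

The printed JSON would go inside a `<script type="application/ld+json">` element in the page's HTML.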
  ➢ Feeds directly into their rankings
• Bad Core Web Vitals (CWV) = bad usability:
  ➢ Bad usability = negative click signals
Img credit: https://www.mariehaynes.com/navboost/
Ensuring your content can be extracted and ingested by LLMs for training purposes
2. Retrieval Augmented Generation (RAG):
  ➢ Ensuring your URLs are cited as sources in LLM answers
Make things easy for Google:
  ➢ URLs are sacred
  ➢ Be careful with canonicalisation
  ➢ Load all content in the HTML
  ➢ Make your pagination indexable
  ➢ Use semantic HTML tags
  ➢ Short & focused structured data
  ➢ Correct image sizes for Google