
SMX Paris 2026 - Focus on Your Audience, Not Your Keywords


Amanda King

March 18, 2026

Transcript

  1. Where’s Waldo? (I Don’t Care) Follow Your Audience Instead Amanda

    King @ FLOQ Consulting / Topic Compass SMX Paris 9 Mar 2026
  2. What’s what

    1. Why?
    2. Consolidate your content
    3. Clean out your archives
    4. Build up your brand
    5. Three takeaways
    6. Who’s this human?
  3. We are now firmly in a world where we should

    prioritise our audience—and translate that into how the algorithms think—more than “content velocity” or chasing “keyword rankings”
  4. Google acknowledges query-only based matching is pretty terrible. “Direct “Boolean”

    matching of query terms has well known limitations, and in particular does not identify documents that do not have the query terms, but have related words [...] The problem here is that conventional systems index documents based on individual terms, rather than on concepts. Concepts are often expressed in phrases [...] Accordingly, there is a need for an information retrieval system and methodology that can comprehensively identify phrases in a large scale corpus, index documents according to phrases, search and rank documents in accordance with their phrases, and provide additional clustering and descriptive information about the documents. [...]” - Information retrieval system for archiving multiple document versions, granted 2017 (link)
  5. So it decided to make its search engine concept- and

    phrase-based. “The system is adapted to identify phrases that have sufficiently frequent and/or distinguished usage in the document collection to indicate that they are “valid” or “good” phrases [...] The system is further adapted to identify phrases that are related to each other, based on a phrase's ability to predict the presence of other phrases in a document.” - Information retrieval system for archiving multiple document versions, granted 2017 (link)
  6. Queries very quickly become entities “[...] identifying queries in query data;

    determining, in each of the queries, (i) an entity-descriptive portion that refers to an entity and (ii) a suffix; determining a count of a number of times the one or more queries were submitted” - patent granted in 2015, submitted in 2012 Source: https://patents.google.com/patent/US9047278B1/en ; https://patents.google.com/patent/US20150161127A1/ ; https://patents.google.com/patent/US8032507B1/en
  7. How natural language processing usually works: tokenization and subwords Source:

    https://ai.googleblog.com/2021/12/a-fast-wordpiece-tokenization-system.html
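The WordPiece-style tokenization the linked post describes can be sketched as greedy longest-match-first lookup against a subword vocabulary. The toy vocabulary below is hypothetical; real systems learn vocabularies of tens of thousands of subwords from data.

```python
# Minimal sketch of WordPiece-style subword tokenization
# (greedy longest-match-first). VOCAB is a made-up toy vocabulary.
VOCAB = {"run", "##ning", "##s", "ran", "keyword", "##less", "search"}

def wordpiece_tokenize(word, vocab=VOCAB, unk="[UNK]"):
    """Split a single word into subword tokens, longest match first."""
    tokens, start = [], 0
    while start < len(word):
        end, cur = len(word), None
        while start < end:
            piece = word[start:end]
            if start > 0:
                piece = "##" + piece  # continuation pieces are marked
            if piece in vocab:
                cur = piece
                break
            end -= 1
        if cur is None:
            return [unk]  # no subword matches: emit the unknown token
        tokens.append(cur)
        start = end
    return tokens

print(wordpiece_tokenize("running"))      # ['run', '##ning']
print(wordpiece_tokenize("keywordless"))  # ['keyword', '##less']
```

The same mechanism explains why morphological variants of a word end up sharing subword tokens, which is the bridge to the stemming idea on the next slide.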
  8. • N-grams: important for finding the primary concepts of the

    sentence by identifying and excluding stop words • “Running”, “runs”, “ran” share the same base — “run”. This gets broken down even more: https://patents.google.com/patent/US8423350B1/
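The ideas on this slide can be sketched in a few lines: drop stop words, map inflected forms to a shared base (“running”/“runs”/“ran” → “run”), then extract n-grams. The stop-word set and base-form map below are tiny hypothetical stand-ins for rule-based or learned stemmers.

```python
# Toy normalisation + n-gram extraction; word lists are illustrative only.
STOP_WORDS = {"the", "a", "an", "is", "to", "of"}
BASE_FORMS = {"running": "run", "runs": "run", "ran": "run"}

def normalise(text):
    """Lowercase, strip punctuation, drop stop words, reduce to base forms."""
    words = [w.lower().strip(".,!?") for w in text.split()]
    return [BASE_FORMS.get(w, w) for w in words if w not in STOP_WORDS]

def ngrams(words, n):
    """All contiguous n-word sequences from a token list."""
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]

words = normalise("The dog is running to the park")
print(words)             # ['dog', 'run', 'park']
print(ngrams(words, 2))  # [('dog', 'run'), ('run', 'park')]
```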
  9. “Rather than simply searching for content that matches individual words,

    BERT comprehends how a combination of words expresses a complex idea.” Source: https://blog.google/products/search/how-ai-powers-great-search-results/
  10. MUM takes this a step further

    • About 1,000 times more powerful than BERT
    • Trained across 75 languages for greater context
    • Recognises this across different types of media (video, text, etc.)
    https://blog.google/products/search/introducing-mum/
  11. What, then, is “information gain”? Phrase-based searching in an

    information retrieval system, granted 2009 (link); “Contextual estimation of link information gain”, granted to Google in Jul 2024 (link) [Figure: query “Australian Shepherd” compared across URL 1 and URL 2; both cover “Aussie”, plus the coat types red merle, blue merle, tricolor]
  12. And there’s this whole concept of consensus score Mark Williams-Cook

    tested this with the Google exploit he analysed and got a bounty for. Source: https://www.youtube.com/watch?v=_AQ9UDqES80
  13. Google ranks content on a lot of personal factors

    • Based on historical behaviour from similar searches in aggregate (application)
    • Based on external links (link)
    • Based on your own previous searches (link)
    • Based on whether or not it should directly provide the answer via Knowledge Graph (link)
    • Phrase- and entity-based co-occurrence threshold scores (link)
    • Understanding intent based on contextual information (link)
  14. Google is much more than a search engine. h/t Jes

    Scholz for the visualisation concept Google Home Google Groups Google Discover Google Lens Google Arts & Culture Google News Google Assistant Google Play Google Images Google Videos Google Maps Google Shopping Google For Jobs Podcasts Google Travel Buy on Google Google Finance Google Books Google Classroom Google Search Gemini AI
  15. In the US, most growth appears in mid-length queries, particularly

    6–9 word searches. Searches of 15+ words show more volatility than all other query lengths. https://datos.live/report/state-of-search-q4-2025/
  16. Remember how these LLMs work

    • Trained on data up to 30 Sep 2024 (GPT-5)
    • Trained on clean, plain text, stripped of formatting
    • It gives you the next most likely n-gram/word in the sequence
    • RAG (Retrieval-Augmented Generation) is used only when enabled for web, and augments the response based on that information
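The “next most likely word” idea can be sketched with a toy bigram model: given the previous word, return the most frequent next word seen in a (tiny, hypothetical) training text. LLMs do this over subword tokens with learned probabilities rather than raw counts, but the shape of the prediction step is the same.

```python
# Toy bigram "language model"; the corpus is a made-up example.
from collections import Counter, defaultdict

corpus = "the dog chased the cat and the dog barked".split()

# Count, for each word, which words follow it and how often.
next_counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    next_counts[prev][nxt] += 1

def predict_next(word):
    """Most frequent continuation seen in training, or None if unseen."""
    counts = next_counts.get(word)
    return counts.most_common(1)[0][0] if counts else None

print(predict_next("the"))  # 'dog' ("the dog" appears twice, "the cat" once)
```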
  17. So, wtf. Why are we talking about “not keywords”, if

    LLMs can’t even read my schema to understand an entity? Are keywords not the ideal?
  18. It’s about patterns and the implicit understanding of training data

    • Structured data and entity recognition are how you get on the shortlist to be a part of the RAG pipeline, or how you’re considered a “good result” in the training data in the first place
    • When you codify how you talk about your brand and that’s consistent across channels, across media, across wherever you can control and influence, that’s a pattern LLMs can recognise and interpolate
  19. What’s all this about ‘fan out’?

    1. Query becomes vectors
    2. Use a decoder to create x number of variations on that query
    3. Run these x query variations through a small, fast model simultaneously
    4. Each variation returns small text snippets from a corpus
    5. A confidence threshold (either fixed or dynamic) decides which results to keep
    6. The selection process for these snippets is a black box - we don't control how they're chosen from the source pages
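The steps above can be sketched very roughly. Everything here is a stand-in: the variation templates, the term-overlap scoring (in place of a small retrieval model), the mini corpus, and the 0.5 threshold are all hypothetical; production systems use neural encoders we have no visibility into.

```python
# Highly simplified fan-out sketch over a made-up three-page corpus.
CORPUS = {
    "page-a": "best running shoes for trail running",
    "page-b": "how to choose running shoes",
    "page-c": "history of the marathon",
}

def fan_out(query):
    # Step 2: create variations of the query (hypothetical templates).
    variations = [query, f"best {query}", f"how to choose {query}"]
    results = {}
    for variant in variations:
        terms = set(variant.lower().split())
        for page, text in CORPUS.items():
            # Step 4: score each page by term overlap -- a crude
            # stand-in for the small, fast retrieval model.
            score = len(terms & set(text.split())) / len(terms)
            results[page] = max(results.get(page, 0), score)
    # Step 5: a fixed confidence threshold decides what to keep.
    return {page: s for page, s in results.items() if s >= 0.5}

print(fan_out("running shoes"))  # page-a and page-b survive the threshold
```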
  20. Less might be more Because we want to make sure

    the content we do have is relevant to the industry and up-to-date
  21. Having less content — done well — might actually be

    to your benefit

    Site Size                | Domains | Avg Traffic Gain | Avg Page Reduction | Avg Volatility
    Very Small (<100 pages)  | 10      | 47.01%           | -51.43%            | 25.36
    Small (100-1K pages)     | 82      | 36.79%           | -35.85%            | 19.37
    Medium (1K-10K pages)    | 158     | 46.55%           | -34%               | 20.39
    Large (10K-100K pages)   | 121     | 77.67%           | -23.65%            | 27.33
    Very Large (>100K pages) | 20      | 57.59%           | -20.95%            | 28.33
  22. Across 8,421 domains I reviewed data to see if reducing

    pages was a stable, sustainable choice for growth

    2022: August – First helpful content update · December – Helpful content update, Link spam update
    2023: March – Core update · August – Core update · September – Helpful content update · October – Core update · November – Core update, Reviews update (my data starts Feb 2023)
    2024: March – Core update, Spam update · June – Spam update · August – Core update · November – Core update · December – Spam update, Core update
  23. I aimed to play in the “middle of the road”

    websites, not super massive ones

    Classification | Avg Start Pages | Avg End Pages | Avg Absolute Page Change | Avg Relative Page Change | Avg Traffic Change
    MPMT | 16,130 | 26,910 | 10,780  | 81.40%  | 74.11%
    MPST | 11,690 | 16,504 | 4,814   | 67.65%  | -6.15%
    FPLT | 29,487 | 19,659 | -9,828  | -34.16% | -55.92%
    SPMT | 18,860 | 20,215 | 1,355   | 12.26%  | 50.78%
    SPLT | 10,799 | 10,902 | 103     | 11.11%  | -51.01%
    MPLT | 10,090 | 14,397 | 4,307   | 76.80%  | -46.09%
    SPST | 15,222 | 15,532 | 310     | 10.45%  | -8.91%
    FPST | 32,028 | 20,756 | -11,272 | -30.72% | -10.85%
    FPMT | 30,353 | 22,168 | -8,185  | -30.96% | 54.71%
  24. This was an interesting way to start the analysis Fewer

    websites to work with isn’t necessarily a bad thing though

    Classification            | Count | Percent
    Shutdown                  | 1,742 | 20.69%
    More pages more traffic   | 1,445 | 17.16%
    More pages same traffic   | 922   | 10.95%
    Fewer pages less traffic  | 747   | 8.87%
    Same pages more traffic   | 724   | 8.60%
    Same pages less traffic   | 667   | 7.92%
    More pages less traffic   | 662   | 7.86%
    Same pages same traffic   | 653   | 7.75%
    Fewer pages same traffic  | 468   | 5.56%
    Fewer pages more traffic  | 391   | 4.64%
  25. Reviewing specific industries shows publications and YMYL tried this and

    succeeded. If you’re in:
    • B2B
    • Medical
    • Style/fashion
    • Auto
    You may particularly benefit from reducing your pages - these industries, on average, performed better when they reduced pages than when they added more.
  26. B2B Median Page Reduction: -15.99% Median Traffic Increase: 103.36% Pixels.com

    Page reduction: -12.89% (961,615 → 837,710 pages) Traffic increase: +107.76% (339,573 → 705,502)
  27. Medical Median Page Reduction: -8.99% Median Traffic Increase: 57.11% Stability

    is remarkable - most of these sites have very low volatility (3-9%), indicating consistent growth rather than erratic traffic spikes.
  28. Fashion Median Page Reduction: -31.95% Median Traffic Increase: 93.34% Flaunt.com

    page reduction: -16.66% (10,945 → 9,122 pages) Traffic increase: +444.32% (48,033 → 261,451)
  29. Auto Median Page Reduction: -21.68% Median Traffic Increase: 54.09% whatcar.com:

    Gradual decline from ~13K to ~8.5K pages by end of 2024. Coincides with significant traffic growth.
  30. Less is more—if you’re not sure, be super targeted in

    what you consolidate Sweet spot: -10% to -20% reduction in content shows the highest traffic gains at 70.74%. Minimal page reductions (0-10%) produced substantial traffic gains (54.8%) with the highest stability rating (83.78%).
  31. Large sites (10K-100K pages) achieve dramatically higher traffic gains (77.67%)

    compared to smaller sites, despite reducing a smaller percentage of their content (-24.24%).
  32. Small sites typically require 34-51% page reduction; large sites achieve

    better results with only about 20% reduction

    Site Size                | Domains | Avg Traffic Gain | Avg Page Reduction | Avg Volatility
    Very Small (<100 pages)  | 10      | 47.01%           | -51.43%            | 25.36
    Small (100-1K pages)     | 82      | 36.79%           | -35.85%            | 19.37
    Medium (1K-10K pages)    | 158     | 46.55%           | -34%               | 20.39
    Large (10K-100K pages)   | 121     | 77.67%           | -23.65%            | 27.33
    Very Large (>100K pages) | 20      | 57.59%           | -20.95%            | 28.33
  33. Based on the patterns I’m seeing, gradual, specific page reductions

    (likely content consolidation) are the more successful method to approach page reduction
  34. Where YMYL starts, the rest of the Internet will likely

    follow: plan to consolidate your content by 10-20% in the next 18 months.
  35. Overall we’re still seeing the YMYL hypothesis hold

    • Local Education: 30% traffic uplift, 56% success rate
    • Regional Finance: 58% traffic uplift, 44% success rate
  36. There are some industry exceptions where FPMT shines

    • Regional Automotive: 255% traffic uplift, 21% success rate
    • Regional Sports: 140% traffic uplift, 13% success rate
  37. Size still matters

    • Very large sites (>100K pages): More pages still win by 2-4x
    • Medium sites (1K-10K): Fewer pages sometimes better
      ◦ Globally distributed (11+ countries)
      ◦ In volatile/news-driven industries
      ◦ Have very low geographic concentration (HHI < 0.3)
    • The sweet spot remains 10-20% page reduction
  38. So where does that leave content consolidation internationally? FPMT works

    well when:
    • You're in UK/Commonwealth markets
    • You have 3-7 market presence
    • You're in volatile/news industries
    • You reduce 10-20% of pages
  39. If you only come away with one insight…

    • US-focused sites: Use MPMT (+65% gain vs +42% for FPMT)
    • UK/Commonwealth: Use FPMT (+165% gain)
    • Emerging markets: Use MPMT (+129% gain)
    • Already global: Reduce pages AND geographic spread together
  40. Clean out your archives Steps to implement to be on

    the more favourable end of reducing the pages on your website
  41. We know folks have failed doing this, so:

    1. Find and resolve duplicate pages
    2. Find and resolve irrelevant pages
    3. Map and match user intent
    4. Consolidate any and all with proper redirects, 404s or 410s
    5. BONUS: E-E-A-T updates, particularly if in YMYL
  42. Find your duplicate content Do this at scale by using

    a combination of tools:
    • Screaming Frog to crawl URLs
    • BigQuery, Python and FAISS (Facebook AI Similarity Search – nearest-neighbour search over a big list of embeddings)
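The core of the dedup step can be sketched without FAISS: represent each page as a vector and flag near-identical pairs by cosine similarity. Term-frequency vectors here stand in for the neural embeddings you would index with FAISS at scale; the example pages and the 0.8 threshold are hypothetical and worth tuning on your own data.

```python
# Sketch of embedding-based duplicate detection using stdlib only.
import math
from collections import Counter
from itertools import combinations

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values()))
    norm *= math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def find_duplicates(pages: dict, threshold: float = 0.8):
    """Return URL pairs whose term vectors are near-identical."""
    vectors = {url: Counter(text.lower().split()) for url, text in pages.items()}
    return [
        (u1, u2)
        for u1, u2 in combinations(vectors, 2)
        if cosine(vectors[u1], vectors[u2]) >= threshold
    ]

pages = {
    "/blog/red-shoes": "red running shoes for sale buy now",
    "/blog/red-shoes-2": "red running shoes for sale buy today",
    "/about": "we are a family owned shoe company",
}
print(find_duplicates(pages))  # [('/blog/red-shoes', '/blog/red-shoes-2')]
```

For thousands of pages, swap the pairwise loop for an approximate nearest-neighbour index (this is exactly what FAISS provides).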
  43. Or if you don’t have the time to do this

    programmatically yourself, use off-the-shelf tools. Common tools include:
    • The duplicate content flag in Screaming Frog (this assumes you’re able to crawl your entire website in one go)
    • The “Duplicate, Google chose different canonical than user” report in Google Search Console
    • Site audits in SEMRush, Ahrefs, or Siteliner (free up to 250 pages)
  44. Find your irrelevant content Analyse Search Console data at scale

    in BigQuery. Define your terms for:
    • Brand
    • Product
    • Topics

    SELECT
      page,
      query,
      SUM(impressions) AS total_impressions,
      SUM(clicks) AS total_clicks
    FROM `your_project.your_dataset.gsc_data`
    WHERE NOT REGEXP_CONTAINS(LOWER(query), r'(yourbrand|brand|band)')
      AND NOT REGEXP_CONTAINS(LOWER(query), r'(product1|product2|topic1)')
    GROUP BY page, query
    HAVING total_impressions > 50 -- adjust thresholds as needed
    ORDER BY total_impressions DESC;
  45. But what if I’m not sure what my topics are?

    Use the topics report in SEMRush (or similar) for a direction
  46. If you have more brainspace than I do, you could

    do this dynamically by automated relevance scoring with your brand proposition copy analysed against your query dataset using classifyText from Google’s Natural Language API
  47. How Google classifies intent within the Search Quality Evaluator Guidelines

    • Know query, some of which are Know Simple queries
    • Do query, when the user is trying to accomplish a goal or engage in an activity
    • Website query, when the user is looking for a specific website or webpage
    • Visit-in-person query, some of which are looking for a specific business or organization, some of which are looking for a category of businesses
    Refresh your memory of the Search Quality Evaluator Guidelines: https://static.googleusercontent.com/media/guidelines.raterhub.com/en//searchqualityevaluatorguidelines.pdf
  48. A simple regex to apply to your queries to map

    user intent:
    • Visit-in-person: (?i)(near(\sby|\sme)?|directions(\sto)?|closest|nearest|local|address|hours|location)
    • Website: (?i)(.*\.(com|org|net|edu)|login|sign\sin|homepage|website)
    • Do: (?i)(how\sto\s.*|download|buy|purchase|get|watch|play|stream|calculate|sign\sup|install|create|make|build)
    • Know Simple: (?i)(what|who|when|where|how|why)\s.*\?|.*height.*|.*population.*|.*weather.*|.*salary.*|.*distance.*
    • Know: (?i)((information|about|history|learn|guide|tutorial|facts?)\s.*|reviews?|news)
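Applying the slide's patterns to a query list is a short script. The patterns are taken from the slide; the first-match-wins ordering is an assumption worth keeping, since broad patterns like Know would otherwise swallow everything.

```python
# Classify queries into the slide's intent buckets, first match wins.
import re

INTENT_PATTERNS = [
    ("Visit-in-person", r"(?i)(near(\sby|\sme)?|directions(\sto)?|closest|nearest|local|address|hours|location)"),
    ("Website", r"(?i)(.*\.(com|org|net|edu)|login|sign\sin|homepage|website)"),
    ("Do", r"(?i)(how\sto\s.*|download|buy|purchase|get|watch|play|stream|calculate|sign\sup|install|create|make|build)"),
    ("Know Simple", r"(?i)(what|who|when|where|how|why)\s.*\?|.*height.*|.*population.*|.*weather.*|.*salary.*|.*distance.*"),
    ("Know", r"(?i)((information|about|history|learn|guide|tutorial|facts?)\s.*|reviews?|news)"),
]

def classify_intent(query):
    """Return the first intent whose regex matches, else 'Unclassified'."""
    for intent, pattern in INTENT_PATTERNS:
        if re.search(pattern, query):
            return intent
    return "Unclassified"

print(classify_intent("coffee shops near me"))  # Visit-in-person
print(classify_intent("how to make espresso"))  # Do
```

Run it over an exported GSC query list and pivot by intent to see where each page's traffic actually sits.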
  49. Folks have dug even deeper into how Google actually classifies

    intent in its search engine. Use this to classify your primary keyword set …if you want to use it for more than a hundred or so queries, maybe pick up an AlsoAsked subscription or ping MW-C 👀 Or join the request for an API… https://rqpredictor.streamlit.app/ ; https://www.linkedin.com/posts/markseo_seo-activity-7298698401955627008-VRQ6
  50. And then once we classify the queries, we need to

    check if the topic aligns to the primary user intent for the query
  51. Unhelpful content: low brand visibility, monetizes clicks, poor

    UX, low-effort, impersonal, easy to replicate, over-optimized SEO. Helpful content: good brand visibility, long-term audience, unobtrusive UX, high-effort, personal, hard to replicate, straightforward SEO.
  52. “You cannot produce original, insightful content that truly demonstrates experience

    and trustworthiness by outsourcing all of your writing to a copywriter and publishing with minimal editing and no added insight. But for years, many businesses thrived on this model! You can’t add truly helpful graphics, unique images and video without extensive effort and extra cost, even if that cost is your time. You cannot create the type of content that people find worthy of bookmarking or sharing with others without significant effort.” Source: SEO in the Gemini Era, Dr. Marie Haynes
  53. This is the kind of entity recognition we want But

    it takes work:
    • They’ve been a company for 80 years
    • They have a Wikipedia page
    • They use Organisation schema on their about page
    • They have thorough product details in Schema markup… to start.
  54. It’s about reality, not theory.

    • What does the SERP look like? Check more than once
    • Talk to the customer service teams and sales reps
    • Use journey tracking tools like Fullstory
    • Do analysis in their analytics and review channel trends
    • Write more useful content (this can be a difficult conversation)
  55. It’s a lot like Online Reputation Management (ORM). At a

    glance, it’s a lot like owning the SERP. It’s amplification on ADHD meds. Because in order to get those 30 encounters, you will likely need to think beyond the bounds of your (client’s) website, with things like:
    • Podcasts
    • YouTube
    • An app
    • Interviews & press
  56. It’s a lot like local search. Name, Address, Phone (NAP)

    for entity optimisation is doing everything possible to facilitate those 30 encounters and making sure the information is consistent… …and then linking it all back through one of Google’s native languages, Schema markup.
  57. It’s a lot like voice search and rich snippet optimisation.

    • You have one result, rather than ten
    • The result presented may be incomplete, changed, or taken out of context
    • HTML formatting can affect whether, or in what way, the information is presented
  58. It’s a lot like brand building.

    • Evidence of reputation: user engagement, popularity, user reviews on-site
    • Links and mentions: from authoritative places and topic experts
    • Popularity: social media involvement, mentions in forums, comments around the web
  59. 3 takeaways To move forward with the new search experience

    in step with the business
    1. Google can understand your brand. You can’t fake it by ranking for “important” keywords.
    2. SEO is no longer simply your client’s website. It’s all SERP verticals.
    3. Success in modern SEO requires a return to a traditional model: the brand.
  60. Amanda King is human

    • 15+ years in the SEO industry
    • Business- and product-focussed
    • AI & LLM forward strategies
    • Visited 40+ countries, lived in 3
    • Always learning
    • Slightly obsessed with tea