Debunking and Demystifying Generative Information Retrieval

Generative Information Retrieval was born in 2023, so we're only a couple of years into the biggest paradigm shift the search world has experienced. Inevitably, there's much confusion, mysticism and mythology filling the void of accurate understanding. In this talk you'll learn some of the common jargon - what really matters and what is simply noise (and perhaps even spam that might lead you into hot water) - from chunking to vector databases and llms.txt.

Dawn Anderson

February 05, 2026

Transcript

1. The arrival of 'AI Search' has exacerbated things further, with 'SEO AI' myths we now need to question too #WTSFest @dawnieando
2. Some SEO AI Myths We Need to Employ Critical Thinking About: chunking, llms.txt, information gain, 'AI replaces SEO', and more #WTSFest @dawnieando
3. Why SEO Myths Spread • Rapid evolution • Social echo chambers • Patents ≠ production • Desire for simple levers • Ambiguous terminology #WTSFest @dawnieando
4. The Magic Number 7: 'Chunking' and Information Theory • George Miller (1956), the magic number 7 ± 2 • Chunks help users commit information to working memory #WTSFest @dawnieando
5. Myth: Google Ranks Chunks • SEOs believe sections rank independently (passage indexing) • Reality: chunking = ML preprocessing • Google evaluates pages holistically & by intent nowadays #WTSFest @dawnieando
6. Chunking in Machine Learning • Chunking is a preprocessing technique in machine learning and NLP • Large documents are split into smaller, manageable segments ('chunks') • Helps models process context within token limits • Common in LLM-based retrieval, embeddings, and summarisation systems #WTSFest @dawnieando
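
To make that concrete, here is a minimal sketch of the simplest variant, fixed-size chunking with overlap; the chunk_size and overlap values are illustrative placeholders, not anyone's production settings:

```python
# Minimal sketch of fixed-size chunking: split a document into overlapping
# word windows so each piece fits a model's context budget.

def chunk_text(text: str, chunk_size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into overlapping chunks of roughly chunk_size words."""
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):  # last window reached the end
            break
    return chunks

doc = "one two three " * 100     # 300 words of toy text
print(len(chunk_text(doc)))      # -> 2; the chunks share a 40-word overlap
```
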
7. Why Chunking Is Used in ML • Language models have token limits • Chunking helps avoid losing information that would otherwise be truncated • Enables vector search and retrieval-augmented generation (RAG) • Supports knowledge retrieval rather than ranking or quality evaluation #WTSFest @dawnieando
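
As a toy illustration of the vector-search point: each chunk is embedded, scored against the query, and the closest chunks are handed to the model as context. The bag-of-words 'embedding' below is a stand-in for a real embedding model:

```python
# Toy retrieval over chunks: embed each chunk, score against the query,
# return the best match. Real systems use learned embeddings and a
# vector database instead of this bag-of-words stand-in.

import math
from collections import Counter

def embed(text: str) -> Counter:
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

chunks = [
    "chunking splits long documents into smaller segments",
    "token limits constrain how much context a model can read",
    "retrieval augmented generation fetches relevant chunks at query time",
]
query = embed("why do models need chunks for retrieval")
best = max(chunks, key=lambda c: cosine(query, embed(c)))
print(best)  # the most similar chunk would be passed to the LLM as context
```
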
8. Several Types of ML Chunking Too

| Method | Semantic quality | Control | Speed | Best for |
| --- | --- | --- | --- | --- |
| Fixed-size | Low–medium | High | Very fast | Embeddings, RAG |
| Sentence/paragraph | High | Medium | Fast | Summaries, QA |
| Semantic | Very high | Medium–low | Medium | Knowledge retrieval |
| Sliding window | Medium | High | Medium | Long-context tasks |
| Hierarchical | High | Medium | Medium | Structured documents |
| Model-specific | Varies | High | Varies | LLMs with long context |
| Adaptive token-aware | High | High | Medium | Production RAG |

#WTSFest @dawnieando
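
As a sketch of the 'Sentence/paragraph' row above: split on sentence boundaries, then pack whole sentences into chunks up to a word budget. The regex splitter is a deliberate simplification of a real sentence tokeniser:

```python
# Sentence/paragraph chunking sketch: split on sentence boundaries, then
# pack whole sentences into chunks no larger than a word budget.

import re

def sentence_chunks(text: str, max_words: int = 60) -> list[str]:
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current, count = [], [], 0
    for s in sentences:
        n = len(s.split())
        if current and count + n > max_words:  # budget exceeded: flush chunk
            chunks.append(" ".join(current))
            current, count = [], 0
        current.append(s)
        count += n
    if current:
        chunks.append(" ".join(current))
    return chunks

text = "Chunking splits documents. Models have token limits. Overlap preserves context."
print(sentence_chunks(text, max_words=8))
# -> ['Chunking splits documents. Models have token limits.', 'Overlap preserves context.']
```
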
9. Why Chunking Is Misunderstood • LLMs use chunks • ML terms misread as actionable SEO tactics • Narrative simplicity #WTSFest @dawnieando
10. Chunking Summary • Chunking is an ML processing technique, not an SEO tactic • It has no direct or indirect ranking impact • SEO success comes from content quality, relevance, and site performance • Ignore the hype; stay focused on proven ranking factors #WTSFest @dawnieando
11. What to Focus On Instead • Clear, readable, user-first content • Strong topical authority and internal linking • Satisfying user intent • Technical performance and UX • Structured data, accessibility, crawlability #WTSFest @dawnieando
12. What llms.txt Is • An LLM-friendly markdown file • It's nothing like robots.txt • It provides easily consumable excerpts of the most important pages on a website • Lots of confusion over its purpose • It's not for crawling or for controlling LLMs #WTSFest @dawnieando
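
For orientation, the llmstxt.org proposal is plain markdown: an H1 with the site name, a blockquote summary, then sections of annotated links (an 'Optional' section marks links that can be skipped when context is tight). The names and URLs below are placeholders:

```markdown
# Example Site

> One-paragraph summary of what the site covers, written for LLM consumption.

## Key pages

- [Product overview](https://example.com/overview.md): what the product does
- [Pricing](https://example.com/pricing.md): plans and limits

## Optional

- [Changelog](https://example.com/changelog.md): release history
```
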
13. Myth: Information Gain Is a Ranking Factor • Belief: unique information = ranking gains • Reality: a patent concept, not necessarily deployed • Truth: Google prefers helpfulness #WTSFest @dawnieando
14. Classification Modelling • Information gain predicts how 'pure' a classifier's split of the data is • Purer splits with respect to a 'class' give higher information gain #WTSFest @dawnieando
15. Information Gain • Quantifies the reduction in entropy after splitting the data on a particular feature • Higher gain means a more useful split #WTSFest @dawnieando
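
A worked sketch of that definition: information gain is the parent set's entropy minus the weighted entropy of the subsets a split produces. The spam/ham labels are toy data, purely for illustration:

```python
# Information gain = entropy(parent) - weighted entropy of the split subsets.
# A perfectly pure split recovers the full parent entropy as gain.

import math

def entropy(labels: list[str]) -> float:
    total = len(labels)
    return -sum(
        (labels.count(c) / total) * math.log2(labels.count(c) / total)
        for c in set(labels)
    )

def information_gain(parent: list[str], splits: list[list[str]]) -> float:
    weighted = sum(len(s) / len(parent) * entropy(s) for s in splits)
    return entropy(parent) - weighted

parent = ["spam"] * 5 + ["ham"] * 5                  # entropy = 1.0 bit
pure = [["spam"] * 5, ["ham"] * 5]                   # perfectly pure subsets
mixed = [["spam"] * 3 + ["ham"] * 2, ["spam"] * 2 + ["ham"] * 3]

print(information_gain(parent, pure))    # 1.0 -> maximally useful split
print(information_gain(parent, mixed))   # ~0.03 -> barely useful split
```
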
16. Why Could Information Gain Be Misread? • Patent worship • Desire for measurable factors #WTSFest @dawnieando
17. Google on AI-Generated Content: "Automation has long been used to generate helpful content, such as sports scores, weather forecasts, and transcripts" #WTSFest @dawnieando
18. Thank you • X: @dawnieando • Bluesky: @dawnieando • LinkedIn: Ms Dawn Anderson • Bertey.com #WTSFest @dawnieando