Advanced SEO Tracking Framework for Zero-Click and AI Search

In this talk, I address the blind spot in the AI search space and propose a framework to map AI-friendly search queries by combining your first-party data from Google Search Console, Bing, and server logs. The objective of the session is to provide a robust and exhaustive tracking framework that captures AI search signals while laying the groundwork for prompt building.

Athens SEO PRO

May 27, 2026

Transcript

  1. Advanced SEO Tracking Framework for Zero-Click and AI Search

    Keynote. Simone De Palma, Technical SEO Manager, TUI
  2. 01 The Blind Spot in Modern Analytics

    02 Upgrade to Data Warehouse
    03 The Fallacy of Prompt Tracking
    04 Zero-Click Tracking Framework for AI Search
    05 Mapping Bot Signals to User Behaviour with Server Logs
    06 Strategic Applications
    07 Key Takeaways
    08 Appendix
  3. Betrayed by your own SEO Report?

    Vicious AI scrapers 🦠 Yes, they swayed your CTR and Avg. Position too
  4. Data Warehousing: THE WHAT

    Your proprietary search data answer the question: “What did the user search for?”
  5. Server Logs: THE HOW

    Your server data answer the question: “How did user-agents access that information?”
  6. Blind Spot

    Search Data Warehousing (THE WHAT): Your proprietary search data answer the question: “What did the user search for?”
    Server Logs (THE HOW): Your proprietary server data answer the question: “How did user-agents access that information?”
  7. RAG queries may surface only via API, not within the BWT interface

    • 92% of ChatGPT agents use the Bing API for current results
    • Bing reveals queries Google anonymizes
    ✅ Store RAG queries from Bing Search
  8. Although Grounding queries…

    ❌ Are not user searches
    ✅ Are the synthetic retrieval queries of the model
    Grounding queries are the best “guess” of an LLM during RAG to validate and synthesize an AI response. 🙏 Cheers, Dawn Anderson
  9. Not to mention…

    ❌ AI citations are sourced from Microsoft Copilot and AI-generated summaries in Bing, with no distinction between the two
  10. Prompt Tracking measures a blank spot

    ❌ Misses the hidden RAG
    ❌ Ignores the randomness of each model
  11. Back to First-Party Data

    Synthetic Signals: Model temperature, Stochastic sampling, Model releases, Grounding
    LLMs fine-tune models based on degrees of randomness at each pre-trained model release, stochastic sampling and grounding
    Actual User Queries: User location, Past conversations, Behavioural signals, Biases and heuristics
    Map your search queries prior to query fan-out and measure brand visibility across LLMs for your most representative topics
  12. Technical Requirements

    • SQL Logic & RegEx
    • Dataform & JavaScript
    • BigQuery (data warehouse)
    • Google Search Console (Native) + Bing API (GCP Cloud Scheduler)
    This is to move past Bing & GSC interface limits (16 months of data)
  13. Data Pre-Processing

    A JavaScript User-Defined Function (UDF) to normalise inputs:
    Raw Input String → Map Accents (e.g. 'é' to 'e') → Lowercase Conversion → Remove Non-ASCII → RegEx Filter: Branded Terms → Clean Output
    This is to focus on non-branded, high-intent queries and to reduce processing overhead in BigQuery.
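The pre-processing steps on this slide can be sketched in Python. This is a minimal illustration, not the JavaScript UDF from the talk: the function names and the `BRANDED` pattern (with the placeholder brand "acme") are assumptions made for the example.

```python
import re
import unicodedata

# Hypothetical brand pattern; replace "acme" with your own branded terms.
BRANDED = re.compile(r"\bacme\b")


def normalise_query(raw: str) -> str:
    """Map accents, lowercase, drop non-ASCII and collapse whitespace."""
    # Decompose accented characters, then drop the combining marks ('é' -> 'e').
    decomposed = unicodedata.normalize("NFKD", raw)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    lowered = no_accents.lower()
    # Remove anything that is still outside ASCII (emoji, symbols).
    ascii_only = lowered.encode("ascii", "ignore").decode("ascii")
    return re.sub(r"\s+", " ", ascii_only).strip()


def clean_queries(raw_queries):
    """Normalise every query and keep only the non-branded ones."""
    cleaned = (normalise_query(q) for q in raw_queries)
    return [q for q in cleaned if q and not BRANDED.search(q)]
```

Running the filter before any aggregation keeps branded rows out of later steps, which is the processing saving the slide refers to.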
  14. GSC Dataform Mart

    Raw GSC Data Stream
    • Filter: Total Clicks <= 10 (Zero-Click Signal)
    • Filter: Avg Position < 3 (High Authority)
    • Filter: Word Count 10-20 (Complex Questions)
    • Filter: Is_anonymized_query = False
    Result: Queries where users saw the answer (Position < 3) but were unlikely to visit (Clicks <= 10)
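The four filters can be expressed as a single predicate. The sketch below assumes the data has already been aggregated to one row per query; the `QueryStats` class and its field names are hypothetical stand-ins for whatever columns your mart exposes.

```python
from dataclasses import dataclass


@dataclass
class QueryStats:
    # Hypothetical, pre-aggregated query-level row.
    query: str
    total_clicks: int
    avg_position: float
    is_anonymized_query: bool


def is_zero_click_candidate(row: QueryStats) -> bool:
    """Apply the slide's thresholds: few clicks, top position, long query."""
    word_count = len(row.query.split())
    return (
        not row.is_anonymized_query
        and row.total_clicks <= 10      # zero-click signal
        and row.avg_position < 3        # high authority
        and 10 <= word_count <= 20      # complex questions
    )


def zero_click_queries(rows):
    """Return the query strings that pass every filter."""
    return [row.query for row in rows if is_zero_click_candidate(row)]
```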
  15. Bing API Dataform Mart

    • Build a script to get Bing API data
    • GetPageStats and GetQueryStats: aggregate page-level to query-level stats
    • Google Cloud Scheduler: schedule a daily API flow
    • Bing_webmaster.bing_query_page
    • Apply UDF pre-processing
    • Normalized Bing data
    Result: Queries where users saw the answer (Position < 3) but were unlikely to visit (Clicks <= 10)
    More chances to find LLM RAG queries than doughnuts
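The page-level to query-level aggregation could look roughly like this. The row shape (`query`, `impressions`, `clicks`, `avg_position`) is an assumption for illustration and is not the Bing API response format; weighting the position by impressions is also a design choice of this sketch rather than something stated in the talk.

```python
from collections import defaultdict


def aggregate_to_query_level(rows):
    """Roll up (query, page) rows into one record per query."""
    totals = defaultdict(lambda: {"impressions": 0, "clicks": 0, "weighted_pos": 0.0})
    for row in rows:
        t = totals[row["query"]]
        t["impressions"] += row["impressions"]
        t["clicks"] += row["clicks"]
        # Weight each page's position by its impressions.
        t["weighted_pos"] += row["avg_position"] * row["impressions"]

    result = {}
    for query, t in totals.items():
        avg = t["weighted_pos"] / t["impressions"] if t["impressions"] else None
        result[query] = {
            "impressions": t["impressions"],
            "clicks": t["clicks"],
            "avg_position": avg,
        }
    return result
```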
  16. AI Tracker Alert: Not All Long Queries are Real Searches

    Tracking tools alter queries by adding pre-set country/language modifiers to your submitted query for prompt tracking
    💡 Use Regex to identify and remove synthetic modifiers before querying your merged dataset
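As an illustration of the regex clean-up, the sketch below strips a few modifier shapes. The patterns in `SYNTHETIC_MODIFIERS` are entirely hypothetical: the talk does not list the modifiers that tracking tools add, so inspect your own data and replace them.

```python
import re

# Hypothetical modifier shapes; derive the real ones from your own dataset.
SYNTHETIC_MODIFIERS = re.compile(
    r"\s*(?:\b(?:answer|respond|reply) in (?:english|greek|italian)\b"
    r"|\bfor users in (?:the )?(?:uk|us|greece|italy)\b"
    r"|\[(?:country|lang)=[a-z-]+\])",
    re.IGNORECASE,
)


def looks_synthetic(query: str) -> bool:
    """Flag queries that carry a tool-added modifier."""
    return SYNTHETIC_MODIFIERS.search(query) is not None


def strip_synthetic_modifiers(query: str) -> str:
    """Remove the modifiers and tidy the remaining whitespace."""
    cleaned = SYNTHETIC_MODIFIERS.sub("", query)
    return re.sub(r"\s+", " ", cleaned).strip()
```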
  17. Funnel Intent Mapping

    Information Stage. User intent: User wants an answer. Metric: SUM(Impressions)
    ^(what|what is|who|why|when|how does|how do|how much|how long|explain|guide|provide a)\b.*
    Consideration Stage. User intent: User is comparing options. Metric: SUM(Impressions)
    ^(show me|find me|help me find|best|list|review|ratings|top|leading|recommended|alternatives to|competitors of)\b.*
    Transactional Stage. User intent: User is ready to convert. Metric: SUM(Impressions)
    ^(give me|step by step|buy|purchase|pricing|price|cost|deal|discount|cheap|last minute|checkout|book me|book|check)\b.*
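The three patterns above can be applied in Python to bucket queries and sum impressions per stage. This is a sketch that assumes queries arrive already lowercased by the pre-processing step; the function names, the `(query, impressions)` row shape and the "unclassified" bucket are additions for the example.

```python
import re
from collections import Counter

# Patterns copied from the slide; the first matching stage wins.
STAGE_PATTERNS = [
    ("information", re.compile(
        r"^(what|what is|who|why|when|how does|how do|how much|how long|explain|guide|provide a)\b.*")),
    ("consideration", re.compile(
        r"^(show me|find me|help me find|best|list|review|ratings|top|leading|recommended|alternatives to|competitors of)\b.*")),
    ("transactional", re.compile(
        r"^(give me|step by step|buy|purchase|pricing|price|cost|deal|discount|cheap|last minute|checkout|book me|book|check)\b.*")),
]


def classify(query: str) -> str:
    """Return the funnel stage for a query, or 'unclassified'."""
    normalised = query.lower().strip()
    for stage, pattern in STAGE_PATTERNS:
        if pattern.match(normalised):
            return stage
    return "unclassified"


def impressions_by_stage(rows):
    """Sum impressions per funnel stage for (query, impressions) pairs."""
    totals = Counter()
    for query, impressions in rows:
        totals[classify(query)] += impressions
    return dict(totals)
```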
  18. Yet that doesn’t Cover the Blind Spot

    Bias: Pre-processing introduces confirmation bias
    Sample Size: First-party data is always a sample, never the universe
    Personas: Prompts derived from queries don’t always map to user personas
  19. ChatGPT-User

    • OpenAI’s on-demand user-agent: ChatGPT-User visits your site only when a real user’s question requires information from your page.
    • Direct intent signal: triggered exclusively by prompts inside ChatGPT, giving you a true reflection of user demand.
    • JavaScript-agnostic: it reads raw HTML only, ignoring client-side rendering.
    Let’s prove it!
  20. Pages with More JavaScript Text were Less Retrieved by ChatGPT-User

    [Figure: Log Hits per Cluster]
    But Also:
    1. Content is not Self-Contained
    2. Navigational Search Intent
  21. Pages with Less JavaScript Text were More Retrieved by ChatGPT-User

    [Figure: Log Hits per Cluster]
    But Also:
    1. Content is Self-Contained
    2. Informational Search Intent
  22. Technical Requirements

    💡 All you need is not Accuracy, but a Good Sample
    01. Log Access: Access via web server or CDN (e.g. Akamai) and a data analytics solution (e.g. Grafana, BigQuery).
    02. Data & Scripting: Basic Python & SQL capabilities (LLMs can generate most of the code).
    03. Verification: Use DNS Checker to run IP lookups. Essential to filter out bot impersonators.
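One way to automate the verification step is sketched below. Note that it swaps in a different technique from the DNS lookups named on the slide: it checks client IPs against a list of trusted CIDR ranges, assuming you have obtained such a list for the bot. The ranges shown are reserved documentation addresses used as placeholders, and `build_verifier` is a hypothetical helper name.

```python
import ipaddress

# Placeholder ranges (reserved documentation blocks), NOT real bot ranges.
TRUSTED_RANGES = ["192.0.2.0/24", "2001:db8::/32"]


def build_verifier(cidr_ranges):
    """Return a function that tells whether an IP falls in a trusted range."""
    networks = [ipaddress.ip_network(cidr) for cidr in cidr_ranges]

    def is_verified(ip: str) -> bool:
        try:
            address = ipaddress.ip_address(ip)
        except ValueError:
            # Malformed log value: treat as unverified.
            return False
        return any(address in network for network in networks)

    return is_verified


is_verified = build_verifier(TRUSTED_RANGES)
```

A hit whose user-agent claims to be the bot but whose IP fails this check would be treated as an impersonator and dropped.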
  23. Fetch Clear ChatGPT-User Signal

    1. All Server Traffic (7-Day Sample)
    2. Filter: HTTP status 200/304
    3. Filter: Content-Type: text/html
    4. Filter: UA “ChatGPT-User”
    5. Filter: cliIp (Verified IP)
    6. Clean AI retrieval dataset
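The filter chain can be sketched over parsed log records. The dictionary keys `status`, `content_type` and `user_agent` are hypothetical names for this example (only `cliIp` appears on the slide), and `verified_ips` is assumed to be a set of addresses you have already verified separately.

```python
def filter_chatgpt_user_hits(records, verified_ips):
    """Keep only successful HTML hits from verified ChatGPT-User requests."""
    clean = []
    for rec in records:
        # Step 2: HTTP status 200 or 304.
        if rec.get("status") not in (200, 304):
            continue
        # Step 3: HTML documents only (ignore images, scripts, styles).
        if not str(rec.get("content_type", "")).lower().startswith("text/html"):
            continue
        # Step 4: user-agent string mentions ChatGPT-User.
        if "chatgpt-user" not in str(rec.get("user_agent", "")).lower():
            continue
        # Step 5: client IP has been verified.
        if rec.get("cliIp") not in verified_ips:
            continue
        clean.append(rec)
    return clean
```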
  24. Get the Query

    Uhm, ChatGPT sends repeated hits from the same IP. I need to group repeated hits into single events
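A simple way to group repeated hits is to start a new event whenever the gap between two hits from the same IP exceeds a threshold. The sketch below assumes hits are `(timestamp, ip, path)` tuples; the 60-second `max_gap` is an arbitrary illustrative value, not a figure from the talk.

```python
from datetime import datetime, timedelta


def group_hits_into_events(hits, max_gap=timedelta(seconds=60)):
    """Merge hits from the same IP into one event while gaps stay small."""
    events = []
    last_seen = {}  # ip -> (timestamp of last hit, index of its event)
    for ts, ip, path in sorted(hits):
        previous = last_seen.get(ip)
        if previous is not None and ts - previous[0] <= max_gap:
            # Same retrieval burst: extend the existing event.
            index = previous[1]
            events[index]["paths"].append(path)
            events[index]["end"] = ts
        else:
            # First hit from this IP, or the gap is too large: new event.
            events.append({"ip": ip, "start": ts, "end": ts, "paths": [path]})
            index = len(events) - 1
        last_seen[ip] = (ts, index)
    return events
```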
  25. Normalise URL Paths into Queries & HDBSCAN Clustering

    Aim for 15–30 clusters depending on site size. The number of clusters depends on how many rows are exported
  26. From Clustered URL Paths to Synthetic Queries

    Input: URL Cluster
    holidays/weather/europe/italy
    holidays/weather/europe/greece
    holidays/weather/europe/spain
    HDBSCAN Clustering
    holidays/weather/europe/italy = 1
    holidays/weather/europe/greece = 2
    holidays/weather/europe/spain = 3
    Natural Language Processing: One Hot Encoding (010101 -> Word)
    Output: Semantic Label (Greek Islands Holidays)
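The normalisation and encoding steps that precede clustering might look like the following. This is a sketch under assumptions: the function names are hypothetical, the encoding is a binary bag-of-words over path tokens (one flag per vocabulary word), and the clustering itself is left out.

```python
import re
from urllib.parse import unquote, urlsplit


def path_to_query(url: str) -> str:
    """Turn a URL or path into a query-like string of its path tokens."""
    path = unquote(urlsplit(url).path).lower()
    # Drop a trailing file extension such as .html.
    path = re.sub(r"\.[a-z0-9]{2,5}$", "", path)
    tokens = [t for t in re.split(r"[/_\-\s]+", path) if t]
    return " ".join(tokens)


def one_hot_encode(queries):
    """Encode each query as a 0/1 vector over the sorted token vocabulary."""
    vocabulary = sorted({token for q in queries for token in q.split()})
    position = {token: i for i, token in enumerate(vocabulary)}
    vectors = []
    for q in queries:
        vector = [0] * len(vocabulary)
        for token in q.split():
            vector[position[token]] = 1
        vectors.append(vector)
    return vocabulary, vectors
```

The resulting vectors are the kind of input a clustering library would take, for example the HDBSCAN implementation in scikit-learn.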
  27. Explore New Content Angles & FAQs

    Users ask long queries → AI Mode/AI Overview (Zero-Clicks) → Users have follow-up questions → Your content may surface in the hidden fan-out
    Create detailed FAQs
    • Identify questions in the "Information" bucket
    • Feed AI snippets and provide depth that they can’t match
    • Focus on the question the user asks after the zero-click answer
    If users are asking long-form questions and not clicking, the AI snippet is likely answering them
  28. Integrate with Digital PR & Paid Media

    Zero-click searches rely on brand mentions
    Share "Consideration" stage queries with PR teams. Goal: Guide earned media coverage to ensure the brand is associated with these topics in the training data.
    If organic isn't getting the click, a paid advert might
    Feed zero-click queries to the paid team. Goal: Use these terms for high-intent targeting to re-engage users who consumed the answer but didn't convert.
  29. Knowledge Protection and Offensive Actions

    Tactic: Improve SEO reporting by crossing behavioural search data with AI bot access signals. Target: Internal team (SEO, Paid media)
    Tactic: Prompt building and tracking by funnel stage to intercept gaps in ChatGPT retrieval. Target: Clients, company stakeholders and C-Suite
  30. Prompt Building

    Query: 2 weeks in Mexico all inclusive from uk with flight
    =CONCAT()
    Prompt: Book 2 weeks in Mexico all inclusive from uk with flight
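The same concatenation can be done in code when building prompts in bulk. In this sketch only the "Book" prefix comes from the slide; the other entries in `STAGE_PREFIX` are hypothetical examples borrowed from the funnel-stage vocabulary.

```python
# Only "Book" is taken from the slide; the other prefixes are assumptions.
STAGE_PREFIX = {
    "information": "Explain",
    "consideration": "Show me the best",
    "transactional": "Book",
}


def build_prompt(query: str, stage: str) -> str:
    """Prefix a search query with the verb for its funnel stage."""
    return f"{STAGE_PREFIX[stage]} {query.strip()}"
```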
  31. Tracking AI Visibility Requires Mapping Search Behaviour to AI Bots Signal…

    Zero-Click AI Search Tracking Framework
    • Export GSC + Bing raw data to BigQuery
    • Filter for zero-click queries (clicks <= 10) and focus on long-form queries (10-20 words)
    • Use Bing data as your best proxy for LLM RAG query intent
    💡 Bypass GSC’s 16-month cap & profit from anonymised events
    Server Logs Mapping of AI Bot Signals
    • Isolate the ChatGPT-User user-agent in your logs
    • Run a reverse DNS lookup to filter out bot impersonators
    • Less JavaScript = more LLM retrieval: optimise for raw HTML delivery
    💡 Raw, unsampled signal of how AI systems interact with your site
  32. …to serve an Omnichannel Strategy

    Content Enhancements: For informational queries where the AI provides the answer, explore new content angles and assess the opportunity for detailed FAQs
    Offensive Digital PR: Share "Consideration" stage gaps with PR teams to ensure your brand is mentioned in the earned media that AI models use as training data
    Partner with Paid Search: If organic content is losing clicks to AI Overviews, feed high-intent terms to Paid Media to re-engage users with targeted ads
  33. • Blake, C. and Scharf, A. (2025) 87% of SearchGPT citations match Bing’s top results. SEER Interactive. Available at: https://www.seerinteractive.com/insights/87-percent-of-searchgpt-citations-match-bings-top-results

    • Chang, T. (2026) Bad Bot Report 2026: The Internet Is No Longer Human and It’s Changing How Business Works. Imperva. Available at: https://www.imperva.com/blog/bad-bot-report-2026-bots-agentic-age/
    • Scholz, J. (2025) When AI agents do the shopping: Insights from 100 conversations with ChatGPT Agent mode. Search Engine Land, 8 October. Available at: https://searchengineland.com/insights-chatgpt-agent-mode-463127
    • OpenAI (2025) Bots: OpenAI platform documentation. Available at: https://platform.openai.com/docs/bots
    • Remy, R. (2025) The Query Fan-Out Session: Server-side Query Fan-Out Tracking. Conversem. Available at: https://conversem.com/the-query-fan-out-session/
    • Eijkemans, R. (2024) Minimal (but powerful) keyword preprocessing in BigQuery. Eikhart. Available at: https://eikhart.com/possibly-useful-article-about/keyword-preprocessing-in-bigquery
    • Giordano, M. (2024) BigQuery for GSC + GA4 data: why use it & how it differs. Seotistics. Available at: https://seotistics.com/bigquery-gsc-ga4-differences/
    • De Palma, S. (2026) Advanced SEO tracking framework: Zero-click AI visibility. SEO Depths. Available at: https://seodepths.com/seo-research/advanced-seo-tracking-framework-zero-click-ai-visibility/
    • De Palma, S. (2026) ChatGPT server logs & AI search tracking. SEO Depths. Available at: https://seodepths.com/seo-research/chatgpt-server-logs-ai-search-tracking/