❌ 1. Alternate URL Markdown variants
• Create a .md version of each page and optionally reference it in llms.txt.
• Today: No major AI crawler reliably consumes llms.txt (GPTBot, ClaudeBot, and PerplexityBot largely ignore it).
• Net: Possible future value, but no measurable benefit today.
2. Real-time format switching using User-Agent detection
• Cloudflare Workers pattern: detect GPTBot/ClaudeBot/Perplexity via the User-Agent header and serve a Markdown version of the page.
• HTML → Markdown conversion is sometimes done on the fly (using Turndown or Workers AI).
• Let's take a look at this one! 👉
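To make the routing decision concrete, here is a minimal sketch in Python rather than in an actual Worker. The function names, the token lists, and the choice to always send search crawlers the HTML are illustrative assumptions, not a description of any specific production setup.

```python
# Hypothetical sketch: pick which representation of a page to serve.
# Assumption: these substrings appear in the crawlers' User-Agent headers.
AI_BOT_TOKENS = ("gptbot", "claudebot", "perplexitybot")
SEARCH_BOT_TOKENS = ("googlebot", "bingbot")


def choose_format(user_agent: str) -> str:
    """Return "markdown" for known AI crawlers, otherwise "html"."""
    ua = (user_agent or "").lower()
    # Search engine crawlers always get the canonical HTML.
    if any(token in ua for token in SEARCH_BOT_TOKENS):
        return "html"
    if any(token in ua for token in AI_BOT_TOKENS):
        return "markdown"
    return "html"


def response_headers(fmt: str) -> dict[str, str]:
    """Headers for the chosen format; the Markdown variant is marked noindex."""
    if fmt == "markdown":
        return {
            "Content-Type": "text/markdown; charset=utf-8",
            "X-Robots-Tag": "noindex",
            "Vary": "User-Agent",
        }
    return {"Content-Type": "text/html; charset=utf-8", "Vary": "User-Agent"}
```

Humans and unknown agents fall through to HTML, so a missed or spoofed User-Agent never exposes the Markdown variant by default.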
• LLMs love clean text without UI bloat.
• Stripping out noise should help LLMs understand hierarchy, entities, and the important information in long text.
• When building RAG systems, it's always better to clean and prepare context before ingestion, so maybe it makes sense to do the same for AI search agents?
• No controlled tests show that Markdown pages increase citations in GPT/Claude/Perplexity.
• No AI crawler has stated a preference for Markdown.
• Google's John Mueller explicitly says models parse HTML fine and special LLM formats don't help.
• Community tests show AI agents continue to cite the HTML versions even when Markdown equivalents exist.
Faster Iteration
• For high-stakes product pages, HTML updates may require long-running approvals.
• Edge-served Markdown allows “fast patches” without touching the core site.
• We've heard from companies that it takes 3 months to change a core product page.
Light content expansion
• Some teams add slightly more explicit content for AI agents only.
• Safe only if the content is equivalent in meaning to the HTML page.
… risk
• Safe only if Googlebot always sees the HTML and the Markdown is fully noindex + disallowed.
• If the Markdown becomes indexable or shows different claims → cloaking.
2. User confusion risk
• Giving AI agents richer content than humans can access creates:
◦ UX mismatches
◦ Trust issues
◦ A future-enforcement risk as AI platforms evolve anti-manipulation rules
3. Content parity matters
• Expanded AI-only FAQ sections are dangerous unless that same info exists somewhere accessible to users.
• AI answers that cite non-existent HTML content lead to churn and lost trust.
Best Practice
… Markdown-only AI versions
Put the content you want AIs to use directly into the core HTML page.
• Clean HTML
• Strong top-of-page summaries
• Clear lists, steps, FAQs
• Entity clarity and E-E-A-T signals
LLMs and AI search engines reward content that is clean, structured, and predictable.
Markdown advantages:
• … list patterns
• Minimal styling and layout bloat
• Easier extraction for embeddings and RAG pipelines
• Cleaner diffs for refresh workflows
• More maintainable across large content libraries
Ideal use cases:
• Anything designed to be frequently updated or programmatically reused
Where to avoid Markdown:
• Heavy design pages (landing pages, product marketing)
• Interactive content requiring rich components
… exist.
Why not?
• LLM-formatted pages don't get special treatment
• Fan-out queries in SERP return regular pages today
• File type doesn't matter: .txt, .md, .html all rank the same
• “Hey AI, use this” messages don't work…
• Citations come from match quality
… standardized vocabulary (JSON-LD, microdata) that tells machines what your page means, not just what it says.
Examples
• “This page is a Product; here is its price, brand, rating.”
• “This content is an Article; here's the author, date, and headline.”
• “These lines are Questions and Answers.”
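As an illustration of the "this page is a Product" example, the sketch below builds a schema.org Product description as JSON-LD from Python. The helper names and all values are placeholders made up for the example; the property names (`offers`, `aggregateRating`, and so on) are standard schema.org vocabulary.

```python
import json


def product_jsonld(name: str, brand: str, price: str, currency: str,
                   rating: float, review_count: int) -> str:
    """Build a schema.org Product description as a JSON-LD string."""
    data = {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": name,
        "brand": {"@type": "Brand", "name": brand},
        "offers": {
            "@type": "Offer",
            "price": price,
            "priceCurrency": currency,
        },
        "aggregateRating": {
            "@type": "AggregateRating",
            "ratingValue": rating,
            "reviewCount": review_count,
        },
    }
    return json.dumps(data, indent=2)


def as_script_tag(jsonld: str) -> str:
    """Wrap JSON-LD in the script tag that is embedded in the page's HTML."""
    return f'<script type="application/ld+json">\n{jsonld}\n</script>'


# Placeholder values, not real product data.
snippet = as_script_tag(
    product_jsonld("Example Widget", "ExampleCo", "49.00", "USD", 4.6, 128)
)
```

The resulting `snippet` is what would be placed in the page's HTML, and the values should match what the visible page says.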
Schema doesn't move rankings directly
Search engines explicitly say they use structured data to:
• … content
• Populate rich results (stars, FAQs, how-to steps, product cards, etc.).
(Google has said “structured data won't make your site rank better”; it mainly powers features.)
Structured data is a dense factual layer many LLMs see during training:
• … extracts billions of Schema.org triples (Organization, Product, FAQPage, etc.) directly from Common Crawl and publishes them as training corpora.
• LLM-based tools prefer structured, unambiguous content, so schema gives them a stronger signal that “this is the product / this is the answer / these are the steps.”
| Schema Type | What it is | Notes |
| --- | --- | --- |
| … SearchAction) ⭐ | Describes your company/site: name, logo, URLs, social profiles, site search. | Core entity signal. Helps Google/Bing/LLMs understand who you are and connect you across the web. |
| Product / SoftwareApplication / WebApplication (+ Offer, AggregateRating) ⭐ | Describes products/SaaS/apps, including price, availability, ratings, etc. | Critical for ecommerce/SaaS. Drives product rich results and feeds AI commerce experiences; strong CTR evidence. (Google for Developers) |
| Article / BlogPosting (+ Person author) ⭐ | Marks content as an article with headline, date, author, publisher. | Helps systems understand topical content + who wrote it (E-E-A-T). Good for news, blogs, thought leadership. (Google for Developers) |
| FAQPage / QAPage | Encodes Q&A pairs on a page. | Once huge for rich snippets, now heavily restricted in Google SERPs; still useful as machine-readable Q&A for AI answers. (SearchPilot) |
| HowTo | Step-by-step instructions for tasks. | Very aligned with how AI and rich results present procedures (steps, tools, time). Great for docs/help content. (Google for Developers) |
| Review / AggregateRating ⭐ | Star ratings and review information for products/services/content. | Strong CTR driver when stars show; very persuasive in both classic SERPs and AI-commerce contexts. (SearchPilot) |
• … cluster-analyze server logs to see which pages AI bots hit most frequently, then prioritize those for updates.
• Works similarly to prioritizing Googlebot crawl patterns.
What to do:
• Export logs → cluster by page to see AI bot frequency → treat those URLs as high-value AEO pages.
• AirOps natively supports this in beta and can help with refresh workflows!
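The "export logs → cluster by page" step can be sketched as below. This is a minimal example that assumes access logs in the common "combined" format and matches bots by User-Agent substring; the regex, the bot list, and the function names are assumptions for illustration, not how any particular tool implements it.

```python
import re
from collections import Counter, defaultdict

# Assumption: logs are in the "combined" access log format.
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ "[^"]*" "(?P<ua>[^"]*)"'
)
AI_BOTS = ("GPTBot", "ClaudeBot", "PerplexityBot")


def ai_bot_hits(lines):
    """Count AI-bot requests per URL path, broken down by bot."""
    hits = defaultdict(Counter)
    for line in lines:
        match = LOG_LINE.search(line)
        if not match:
            continue  # skip lines that do not look like access log entries
        user_agent = match["ua"].lower()
        for bot in AI_BOTS:
            if bot.lower() in user_agent:
                # Group by page: drop the query string.
                path = match["path"].split("?", 1)[0]
                hits[path][bot] += 1
                break
    return hits


def top_pages(hits, n=10):
    """Return the n most-crawled paths as (path, total_hits) pairs."""
    totals = Counter({path: sum(by_bot.values()) for path, by_bot in hits.items()})
    return totals.most_common(n)
```

Calling `top_pages(ai_bot_hits(open_log_lines))` on an exported log gives a ranked list of candidate high-value pages; `open_log_lines` here stands for any iterable of log lines.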
• … assistants return 404 URLs ~3× more often than Google
• ChatGPT specifically returns 1.01% hallucinated URLs
• And 2.38% of the URLs it cites are dead
Our goal:
• Detect these hallucinated URLs.
• Turn them into real pages (or redirect intelligently).
These hallucinated URLs = free intent signals.
Source: ahrefs.com
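One way to approach the "detect, then redirect intelligently" goal is sketched below: count 404s that arrive from AI referrers, then propose the closest real path using string similarity from Python's `difflib`. The referrer list, the input tuple shape, and the 0.6 similarity cutoff are assumptions chosen for illustration.

```python
import difflib
from collections import Counter

# Assumption: referrer hostnames that indicate a click from an AI assistant.
AI_REFERRERS = ("chatgpt.com", "perplexity.ai", "claude.ai")


def hallucinated_paths(requests, known_paths):
    """Count 404 hits from AI referrers on paths the site never published.

    `requests` is an iterable of (path, status, referrer) tuples.
    """
    counts = Counter()
    for path, status, referrer in requests:
        from_ai = any(host in referrer for host in AI_REFERRERS)
        if status == 404 and from_ai and path not in known_paths:
            counts[path] += 1
    return counts


def suggest_redirect(path, known_paths, cutoff=0.6):
    """Return the most similar real path, or None if nothing is close enough."""
    matches = difflib.get_close_matches(path, list(known_paths), n=1, cutoff=cutoff)
    return matches[0] if matches else None
```

Paths with many hits and no close match are candidates for new pages; paths with a close match are candidates for redirects.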
… experiences, how we measure the funnel changes.
At a minimum you need to measure:
• How often your brand is mentioned (answer share of voice) for questions customers are actually asking.
• Estimated query volume for those questions.
• Your average position within the answer.
• The sentiment and quality of the responses.
One query can produce different answers every time. Personalization makes this even harder.
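Because one query can produce different answers every time, these metrics only make sense over repeated samples. The sketch below uses one simple working definition, assumed here for illustration: share of voice is the fraction of sampled answers that mention the brand, and position is the brand's rank by first mention among the tracked brands. Function names and inputs are hypothetical.

```python
import re
from statistics import mean


def mention_position(answer: str, brand: str, competitors: list[str]):
    """Rank of `brand` among tracked brands by first mention (1 = first), or None."""
    first_seen = {}
    for name in [brand, *competitors]:
        found = re.search(re.escape(name), answer, flags=re.IGNORECASE)
        if found:
            first_seen[name] = found.start()
    if brand not in first_seen:
        return None
    ordered = sorted(first_seen, key=first_seen.get)
    return ordered.index(brand) + 1


def answer_share_of_voice(answers: list[str], brand: str, competitors: list[str]):
    """Summarize repeated samples of one question: mention rate and average position."""
    positions = [mention_position(a, brand, competitors) for a in answers]
    mentioned = [p for p in positions if p is not None]
    return {
        "samples": len(answers),
        "share_of_voice": len(mentioned) / len(answers) if answers else 0.0,
        "avg_position": mean(mentioned) if mentioned else None,
    }
```

Sentiment and query volume are not covered here; they need separate data sources.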
… on top of Google.
New KPIs to add to your dashboard:
• Brand Visibility Score in LLM answers
• Share of Voice across AI agents
• Citation Rate and Citation Delta
• Sentiment Score of AI summaries
• Prompt-level visibility decay or lift
… inside LLMs.
Fan-out ecosystem strategy:
• Publish insights in multiple formats (video, long-form, short-form, structured)
• Seed content into communities (Reddit, StackOverflow, LinkedIn)
• Expand off-site mentions through partners, PR, podcasts, YouTube
• Ensure internal pages cross-link by topic cluster
• Maintain freshness so LLMs prefer your content as “latest available”
The more surfaces your content appears on, the more authoritative you look to models.
Goal: Create a spiderweb of signals that models repeatedly encounter when answering questions in your category.
• … for knowledge outside of the base corpus of the LLM, they reach for fresh info with fan-out queries
• One prompt = multiple queries
• Each retrieval creates new citation/influence opportunities
• Show up in the ecosystem of the topic
• LLMs reward brands that show up consistently across multiple sources
Takeaways
1. … (sometimes).
3. Yes, go deeper with schema (but prioritize).
4. Yes, analyze server logs for AI bots.
5. Yes, use answer blocks.
6. Yes, capture accidental traffic in LLM URL hallucinations.
… should do it:
• LLMs consistently extract short, declarative answer chunks (FAQs, TL;DRs, numbered lists, definitions).
• SGE / AIO citations often pull from well-structured fragments instead of full paragraphs.
• Publishers that add “Key takeaway boxes” see higher inclusion rates in AI answers.
What to do:
• Add first-paragraph definitions, FAQ blocks, Key Facts sections, and step-by-step blocks.
• Make sure answers are truly information-gain content: team interviews, unique grounding, etc.
• Use <h2> + clean lists.
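The "<h2> + clean lists" pattern can be sketched as a small template: a question as the heading, a short direct answer, then a plain list of key facts. The function name and the sample text are hypothetical, and this is only one possible shape for an answer block.

```python
from html import escape


def answer_block(question: str, answer: str, key_facts: list[str]) -> str:
    """Render a question as an <h2>, a short direct answer, and a clean list of key facts."""
    items = "\n".join(f"  <li>{escape(fact)}</li>" for fact in key_facts)
    return (
        f"<h2>{escape(question)}</h2>\n"
        f"<p>{escape(answer)}</p>\n"
        f"<ul>\n{items}\n</ul>"
    )


# Placeholder content for illustration only.
block = answer_block(
    "Does Example Widget work with Slack?",
    "Placeholder: a one-sentence direct answer goes here.",
    ["Setup & permissions", "Supported plans"],
)
```

Escaping the text keeps the markup clean even when answers contain characters such as `&` or `<`.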
… marketing funnel is compressed.
• ChatGPT usage is compounding while Google is roughly flat
[Figure: LLM Based Discovery Keeps Climbing. ChatGPT visits YoY for US and UK, last 12 months vs. previous 12 months; y-axis 0 to 8B; labels +94% and +131%. Data Source: Kevin Indig & SimilarWeb]
[Figure: … by Freshness Window; Google vs. ChatGPT; values 91, 3.4, 3.2]
• ChatGPT prompts are 10× longer than Google queries
• Even simple informational queries are richer.
• AI compresses the user journey.
• For marketers, intent is revealed in real time.
… in 2026.
• Sam Altman contrasted ChatGPT's model with that of Google Search, saying Google's ad model relies on search results being imperfect (“when search fails”), and that ChatGPT should aim to earn trust rather than monetize by degrading result quality.
• Fidji Simo (OpenAI's Applications head) commented that before ads are considered, the commerce experience must be “fantastic”; she emphasized the importance of user experience and data-use concerns.
• The implication is that ads are less likely to be pay-for-placement and more likely to be enhanced listings. This increases the importance of organic strategy, as it is the baseline of influence.
… of response influence with more on the way.
• Shopping feed submission: merchants can submit rich feeds.
• Apps SDK: rich experiences that pull more of the owned web experience into the response feed. Submissions expected to open in the next 90 days.
• These become important new optimization priorities.
◦ Shopping: information density is critical (Q+A pairs)
◦ Apps SDK: experiment for now if you're consumer-facing
Understand influence footprint
• … being read and/or cited by AI Search
• This will vary by topic (question cluster) and also by stage of the funnel
Identify right questions
• There are evergreen questions that represent common customer intents you should be monitoring
• There are also net new questions that you should be listening for where your real customers are actually talking
• Ask questions frequently and diversely to represent geo/persona/probabilistic diversity
Communities play a big role in building visibility
• … for all types of questions
• What we're seeing:
◦ Reddit is preferred by Perplexity & ChatGPT
◦ LinkedIn is favored by Google AI Mode
◦ Quora is growing quickest within AI Overviews
Answer every “does it do X?” question on-site
• … on feature and use-case content to answer product-fit questions. Build deep pages that spell out features, use cases, and who each is for.
• Treat these as AEO workhorses: they win follow-up questions no one else covers.
… SEO
• Buyers constantly ask, “Does this work with [tool]?”
• Create integration landing pages and blog posts for every meaningful pairing.
• These pages win ultra-specific, long-tail queries that AI search loves.
… citations
• Create videos for your top landing pages
• Drive distribution within YouTube & video platforms like Vimeo
• This drives distribution in LLMs and AI Overviews, which creates a loop
Experiment & Measure
• Split into test (intervene) vs. control (no changes).
• Track share of voice before & after.
• A win = test goes up, control stays flat.
• Tie gains to signups, demos, and pipeline.
• Tag query groups in AirOps to measure delta
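The test-versus-control comparison can be written as a simple difference-in-differences calculation, a standard technique swapped in here for illustration. The inputs are share-of-voice readings (0 to 1) per query group, and the `flat_tolerance` threshold for "control stays flat" is an assumed parameter, not a value from the source.

```python
from statistics import mean


def experiment_lift(test_before, test_after, control_before, control_after):
    """Difference-in-differences on share of voice for test vs. control query groups."""
    test_delta = mean(test_after) - mean(test_before)
    control_delta = mean(control_after) - mean(control_before)
    return {
        "test_delta": test_delta,
        "control_delta": control_delta,
        # Net lift: change in test beyond whatever moved the control group too.
        "net_lift": test_delta - control_delta,
    }


def is_win(result, flat_tolerance=0.02):
    """A win = test goes up while control stays within the tolerance band."""
    return (
        result["test_delta"] > flat_tolerance
        and abs(result["control_delta"]) <= flat_tolerance
    )
```

Subtracting the control delta guards against crediting an intervention for a shift that affected every query group, such as a model update.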
Key Tactics
• Invest in frontier knowledge and original POVs.
• Build structured, scalable content systems.
• Know the questions they want to own and track them.
• Avoid AI spam. Prioritize humans in the loop.
… AI Search?
Get 1-on-1 help from the AirOps team to apply BoFu frameworks (and more) to your own site.
What's included?
• Custom Insights to see where you're winning/losing
• List of top onsite & offsite opportunities
• Full playbook Webflow used to 5x AI signups
Claim 1 of 20 Exclusive AI Search Packages
Quality > Quantity
• ChatGPT converted at 6.9% vs. Google's 5.4%.
• Why higher intent? Users click out after a full multi-turn conversation, with more information.
• Google traffic is diluted. Navigational searches and ads lower average conversion rates.
Data Source: Kevin Indig & SimilarWeb