

Rachel Anderson - The Secret to Enterprise Crawl Optimization: SRE


Tech SEO Connect

December 12, 2025

Transcript

  1. Rachel Anderson, Sr. SEO Manager @ Weedmaps. The Secret to Enterprise Crawl Optimization: SRE. December 4-5, 2025, Durham, NC
  2. Rachel Anderson, Sr. SEO Manager @ Weedmaps. The Secret to Enterprise Crawl Optimization: SRE. December 4-5, 2025, Durham, NC. With help from: The Royal Court of Princess Donut
  3. Princess Donut: “You will not be lobbing balls at me, Carl. My word. Do I look like a cocker spaniel to you?”
  4. Technical SEO Fundamentals
     • Crawl
     • Render
     • Index
     If any of these processes break down, your content isn’t going to get organic traffic
  5. Technical SEO Fundamentals in the Age of GEO/AIO
     • Crawl: Are you blocking Princess Donut?
     • Render: Princess Donut doesn’t read JS? She’s a cat!
     • Index: Princess Donut has knowledge straight in her mind; she doesn’t need a traditional index
  6. Technical SEO Fundamentals in the Age of GEO/AIO: Crawl is SUPER important
     • What are Google/Bing crawling? What are AI bots crawling?
     • Is it the most important content?
     • Are they getting 200s?
     • How is our response time?
  7. Crawl Stats - the SEO’s only native crawl tool
     • Introduced in 2020
     • Shows crawl stats from the past 90ish days
       ◦ Total requests
       ◦ Download size
       ◦ Response time
     Some cons:
     • Only available for domain properties, and only segmentable by subdomain
     • Only gives example URLs
     • Only shows crawl data for Googlebot
     Is this data accurate???
  8. Can I trust Crawl Stats?
     • Crawl Stats is great for trends and helpful for small domains
     • The larger your domain, the less accurate it gets
  9. But Rachel, my site is HUUUUGE
     Where do you go for accurate crawl information for large websites? LOG FILES!!!
  10. Getting log files - this is typically a GIANT PITA
      ◦ Is the engineering team actually storing the information you need?
        ▪ Protocol
        ▪ www/non-www
        ▪ Response time
      ◦ Are they storing it for a time frame that is actionable?
      ◦ Can your computer actually handle the file size?
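If the file size is the blocker, streaming the log instead of opening it in a spreadsheet is usually enough. A minimal Python sketch, assuming a gzipped combined-format access log; the file name and regex here are illustrative assumptions, not a format your org necessarily stores:

    import gzip
    import re
    from collections import Counter

    # Combined log format; adjust the pattern to whatever your org actually stores.
    LOG_PATTERN = re.compile(
        r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
        r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]*)" '
        r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<user_agent>[^"]*)"'
    )

    def stream_log(path):
        """Yield parsed entries one at a time so a multi-gigabyte log
        never has to fit in memory (or in Excel)."""
        with gzip.open(path, "rt", errors="replace") as f:
            for line in f:
                match = LOG_PATTERN.match(line)
                if match:
                    yield match.groupdict()

    statuses = Counter()
    for entry in stream_log("access.log.gz"):  # hypothetical file name
        if "googlebot" in entry["user_agent"].lower():
            statuses[entry["status"]] += 1
    print(statuses.most_common())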
  11. Cloud crawlers with log file integration?
      Pros:
      • Your computer doesn’t crash when you try to look at log data
      • You have 3rd party support communicating log file specs
      • Visualizations that merge log and technical crawl data
      Cons:
      • Must overcome the same “are we storing the right data” issues
      • Log files say something isn’t being crawled?!?
        ◦ Is this down because something broke?
        ◦ Is this down because the ingestion pipeline failed?
      • Engineers don’t want to spend time troubleshooting your SEO tool
      • You’re limited to that tool’s visualizations and product development
      • $$$$$
  12. The best way to get log info as an in-house SEO: work with your Site Reliability Engineering (SRE) team
  13. Mordecai: “What’s the point of having me as a manager if you two suicidal idiots don’t listen to my advice?”
  14. Working with SRE
      Who are these people?
      • You’ve probably asked them to whitelist an IP address
      • Reached out when GSC reported server errors
      • Gotten a request to not run a large crawl on Black Friday
      Why?
      • Their whole job is keeping the website up, secure, and fast
      • They really care about and understand crawl
      • They use logs all day!
      • They use advanced log tools that your org is already paying for and maintaining
  15. Step 1: Meet with your SRE team
      • Ask nicely for a meeting to chat with them about crawl
        ◦ Frame them as the crawl experts
        ◦ You’d like to learn:
          ▪ What resources search engines and AI bots are crawling
          ▪ What log file tools the org is already using
  16. The Dungeon Anarchist’s Cookbook: “Hello, Crawler. As you’re about to find, this is a very special book. If you’re reading these words, it means this book has found its way into your hands for one purpose and one purpose only. Together, we will burn it all to the ground.”
  17. Step 2: Educate yourself & form a plan
      • Read up on documentation about the tool
        ◦ Common tools: Datadog, SumoLogic, Splunk
      • Review existing dashboards
      • Define what information you need in a dashboard
        ◦ Which bots?
        ◦ How long?
        ◦ Which segmentations?
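Before writing the dashboard request, it can help to prototype the spec yourself: which bots, what time grain, which segmentations. A rough pandas sketch; the column names are assumptions, not any log tool's schema:

    import pandas as pd

    # Tiny stand-in for parsed log entries (e.g., from the streaming parser above).
    df = pd.DataFrame({
        "timestamp": pd.to_datetime(["2025-12-01 10:00", "2025-12-01 10:05"]),
        "user_agent": ["Googlebot/2.1", "GPTBot/1.0"],
        "status": [200, 500],
        "path": ["/menus/", "/brands/"],
    })

    # The slide's three dashboard questions as one aggregation:
    # which bots, over what time grain, with what segmentation.
    summary = (
        df.assign(day=df["timestamp"].dt.date)
          .groupby(["day", "user_agent", "status"])
          .size()
          .rename("requests")
          .reset_index()
    )
    print(summary)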
  18. Step 3: Build a dashboard!
      • Asked our SRE team to build a dashboard with specific criteria
      • Refined the request with a staff engineer
      • Better dashboard; cheaper storage
  19. Step 3: Build a dashboard! Which user agents to track:
      • Search engine bots: googlebot, bingbot (possibly others)
      • ChatGPT bots: gptbot, chatgpt-user, oai-searchbot
      • Perplexity bots: perplexitybot, perplexity-user
      • Claude bots: claudebot, claude-searchbot, claude-user
      • Other: ccbot (Common Crawl)
      Check out https://dejan.ai/blog/ai-bots/
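For the dashboard filters, the tokens above can be folded into one matcher. A Python sketch; note this is substring matching only, with no reverse-DNS or IP verification, so spoofed user agents will slip through:

    import re

    # User agent tokens from the slide, grouped by owner.
    BOT_PATTERNS = {
        "google": re.compile(r"googlebot", re.I),
        "bing": re.compile(r"bingbot", re.I),
        "openai": re.compile(r"gptbot|chatgpt-user|oai-searchbot", re.I),
        "perplexity": re.compile(r"perplexitybot|perplexity-user", re.I),
        "anthropic": re.compile(r"claudebot|claude-searchbot|claude-user", re.I),
        "common_crawl": re.compile(r"ccbot", re.I),
    }

    def classify_bot(user_agent: str) -> str | None:
        """Return the bot family for a raw User-Agent string,
        or None for (presumed) human traffic."""
        for family, pattern in BOT_PATTERNS.items():
            if pattern.search(user_agent):
                return family
        return None

    print(classify_bot("Mozilla/5.0 (compatible; GPTBot/1.2; +https://openai.com/gptbot)"))
    # -> "openai"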
  20. Step 4: Build automated monitoring alerts
      Example alerts:
      • Googlebot hit more than 50% 500s in the past 5 minutes
      • Google got a 404 when requesting the robots.txt file
      • The llms.txt file was crawled!
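The first alert is a rate-over-a-sliding-window check, which tools like Datadog or Splunk evaluate for you. A toy in-process version for intuition; the 20-request minimum-sample guard is an assumption to keep it from firing on two requests:

    from collections import deque
    from datetime import datetime, timedelta

    WINDOW = timedelta(minutes=5)
    THRESHOLD = 0.5  # alert when >50% of Googlebot responses are 5xx

    # (timestamp, status) pairs for Googlebot requests, oldest first.
    events: deque[tuple[datetime, int]] = deque()

    def record(ts: datetime, status: int) -> bool:
        """Add one Googlebot response; return True if the alert should fire."""
        events.append((ts, status))
        # Drop everything older than the 5-minute window.
        while events and events[0][0] < ts - WINDOW:
            events.popleft()
        errors = sum(1 for _, s in events if 500 <= s < 600)
        return len(events) >= 20 and errors / len(events) > THRESHOLD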
  21. When are Google/Bing/AI bots hitting 500s or 400s? This is extremely valuable:
      1. A 3rd party security tool kept blocking googlebot; SRE terminated that contract
      2. Oops, blocking all AI bots!
  22. Other items you may want to track with log file monitoring dashboards
      • Spikes in crawl rate
      • Spikes in page load time
      • HTTP error spikes for “good bots”
      • How soon after publishing is content getting crawled?
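The last item, time from publish to first crawl, is a join between CMS publish timestamps and the first bot hit per URL in the logs. A minimal sketch with hypothetical inputs:

    from datetime import datetime

    # Hypothetical inputs: publish times from your CMS, and the first
    # Googlebot hit per URL pulled from parsed logs.
    published = {"/blog/new-post/": datetime(2025, 12, 1, 9, 0)}
    first_crawl = {"/blog/new-post/": datetime(2025, 12, 1, 14, 30)}

    for path, pub_ts in published.items():
        crawl_ts = first_crawl.get(path)
        if crawl_ts is None:
            print(f"{path}: not crawled yet")
        else:
            print(f"{path}: first crawled {crawl_ts - pub_ts} after publishing")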
  23. The System AI: “New Achievement! Total, Utter Failure. You failed a quest less than five minutes after you received it. Now that’s talent. Reward: Ha!”
      Purpose: Run the earth dungeon crawl. The AI is given the discretionary ability to award certain types of achievements and Loot Boxes up to Platinum status. Often refers to itself as “daddy”
  24. Recap
      • Log files are your best source of crawl data
      • Getting log files is hard, especially if you want AI crawler data
      • Work with SRE for the best results
  25. Work with SRE to build log monitoring dashboards
      1. Meet with your SRE team
         a. Position them as the experts
         b. Figure out what tools your org is already using
      2. Educate yourself
         a. On the capabilities of the tools
         b. On what you actually need monitoring for
      3. Build dashboards in your engineering tools
         a. Track traditional user agents + AI user agents
         b. Partner with SRE for the best, cheapest data
      4. Build automated alerting
         a. For critical errors or tests
  26. NEW ACHIEVEMENT! You’ve created a log file monitoring program with SRE! Your rewaaaard: You’ve created cross-functional competency on SEO within the SRE team! The SRE team will now involve SEO as a stakeholder in crawl decisions!