Crawl Capacity Management - Melb SEO Meetup - Nov 2022

A brief overview of a few big things we've done on Envato Elements to improve crawl budget, or as I'm calling it here: Crawl Capacity Management.

This slide deck was presented at the Melbourne SEO meetup on Tuesday the 8th of November.

Gaston Riera

November 08, 2022

Transcript

  1. That's how I used to look, all well dressed and all. Gastón Riera - @gastonriera
  2. Everything you need to get your creative projects done. The big names: Other very cool products:
  3. When should we care about crawl [capacity|budget]? When there is a shit ton of pages ingested by Google. Here are some examples of websites I've worked on.
  4. The problem: A big part of the site was not being indexed 👎
  5. We came up with two theories to work on 😎
  6. We needed to work on:
     - Content quality
     - Internal linking
     I'll get to them in a bit.
  7. The basics!
     - Noindex
     - Redirects
     - Nofollow
     - Crawl paths (more/less)
  8. A key takeaway: Content is not just text on the page, but everything on it. Every page is content.
  9. Battle_1: Content quality. We already had 9M+ items! Two options:
     1. Add content focussing on quality over quantity.
     2. Remove content from Google's index.
  10. Battle_1: Content quality. We already had 9M+ items! Two options:
     1. Add content focussing on quality over quantity. ❌
     2. Remove content from Google's index.
  11. Battle_1: Content quality. We already had 9M+ items! Two options:
     1. Add content focussing on quality over quantity. ❌
     2. Remove content from Google's index. ✅
  12. Do you know what reduces the content quality of any site?
  13. Do you know what reduces the content quality of any site? DUPLICATE CONTENT!
  14. Noindex and remove duplicates, RUTHLESSLY. Noindex a good part of the items library -> several million fewer discoverable pages! Why did we decide to noindex instead of a fancier solution? Ask me later 😉
  15. Battle_1: Content quality. A few tips on how to work out what to noindex:
     - Use Google's "Crawled - currently not indexed" status as a proxy
     - Check for duplicate titles, URLs and content descriptions
     - Just a different image doesn't make it a different page in the eyes of Google!
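The duplicate check in the second tip could be sketched roughly as below. This is a minimal sketch, not the method used on Envato Elements: the field names (`url`, `title`, `description`), the normalization rule and the sample rows are all assumptions made for illustration, standing in for whatever a crawl export provides.

```python
from collections import defaultdict


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivially different strings match."""
    return " ".join(text.lower().split())


def find_duplicate_groups(pages):
    """Group URLs that share the same normalized title and description.

    `pages` is an iterable of dicts with the hypothetical keys
    "url", "title" and "description" (e.g. rows from a crawl export).
    """
    groups = defaultdict(list)
    for page in pages:
        key = (normalize(page["title"]), normalize(page["description"]))
        groups[key].append(page["url"])
    # Only keys shared by more than one URL are duplicate candidates.
    return {key: urls for key, urls in groups.items() if len(urls) > 1}


# Made-up sample rows, for illustration only.
sample_pages = [
    {"url": "/item/blue-logo-1", "title": "Blue Logo", "description": "A clean logo template."},
    {"url": "/item/blue-logo-2", "title": "Blue  logo", "description": "A clean logo template."},
    {"url": "/item/red-banner", "title": "Red Banner", "description": "A bold banner."},
]
duplicates = find_duplicate_groups(sample_pages)
for key, urls in duplicates.items():
    print(key, urls)
```

Each group that comes back is a candidate for noindexing or consolidation; deciding which URL to keep is a separate judgment call.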
  16. Noindex and remove duplicates, RUTHLESSLY. Merged two translations that ended up being way more similar than intended -> a few million pages removed from Google. Why did the redirected path have 15% of the site's traffic and 20x the destination's? Ask me later 😉
  17. Battle_1: Content quality. Other big things we did:
     • Turned Tag pages into Search pages
     • Search pages are noindex by default
     The overall result? Decreased the index size by half (50%) without impacting organic traffic.
  18. Battle_2: Internal linking. Out of many tactics:
     1. Reduce the number of crawl paths
     2. Nofollow on links to low-value pages
  19. Link only to valuable pages. Added links between related search pages: 10% organic traffic! Note that if it's a useful search page, it will not have a noindex.
  20. Link only to valuable pages. Remove hreflang when you're uncertain of the quality in other languages: 15% size of index! Remember 😉 hreflang tags are bidirectional, so remove them on every language.
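Because hreflang annotations need to point back at each other, removing them from only one language version leaves one-way references behind. One way to spot those is sketched below, assuming the alternates have already been extracted into a plain mapping; the function name and the example.com URLs are hypothetical.

```python
def missing_return_links(hreflang_map):
    """Return (source, target) pairs where source declares target as an
    hreflang alternate but target does not declare source back.

    `hreflang_map` maps each URL to the set of alternate URLs it declares.
    """
    missing = []
    for source, alternates in hreflang_map.items():
        for target in alternates:
            if target == source:
                continue  # a self-reference needs no return link
            if source not in hreflang_map.get(target, set()):
                missing.append((source, target))
    return sorted(missing)


# Made-up example: hreflang was removed from the /pt/ page only.
hreflang_map = {
    "https://example.com/en/": {"https://example.com/es/", "https://example.com/pt/"},
    "https://example.com/es/": {"https://example.com/en/"},
    "https://example.com/pt/": set(),
}
one_way = missing_return_links(hreflang_map)
print(one_way)
```

An empty result means every declared alternate is reciprocated; any pair listed is a page still pointing at a language version that no longer points back.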
  21. Link only to valuable pages. As for nofollow:
     • Nofollow on links to noindex pages
     • Filters and facets, all nofollow
     The overall result? Google re-crawled more pages: 60%
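The first nofollow rule can be audited with a small script. The sketch below uses Python's standard `html.parser` and assumes you already have the set of noindexed URLs from elsewhere; the class name, function name, sample HTML and URLs are all hypothetical, and hrefs are compared as literal strings without resolving relative paths.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (href, rel tokens) for every <a> tag in a document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        href = attr_map.get("href")
        if href:
            rel = set((attr_map.get("rel") or "").lower().split())
            self.links.append((href, rel))


def links_missing_nofollow(html, noindex_urls):
    """Return hrefs that point at a noindexed URL but lack rel="nofollow"."""
    collector = LinkCollector()
    collector.feed(html)
    return [
        href
        for href, rel in collector.links
        if href in noindex_urls and "nofollow" not in rel
    ]


# Made-up page fragment and noindex list, for illustration only.
sample_html = """
<a href="/search/logo">Logos</a>
<a href="/search/logo?color=blue" rel="nofollow">Blue</a>
<a href="/tag/old">Old tag</a>
"""
noindex_urls = {"/search/logo?color=blue", "/tag/old"}
to_fix = links_missing_nofollow(sample_html, noindex_urls)
print(to_fix)
```

Running this across templates rather than individual pages is usually enough, since filter and facet links tend to come from a handful of shared components.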
  22. BONUS TRACK and unpopular opinion. We learnt that:
     • Sitemaps didn't help indexing AT ALL
     • Helpful only for debugging 🤓