Crawl Capacity Management - SEontheBeach 2022

Gaston Riera

June 18, 2022

Transcript

  1. SOB 2022
    Crawl capacity management
    on Envato Elements
    Gastón Riera - @gastonriera

  2. Heads up..
    Slides in English
    Speaking in Spanish

  3. At the end I'll share a 90%
    discount for Elements! ☺

  4. That's how I used to look,
    all well dressed and all.

  5. Everything you need to get your
    creative projects done.
    The big names:
    Other very cool products:

  6. The two things I like the most about working at Envato:
    - Being sustainable and caring about the community
    - Fully remote (ANZ/MX) and working from abroad

  7. Let's get into SEO 🔎

  8. The problem:
    A big part of the site was
    not being indexed 👎


  10. It looked like Google was not recrawling a
    large number of pages!

  11. And it was taking very long to recrawl
    some pages

  12. Was Google actually
    crawling the site? 🤔

  13. Yes, it was 👍

  14. But, it wasn't getting to
    crawl the entire site 😕

  15. And why was that?

  16. No idea

  17. Just kidding 😆

  18. We came up with two
    theories to work on 😎

  19. We needed to work on
    - Content quality
    - Internal linking

  20. We needed to work on
    - Content quality
    - Internal linking
    I'll get to them in a bit.

  21. What did we do? 🤔

  22. The basics!

  23. The basics!
    - Noindex

  24. The basics!
    - Noindex
    - Redirects

  25. The basics!
    - Noindex
    - Redirects
    - Nofollow

  26. The basics!
    - Noindex
    - Redirects
    - Nofollow
    - Crawl paths (more/less)

  27. Gastón, that's nothing new!
    😡

  28. I know! 😉

  29. How are we using them?
    Let's get to it

  30. Battle_1:
    Content quality

  31. Content is not just text on the
    page, but everything on it.
    Every page is content.

  32. Battle_1: Content quality
    Two options:
    1. Add content focussing on quality over
    quantity.
    2. Remove content from Google's index.
    We already had over 9M items!

  33. Battle_1: Content quality
    Two options:
    1. Add content focussing on quality over
    quantity. ❌
    2. Remove content from Google's index.
    We already had over 9M items!

  34. Battle_1: Content quality
    Two options:
    1. Add content focussing on quality over
    quantity. ❌
    2. Remove content from Google's index. ✅
    We already had over 9M items!

  35. Do you know what reduces the
    content quality of any site?

  36. Do you know what reduces the
    content quality of any site?
    DUPLICATE CONTENT!

  37. Noindex and remove duplicates,
    RUTHLESSLY
    Noindex a good part of
    the items library.
    -> Several million fewer
    discoverable pages!
    Why did we decide to
    noindex instead of
    a fancier solution?
    Ask me later 😉

  38. Battle_1: Content quality
    A few tips on finding what to noindex:
    - Use Google's "Crawled - currently not indexed" report as a proxy
    - Check for duplicate titles/URLs/content descriptions
    - Just a different image doesn't make it a different page in
    the eyes of Google!
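
The duplicate check from the tips above could be sketched roughly like this. It is only an illustration, not the tooling used at Envato: the page records, the `url`/`title`/`description` keys and the helper names are all hypothetical, and it assumes you already have a crawl export with those fields.

```python
from collections import defaultdict


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not hide duplicates."""
    return " ".join(text.lower().split())


def find_duplicate_groups(pages):
    """Group URLs that share the same normalized title and description.

    `pages` is an iterable of dicts with the (hypothetical) keys
    'url', 'title' and 'description'. Only groups with 2+ URLs are returned.
    """
    groups = defaultdict(list)
    for page in pages:
        key = (normalize(page["title"]), normalize(page["description"]))
        groups[key].append(page["url"])
    return [urls for urls in groups.values() if len(urls) > 1]


# Made-up sample data: two items that differ only by image and URL.
pages = [
    {"url": "/item/blue-sky-1", "title": "Blue Sky Photo", "description": "A photo of a blue sky."},
    {"url": "/item/blue-sky-2", "title": "Blue  sky photo", "description": "A photo of a blue sky."},
    {"url": "/item/red-car", "title": "Red Car Photo", "description": "A photo of a red car."},
]

duplicate_groups = find_duplicate_groups(pages)
print(duplicate_groups)
```

Each returned group is a candidate for keeping one page indexable and noindexing or redirecting the rest.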

  39. Noindex and remove duplicates,
    RUTHLESSLY
    Merged two translations that ended up being way
    more similar than intended
    -> A few million pages removed from Google.
    Why did the redirected path
    have 15% of the site's traffic
    and 20x the destination?
    Ask me later 😉

  40. Battle_1: Content quality
    Other big things we did
    ● Turned Tag pages into Search pages
    ● Search pages are noindex by default
    The overall result? Cut the index size in half (50%)
    without impacting organic traffic.
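
A "noindex by default" rule for search pages might look something like the sketch below. The `page_type` values, the allowlist of useful search paths and the function name are assumptions made for illustration; the talk does not describe how this was implemented.

```python
def robots_meta(page_type: str, path: str, indexable_search_paths: frozenset) -> str:
    """Return the robots meta tag for a page.

    Search pages are noindex unless their path is in the (hypothetical)
    allowlist of search pages considered useful enough to index.
    """
    if page_type == "search" and path not in indexable_search_paths:
        return '<meta name="robots" content="noindex">'
    return '<meta name="robots" content="index, follow">'


# Hypothetical allowlist with one curated search page.
useful_searches = frozenset({"/search/business-cards"})

print(robots_meta("search", "/search/asdf-123", useful_searches))
print(robots_meta("search", "/search/business-cards", useful_searches))
print(robots_meta("item", "/item/blue-sky-1", useful_searches))
```

Keeping the default restrictive means a new search URL never becomes indexable by accident; someone has to add it to the allowlist on purpose.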

  41. Battle_2
    Internal linking

  42. Battle_2: Internal linking
    Out of many tactics:
    1. Reduce the number of crawl paths
    2. Nofollow on links to low-value pages

  43. Basically,
    Be intentional and smart
    about crawl paths.

  44. Link only to valuable pages
    Added links between related
    search pages
    10% Organic traffic!
    Note that if it's a useful search page,
    it will not have a noindex.

  45. Link only to valuable pages
    Remove hreflang when you're uncertain
    of the quality in other languages
    15% size of index!
    Remember 😉 hreflang annotations are bidirectional,
    so remove them on every language.
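
Because hreflang only works with return links, a quick check for one-sided annotations can help when removing them. This is a minimal sketch under the assumption that you have already extracted, per URL, the set of alternates it declares; the `hreflang_map` structure and the function name are hypothetical.

```python
def missing_return_links(hreflang_map):
    """Find hreflang annotations that are not reciprocated.

    `hreflang_map` maps each URL to the set of alternate URLs it declares.
    Returns sorted (source, alternate) pairs where the alternate does not
    point back to the source.
    """
    missing = []
    for url, alternates in hreflang_map.items():
        for alt in alternates:
            if url not in hreflang_map.get(alt, set()):
                missing.append((url, alt))
    return sorted(missing)


# Made-up example: hreflang was removed from the Spanish page only.
hreflang_map = {
    "/en/page": {"/es/page"},
    "/es/page": set(),
}

print(missing_return_links(hreflang_map))
```

An empty result means every remaining annotation has its return link, so the removal was applied on every language version.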

  46. What are valuable pages?
    In short, pages we want Google to index.

  47. Link only to valuable pages
    As for nofollow:
    ● Nofollow on links to noindex pages
    ● Filters and facets, all nofollow
    The overall result? Google re-crawled more pages.
    60%
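
The two nofollow rules above could be applied at render time with something like the following sketch. It assumes, purely for illustration, that filters and facets show up as query parameters and that the set of noindex paths is known; `link_rel`, `render_link` and `noindex_paths` are hypothetical names.

```python
from html import escape
from urllib.parse import urlsplit


def link_rel(href: str, noindex_paths: set) -> str:
    """Return 'nofollow' for links to noindex pages or to filter/facet URLs."""
    parts = urlsplit(href)
    # Assumption: filters and facets are expressed as query parameters.
    is_facet = bool(parts.query)
    if is_facet or parts.path in noindex_paths:
        return "nofollow"
    return ""


def render_link(href: str, text: str, noindex_paths: set) -> str:
    """Render an anchor tag, adding rel="nofollow" only when the rules require it."""
    rel = link_rel(href, noindex_paths)
    rel_attr = f' rel="{rel}"' if rel else ""
    return f'<a href="{escape(href, quote=True)}"{rel_attr}>{escape(text)}</a>'


# Hypothetical set of paths that carry a noindex.
noindex_paths = {"/search/asdf-123"}

print(render_link("/photos?color=blue", "Blue", noindex_paths))
print(render_link("/search/asdf-123", "Odd search", noindex_paths))
print(render_link("/photos", "Photos", noindex_paths))
```

Deciding `rel` in one place keeps the rule consistent across templates, which matters when the goal is to steer crawl paths on purpose.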

  48. So, why crawl capacity
    management?

  49. Crawl budget stayed the
    same.
    *On average, over the last 2 years.

  50. BONUS TRACK
    and unpopular opinion.

  52. BONUS TRACK
    and unpopular opinion.
    We learnt that
    ● Sitemaps didn't help indexing AT ALL
    ● Helpful only for debugging 🤓

  53. 1st month 1 USD 🥳
    Go to SOB22.com
    Yeah no kidding, I did
    register that domain to
    share the discount ☺

  54. Gracias! - Thank You!