VIP Pages; webpages that have a high change frequency and/or are seen as highly authoritative
  ➢ News website homepages & key section pages
  ➢ Highly volatile classified portals (jobs, properties)
  ➢ Large-volume ecommerce (Amazon, eBay, Etsy)
• Main purpose = discovery of valuable new content
  ➢ i.e., news articles
• Rarely re-crawls newly discovered URLs
  ➢ New URLs can become VIPs over time
Unimportant Pages; URLs that have very little link value and/or are very rarely updated
  ➢ Recrawls URLs that serve 4XX errors (see the log sketch below)
  ➢ Likely also occasionally checks old redirects
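If you want to see this behaviour on your own site, server logs are the place to look. A minimal sketch, assuming a standard combined-format access log at a hypothetical path, counting Googlebot requests that hit URLs serving 4XX:

    # Count Googlebot hits against URLs that returned 4XX, to see which
    # dead URLs Google keeps re-checking.
    import re
    from collections import Counter

    LOG_PATH = "/var/log/nginx/access.log"  # assumption: adjust to your server
    LINE_RE = re.compile(r'"(?:GET|HEAD) (?P<path>\S+) HTTP/[\d.]+" (?P<status>\d{3})')

    hits_4xx = Counter()
    with open(LOG_PATH, encoding="utf-8", errors="replace") as log:
        for line in log:
            if "Googlebot" not in line:  # crude user-agent filter; verify IPs for real audits
                continue
            match = LINE_RE.search(line)
            if match and match.group("status").startswith("4"):
                hits_4xx[match.group("path")] += 1

    # URLs Google keeps re-checking even though they serve 4XX
    for path, count in hits_4xx.most_common(20):
        print(f"{count:5d}  {path}")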
Other pages; use robots.txt to block pages or resources that you don't want Google to crawl at all (see the sketch below)
  ➢ Google won't shift this newly available crawl budget to other pages unless Google is already hitting your site's serving limit
  ➢ Source: https://developers.google.com/search/docs/crawling-indexing/large-site-managing-crawl-budget
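As an illustration of the robots.txt mechanism, here is a minimal sketch using Python's standard urllib.robotparser; the Disallow path and example URLs are hypothetical:

    # Verify which URLs a robots.txt Disallow rule blocks for Googlebot.
    from urllib.robotparser import RobotFileParser

    # Hypothetical rule: keep internal search result pages out of the crawl
    robots_txt = """\
    User-agent: Googlebot
    Disallow: /internal-search/
    """

    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())

    for url in (
        "https://www.example.com/internal-search/widgets",
        "https://www.example.com/products/blue-widget",
    ):
        print(url, "->", "crawlable" if parser.can_fetch("Googlebot", url) else "blocked")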
Googlebot;
• Not just HTML pages
  ➢ Reduce the number of HTTP requests per page
• AdsBot can use up crawl requests
  ➢ Double-check your Google Ads campaigns
• Link equity (PageRank) impacts crawling
  ➢ More link value = more crawling
  ➢ Elevate key pages to VIPs
• Serve correct HTTP status codes (spot-check sketch below)
  ➢ Googlebot will adapt accordingly
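To see what status codes your pages actually return (and therefore what Googlebot will adapt to), a quick spot-check along these lines can help; the example.com URLs are placeholders:

    # Spot-check the status codes a few URLs return, so dead pages serve
    # 404/410 and moved pages redirect rather than returning 200.
    import urllib.request
    from urllib.error import HTTPError

    URLS = [
        "https://www.example.com/",
        "https://www.example.com/old-page",
        "https://www.example.com/removed-page",
    ]

    for url in URLS:
        request = urllib.request.Request(url, method="HEAD")
        try:
            # urlopen follows redirects by default, so this reports the final status
            with urllib.request.urlopen(request, timeout=10) as response:
                print(response.status, url)
        except HTTPError as error:  # 4XX / 5XX responses raise HTTPError
            print(error.code, url)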
HTML;
• Index selection
  ➢ De-duplication prior to indexing
• Indexing
  ➢ First pass based on HTML
  ➢ Potential rendering (not guaranteed)
• Index integrity
  ➢ Canonicalisation & de-duplication (see the canonical sketch below)
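The main canonicalisation hint you control is the rel="canonical" link element. A small sketch that extracts it from a page's HTML with Python's standard html.parser (the example HTML is made up):

    # Pull the rel="canonical" URL out of a page's HTML.
    from html.parser import HTMLParser

    class CanonicalParser(HTMLParser):
        """Collects the href of <link rel="canonical"> if present."""
        def __init__(self):
            super().__init__()
            self.canonical = None

        def handle_starttag(self, tag, attrs):
            attrs = dict(attrs)
            if tag == "link" and attrs.get("rel") == "canonical":
                self.canonical = attrs.get("href")

    html = """\
    <html><head>
      <title>Blue widgets</title>
      <link rel="canonical" href="https://www.example.com/widgets/blue/">
    </head><body>...</body></html>
    """

    parser = CanonicalParser()
    parser.feed(html)
    print("Canonical URL:", parser.canonical)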
1. RAM storage;
  ➢ Pages that need to be served quickly and frequently
  ➢ Includes news articles but also popular content
2. SSD storage;
  ➢ Pages that are regularly served in SERPs but aren’t super popular
3. HDD storage;
  ➢ Pages that are rarely (if ever) served in SERPs
1. Keywords:
  ➢ Allow Google to understand what your content should rank for
2. Links:
  ➢ Get you onto the 1st page of Google
3. Clicks:
  ➢ Determine whether you stay there (and rise) or drop off
1. Use keywords in the right places (toy check below)
2. Make this content link-worthy;
  ➢ And keep making more of it
3. Have a website people like engaging with;
  ➢ Good UX and all that jazz
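Purely as an illustration of point 1, a toy check that a target keyword appears in a few of the usual on-page places; the keyword, snippets and the "places" chosen here are assumptions, not an exhaustive list:

    # Flag whether a target keyword appears in common on-page spots.
    keyword = "blue widgets"

    # Hypothetical on-page snippets for a single product page
    places = {
        "title": "Blue widgets | Example Shop",
        "h1": "Blue widgets",
        "meta description": "Buy blue widgets online at Example Shop.",
        "url slug": "/blue-widgets/",
    }

    for place, text in places.items():
        found = keyword in text.lower().replace("-", " ")
        print(f"{place}: {'present' if found else 'missing'}")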