Technical SEO in e-Commerce - Search Y 2021

My talk from Search Y - Technical SEO 2021 titled Technical SEO in e-Commerce.

Are you managing SEO for an online shop or a retail offering? Do you sell goods online or run your very own web shop? Bastian shares his top tips and best practices from 100+ technical SEO audits, with the sole focus on making your e-commerce platform perform better in organic search. In his session, Bastian covers unique e-commerce SEO challenges such as how to handle product detail pages, deal with out-of-stock situations, multi-category trees, large-scale indexing scenarios, faceted navigation, sorting/filtering issues and much, much more.

Bastian Grimm

November 07, 2022

Transcript

  1. Technical SEO in E-Commerce How to successfully master the biggest

    technical SEO challenges in online shopping/e-commerce Bastian Grimm, Peak Ace AG | @basgr
  2. pa.ag @peakaceag 4 No two shops are the same… Differences

    in industry and size require customised SEO strategies. [Chart: type of domain (eCommerce, Publishing, Classifieds, Lead-gen, Brand*, Other) plotted against number of URLs, from <1,000 to 1,000,000+] Which quadrant are you in? * Brand = e.g. Uber (selling on the site is not a priority) vs. eComm = e.g. Nike (online shop) or Emirates (ticket shop)
  3. pa.ag @peakaceag 5 No two shops are the same… Online

    retailers with limited product ranges (<1,000 products) face different challenges than multi-range retailers. [Chart: type of domain with sub-types – eCommerce (single retailer, multi retailer, …), Publishing (special interest, e.g. health; daily newspaper; …), Classifieds, Lead-gen, Brand*, Other – plotted against number of URLs, from <1,000 to 1,000,000+] Which quadrant are you in? * Brand = e.g. Uber (selling on the site is not a priority) vs. eComm = e.g. Nike (online shop) or Emirates (ticket shop)
  4. pa.ag @peakaceag 7 #1 Indexing strategy: categories, sub-categories, pagination, etc.

    Caused by/refers to: All types of overview/listing pages
    Issue brief: Categories that compete with sub-categories, or super-deep pagination that causes crawl and indexing problems
    Issue categories: Crawling inefficiencies, website quality
    Suggested change/fix: Crawl/indexing strategy dependent on size/page types
    Comment: Loads of variables to consider; the larger the site gets, the more complex it is to get right
  5. pa.ag @peakaceag 8 No crawling and/or indexing strategy Depending on

    the age, scope and volume, there can be lots of URLs to deal with; carefully consider what you want to give to Googlebot:
  6. pa.ag @peakaceag 9 Google released its own guide to managing

    crawl budget Contrary to what Google says, this is very well worth a read for everyone – even though it's specifically tailored to “large” and “very rapidly changing” sites: Source: https://pa.ag/35MqZHX
  7. pa.ag @peakaceag 10 Getting Google to crawl those important URLs

    Significant cuts to the crawlable URL inventory led to the intended shift: Google started crawling previously un-crawled URLs after 15m+ unnecessary URLs were eliminated. Source: Peak Ace AG [Chart: crawled, un-crawled and total URLs (0 to 35m), Jul–Feb]
  8. pa.ag @peakaceag 11 Keyword targeting: main categories vs sub-categories Which

    (sub)category should be found for the term “fresh fruit”? Pay close attention to clear terminology and differentiation:
  9. pa.ag @peakaceag 12 Tons of unnecessary/unused sorting and/or filtering If

    you have sorting options, ensure they're being used (analytics is your friend) – otherwise remove them and prevent them from being crawled (robots.txt / PRG)
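    A minimal sketch of the PRG (POST-Redirect-GET) pattern for such a sort control; the /category/shoes URL and field names are illustrative, not from the deck. The form submits via POST, the server stores the choice (e.g. in the session) and answers with a 302 redirect back to the clean listing URL, so no parameterised GET URL ever exists for Googlebot to crawl:

        <!-- Sort control: POST means no crawlable ?sort=... GET URL is created -->
        <form method="post" action="/category/shoes">
          <select name="sort">
            <option value="price-asc">Price: low to high</option>
            <option value="price-desc">Price: high to low</option>
          </select>
          <button type="submit">Sort</button>
        </form>
        <!-- Server side: persist the selection, then respond with
             "302 Found, Location: /category/shoes" and render the sorted list there. -->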
  10. pa.ag @peakaceag 13 The “articles per page” filter/selection: don’t bother

    For each category listing, three times the number of URLs are generated – a crawling disaster. Often, if left unchecked, this also leads to duplicate content. Client-side JavaScript would at least solve the crawling and indexing problems – but it is questionable whether this feature is actually used at all.
  11. pa.ag @peakaceag 14 Pagination (for large websites) is essential! Recommendation

    for the "correct" pagination (for each objective) from audisto: Source: https://pa.ag/3cjlgev For lists with equally important items, choose the logarithmic or the Ghostblock pagination (equal PR across all pagination and item pages). For lists with a small number of important items use the “Link first Pages”, “Neighbors”, “Fixed Block” pagination (most PR goes to the first pagination and item pages).
  12. pa.ag @peakaceag 16 Hang on - what about "noindex=nofollow"? Noindexing

    pages leads to nofollow as well – at least over time. Source: https://pa.ag/2EssNeV Google's John Mueller says that, long term, noindex, follow will eventually equate to a noindex, nofollow directive as well […] eventually Google will stop going to the page because of the noindex, remove it from the index, and thus not be able to follow the links on that page.
  13. pa.ag @peakaceag 17 Links from noindexed pages might be worthless

    “Noindex = don’t index. And if we completely drop it […], then we wouldn’t use anything from there […] I wouldn’t count on links from noindex pages being used.” Source: https://pa.ag/2TCiADY
  14. pa.ag @peakaceag 19 #2 Content quality: thin & duplicate product

    pages
    Caused by/refers to: Product detail pages
    Issue brief: Automatically delivered product data, or products with hardly any differentiating features, quickly create (near-)duplicate content
    Issue categories: Website quality, duplicate content, thin content
    Suggested change/fix: Monitor content quality carefully, e.g. define noindex rules accordingly
  15. pa.ag @peakaceag 20 Common causes of duplicate content #1 For

    Google, these examples are each two different URLs:
    non-www vs. www: https://pa.ag vs. https://www.pa.ag
    HTTP vs. HTTPS: http://pa.ag vs. https://pa.ag
    URL GET-parameter order: https://www.pa.ag/cars?colour=black&type=racing vs. https://www.pa.ag/cars?type=racing&colour=black
    Dealing with duplication issues:
    ▪ 301 redirect: e.g. non-www vs. www, HTTP vs. HTTPS, casing (upper/lower), trailing slashes, index pages (index.php)
    ▪ noindex: e.g. white labelling, internal search result pages, work-in-progress content, PPC and other landing pages (see the sketch below)
    ▪ (self-referencing) canonicals: e.g. for parameters used for tracking, session IDs, printer-friendly versions, PDF-to-HTML, etc.
    ▪ 403/password protection: e.g. staging/development servers
    ▪ 404/410 gone: e.g. feed content that needs to go quickly, other outdated/irrelevant or low-quality content
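    For illustration, the noindex fix from the list above as it might look on an internal search result page; the URL in the comment is hypothetical:

        <!-- On https://pa.ag/search?q=gin – still crawlable, but kept out of the index -->
        <meta name="robots" content="noindex, follow">

    (Bear in mind slide 12, though: long term, Google treats this as noindex, nofollow.)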
  16. pa.ag @peakaceag 21 Common causes of duplicate content #2 For

    Google, these examples are each two different URLs:
    Case sensitivity: https://pa.ag/url-a/ vs. https://pa.ag/url-A/
    Trailing slashes: https://pa.ag/url-b vs. https://pa.ag/url-b/
    Staging/testing server vs. production server
    Taxonomy issues: the same content reachable via Category A, Category B and Category C
  17. pa.ag @peakaceag 22 Content quality is still important in ecommerce!

    Keep a close eye on the indexing of articles that are “very similar” in content: [SERP screenshot: About 2,410 results (0.37 seconds)]
  18. pa.ag @peakaceag 23 Prevent inferior content from being indexed In

    particular: automatically generated URLs like "no ratings for X" or "no comments for Y" often lead to lower quality content! Other kinds of bad or thin content: ▪ Content from (external) feeds (e.g. through white label solutions / partnerships, affiliate feeds etc.) ▪ Various "no results" pages (no comments for product A, no ratings for product B, no comments for article C etc.) ▪ Badly written content (e.g. grammatical errors) ▪ General: same content on different domains
  19. pa.ag @peakaceag 24 #3 Handling multiple versions of a product

    (colour/size)
    Caused by/refers to: Product detail pages
    Issue brief: Demand is too low for the PDPs to be indexed in all their combinations; link equity/ranking potential is lost
    Issue categories: Duplicate content, crawl inefficiency, ranking issues
    Suggested change/fix: Consolidate to a default product whenever possible (e.g. strongest-selling colour/size)
    Comment: Client-side JS or, at a minimum, canonical tags are needed
  20. pa.ag @peakaceag 25 Exactly the same GEL-NIMBUS 22 in a

    different colour: Asics uses individual URLs for each available colour/size variation. Not enough people search for “asics Gel Nimbus 22 black 43” or “grey” respectively; demand is too low for the PDPs to be indexed in all their combinations, and link equity/ranking potential is lost/split. (gel-nimbus-22/p/1011A680-002.html vs. gel-nimbus-22/p/1011A680-022.html)
  21. pa.ag @peakaceag 26 One solution could be to canonicalise to

    a root product. A canonical tag is only a hint, not a directive – Google can choose to ignore it entirely. When using canonical tags, please be extra careful (see the example below):
    ▪ There may only be one rel-canonical annotation per URL – only ONE!
    ▪ Use absolute URLs with protocol and subdomain
    ▪ Rel-canonical targets must actually work (no 4XX targets) – they need to serve an HTTP 200
    ▪ No canonical tag chaining – Google will ignore it!
    ▪ Maintain consistency: only one protocol (HTTP vs. HTTPS), either www or non-www, and consistent use of trailing slashes
    ▪ Etc.
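    A minimal example that satisfies the rules above, consolidating colour variants to a root product; the domain is a placeholder, and the paths mirror the Asics example from the previous slide:

        <!-- On every variant URL, e.g. /gel-nimbus-22/p/1011A680-022.html -->
        <link rel="canonical" href="https://www.example.com/gel-nimbus-22/p/1011A680-002.html">
        <!-- Exactly one tag, absolute URL incl. protocol and subdomain; the target
             must return HTTP 200 and canonicalise to itself (no chaining). -->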
  22. pa.ag @peakaceag 28 Most efficient: minimising URL overhead Improve crawl

    budget (and link equity) by consolidating to one URL. Salmon PDPs are rewarded with strong rankings: #848=15694 #848=15692
  23. pa.ag @peakaceag 29 #4 One product, but reachable via multiple

    categories
    Caused by/refers to: Product detail pages
    Issue brief: Product detail pages are reachable via multiple URLs (because the category name is part of the PDP URL)
    Issue categories: Duplicate content, crawl inefficiencies, ranking issues
    Suggested change/fix: Category-independent product URLs
    Comment: Alternatively, define a default category to be used in the URL slug
  24. pa.ag @peakaceag 30 Two different URLs serving the exact same

    product This minimises its chances of ranking well; from a crawling perspective it isn't a good solution either – both URLs would be crawled individually. (The two category slugs from the example: international-gins vs. most-popular)
  25. pa.ag @peakaceag 31 Solution: only ever use one URL per

    product! A dedicated product directory – regardless of the category – is the best solution in most cases; it also often makes analysis easier. Alternative: consolidate all products within the document root. Watch out: using canonical tags or noindex for products reachable via multiple categories is possible – but inefficient in terms of crawling: reduction > noindex > canonical tag.
  26. pa.ag @peakaceag 32 #5 Brand filter vs. branded category: /watches/breitling

    vs. /breitling/all
    Caused by/refers to: Category pages and their filters
    Issue brief: A brand category that targets the exact same keyword set as a category that allows filtering by brand name
    Issue categories: Keyword cannibalisation, crawl inefficiency
    Suggested change/fix: Canonicalise; prevent indexation of one URL variant
    Comment: PRG pattern for large-scale scenarios (e.g. preventing an entire block of filters from being crawled/indexed)
  27. pa.ag @peakaceag 33 Another classic: brand filter vs brand (category)

    page If you index both, which one is supposed to rank for the generic brand term? One keyword, one URL: try to minimise internal competition as much as you can. Two (or more) pages targeting “Breitling watches” make it unnecessarily hard for Google to select the best result! (Category “watches” filtered by brand “Breitling” vs. a dedicated “Breitling” showcase/brand page)
  28. pa.ag @peakaceag 34 #6 Expired/(temp.) out of stock product management

    Caused by/refers to: Product detail pages
    Issue brief: PDPs for products that are (temporarily) out of stock can cause bad engagement metrics (e.g. high bounce rates)
    Issue categories: Engagement metrics, website quality, inefficient URLs
    Suggested change/fix: Implement an OOS strategy (redirects, info layer, disable (410) entirely, etc.)
    Comment: Hugely complex topic, depending on the size and volatility of the inventory, and much more
  29. pa.ag @peakaceag 35 Deal with your out of stock items

    – but not like M&S does! Are they just temporarily unavailable (and for how long), or will they never come back? And which alternative versions are available? [SERP screenshot: About 294,000 results (0.23 seconds)] M&S keeps all of their out-of-stock pages indexed: <meta name="robots" content="index, follow"> <link rel="canonical" href="[…]/chef-hard-anodised-28cm-saute-pan/p/p22467321">
  30. pa.ag @peakaceag 36 How to deal with OOS situations? For

    non-deliverable products there is no single solution; often, it comes down to a combination. Tip: use a dynamic info layer to inform users. OOS-handling options:
    ▪ Redirect (same product, but e.g. in a different colour)
    ▪ Redirect (similar products in other colours, sizes, etc.)
    ▪ Redirect (successor)
    ▪ Redirect (internal search)
    ▪ Noindex (newsletter/lead gen)
    ▪ 410 error (only if you really want to delete!)
  31. pa.ag @peakaceag 37 No exit strategy for paginated categories? Categories

    with high churn need to deal with paginated pages coming and going (e.g. what happens when there aren't enough products to fill a 2nd page?) [SERP screenshot: About 3,065 results (0.28 seconds)]
  32. pa.ag @peakaceag 38 #7 Facetted navigation, sorting & filtering (e.g.

    in categories)
    Caused by/refers to: Category pages that allow filtering and/or sorting
    Issue brief: The various sorting/filtering facets multiplied across categories and sub-categories can lead to millions of (worthless) URLs
    Issue categories: Keyword cannibalisation, crawl inefficiency, thin content
    Suggested change/fix: Individual indexing strategy (based on demand) per filter and facet; prevent crawling/indexing for sorting
    Comment: Very difficult to get right, usually requires individual solutions
  33. pa.ag @peakaceag 39 Issue: facetted navigation poorly controlled/implemented “A facetted

    search is a restriction of the selection according to different properties, characteristics and/or values.” If Zalando allowed all of these options to become crawlable URLs, it would lead to millions and millions of useless URLs. Only allow crawling and indexing of URLs that target keywords and keyword combinations with actual search demand, and pay special attention to internal keyword cannibalisation.
  34. pa.ag @peakaceag 40 Solution: Boots handles this excellently using client-side

    JS Also, from a user's perspective, using JavaScript for features such as filtering feels much faster, since the perceived load time decreases. (Example URL fragment: #facet:-100271105108108101116116101,-1046543&product)
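    A rough sketch of the fragment-based approach: the facet state lives after the #, which is never sent to the server and so creates no additional crawlable URLs. The render/parse helpers are hypothetical; only the hash handling is the point:

        <script>
          // Everything after "#" is client-side only: no new URL for Googlebot.
          function applyFacets() {
            var state = decodeURIComponent(location.hash.slice(1)); // e.g. "colour=black;size=43"
            renderProductList(parseFacets(state)); // hypothetical helpers
          }
          window.addEventListener('hashchange', applyFacets);
          applyFacets(); // initial render on page load
        </script>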
  35. pa.ag @peakaceag 43 #8 Structured Data: schema.org Caused by/refers to:

    Product detail pages
    Issue brief: Google needs machine-readable data in structured form (basis: schema.org) to display certain information, such as price and product availability, directly in the search results; if this information is not available, the result preview is smaller (= one line is missing)
    Issue categories: SERP snippet preview, SERP CTR
    Suggested change/fix: Implement schema.org markup on product detail pages (at minimum prices, availability, etc.) – and ideally ratings too, if available
  36. pa.ag @peakaceag 44 Rich snippets based on structured data A

    valuable additional line in the SERP Snippet for more attention: schema.org/Rating + AggregateRating schema.org/Product + schema.org/Offers + schema.org/InStock
  37. pa.ag @peakaceag 45 Label products and offers with schema.org Schema.org

    markup for product details as well as price, stock & reviews
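    A hedged example of what this markup can look like as JSON-LD – Product plus Offer (price, availability) and AggregateRating – with placeholder values throughout:

        <script type="application/ld+json">
        {
          "@context": "https://schema.org",
          "@type": "Product",
          "name": "GEL-NIMBUS 22",
          "image": "https://www.example.com/img/gel-nimbus-22.jpg",
          "offers": {
            "@type": "Offer",
            "price": "159.95",
            "priceCurrency": "EUR",
            "availability": "https://schema.org/InStock"
          },
          "aggregateRating": {
            "@type": "AggregateRating",
            "ratingValue": "4.6",
            "reviewCount": "128"
          }
        }
        </script>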
  38. pa.ag @peakaceag 46 Google discontinued the SDTT … yeah, I

    know – right? Attention: the Rich Results Test does not show all types of structured data, only those that Google supports. Source: https://pa.ag/2DSKpzO Alternatives for broader validation:
    ▪ Bing Webmaster Markup Validator: https://www.bing.com/toolbox/markup-validator
    ▪ Yandex Structured Data Validator: https://webmaster.yandex.com/tools/microtest/
    ▪ ClassySchema Structured Data Viewer: https://classyschema.org/Visualisation
    ▪ https://schemamarkup.net/
    ▪ https://www.schemaapp.com/tools/schema-paths/
    ▪ https://json-ld.org/playground/
    ▪ https://technicalseo.com/tools/schema-markup-generator/
  39. pa.ag @peakaceag 47 Tip: Free Structured Data Helper from RYTE

    Highlights syntax errors and missing required properties. All nested elements in one place for convenient in-line validation: Source: https://pa.ag/3b9CkU5
  40. pa.ag @peakaceag 48 To avoid confusion: no schema mark-up! Schema.org

    mark-up is not being used to show/generate this extended SERP snippet: so-called featured snippets are usually shown for non-transactional search queries, and schema.org mark-up is not mandatory for them. The “monthly leasing rate” example likewise involves no schema.org mark-up; Google extracts this information directly from the HTML.
  41. pa.ag @peakaceag 49 #9 Discovery: XML sitemaps, etc. Caused by/refers

    to: Better/faster article indexing
    Issue brief: Sitemaps and crawl hubs for better internal linking, discovery and additional canonicalisation signals
    Issue categories: Crawl efficiency, internal linking
    Suggested change/fix: Establish a proper XML sitemap (creation) process; find the URLs that Google hits heavily and use them to link internally
    Comment: Poorly maintained XML sitemaps, e.g. containing broken/irrelevant URLs, can lead to significant crawl budget waste
  42. pa.ag @peakaceag 50 Poorly maintained XML sitemaps No redirects, no

    URLs that are blocked via robots.txt or meta robots, and no URLs with a canonical tag pointing elsewhere! To verify: ▪ Screaming Frog ▪ Mode > List ▪ Download XML Sitemap
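    For reference, a clean sitemap contains final URLs only – indexable, self-canonical and answering with HTTP 200; domain and date are placeholders:

        <?xml version="1.0" encoding="UTF-8"?>
        <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
          <url>
            <loc>https://www.example.com/gin/monkey-47</loc>
            <lastmod>2021-01-15</lastmod>
          </url>
          <!-- No redirecting, robots.txt-blocked, noindexed or
               canonicalised-away URLs belong in here. -->
        </urlset>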
  43. pa.ag @peakaceag 51 Pages not in sitemaps – using DeepCrawl’s

    source gap Comparing (and overlaying) various crawl sources to identify hidden issues and potential
  44. pa.ag @peakaceag 52 #10 Web performance: maximum loading speed Caused

    by/refers to: Loading speed (entire website)
    Issue brief: The mobile versions in particular are often still extremely slow, but this is not the only area in need of optimisation; most online shops are not fully optimised – images, external fonts, JavaScript and much more offer opportunities for performance gains
    Issue categories: Loading speed, engagement metrics
    Suggested change/fix: A multifaceted topic with a significant number of individual optimisation possibilities, which depend to a large extent on the infrastructure, shop system, etc. – it can only be solved successfully together with the IT team
  45. Fast loading time plays an important role in overall user

    experience! Performance is about user experience!
  46. pa.ag @peakaceag 54 Revisited: page speed already is a ranking

    factor Source: http://pa.ag/2iAmA4Y | http://pa.ag/2ERTPYY
  47. pa.ag @peakaceag 55 User experience to become a Google ranking

    factor The current Core Web Vitals set focuses on three aspects of user experience – loading, interactivity and visual stability – and includes the following metrics: Source: https://pa.ag/3irantb Google announced a new ranking algorithm designed to judge web pages based on how users perceive the experience of interacting with them. That means if Google thinks your website users will have a poor experience on your pages, Google may not rank those pages as highly as it does now.
  48. pa.ag @peakaceag 57 Optimising for Core Web Vitals such as

    LCP, FID and CLS? An overview of the most common issues and respective fixes:
    LCP is primarily affected by: slow server response times, render-blocking JS/CSS, resource load times, client-side rendering
    Optimising for LCP: server response times & routing, CDNs, caching & compression, optimising the critical rendering path, reducing blocking time (CSS, JS, fonts), images (format, compression, etc.), preloading & pre-rendering, instant loading based on PRPL
    FID is primarily affected by: third-party code, JS execution time, main-thread busyness, request count & transfer size
    Optimising for FID: reduce JS execution (defer/async), code-split large JS bundles, break up long JS tasks (>50ms), minimise unused polyfills, use web workers to run JS on a non-critical background thread
    CLS is primarily affected by: images without dimensions, ads, embeds and iframes without dimensions, web fonts (FOIT/FOUT)
    Optimising for CLS: always include size attributes on images, videos, iframes, etc., reserve the required space in advance, reduce dynamic injections (see the markup sketch below)
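    Two of the CLS fixes above expressed as markup – reserved image space and web fonts without invisible text (FOIT); file names are placeholders:

        <!-- width/height let the browser reserve the space before the image loads -->
        <img src="/img/product.jpg" width="800" height="600" alt="Product photo">

        <style>
          @font-face {
            font-family: "ShopFont";
            src: url("/fonts/shopfont.woff2") format("woff2");
            font-display: swap; /* show fallback text instead of hiding text while the font loads */
          }
        </style>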
  49. pa.ag @peakaceag 58 Client-side/front-end optimisation tasks Use my checklist on

    SlideShare to double-check (a few items appear as markup below): All slides on SlideShare: http://pa.ag/iss18speed
    ▪ Establish a content-first approach: progressive enhancement; prioritise visible, above-the-fold content: 14 kB (compressed)
    ▪ Reduce size: implement effective caching and compression
    ▪ Whenever possible, use asynchronous requests
    ▪ Decrease the size of CSS and JavaScript files (minify)
    ▪ Lean mark-up: no comments; use inline CSS/JS only where necessary or useful
    ▪ Optimise images: reduce overhead for JPGs & PNGs (metadata, etc.), request properly sized images and try new formats
    ▪ Minimise browser reflow & repaint
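    A few of those checklist items in markup form; all paths are illustrative:

        <!-- Load non-critical JS without blocking rendering -->
        <script src="/js/app.min.js" defer></script>

        <!-- Preload the LCP hero image and the critical web font -->
        <link rel="preload" as="image" href="/img/hero.webp">
        <link rel="preload" as="font" type="font/woff2" href="/fonts/shopfont.woff2" crossorigin>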
  50. pa.ag @peakaceag 59 Increasing crawled URLs due to faster load

    times Slashing website load times (Lighthouse score ~36 to 70) led to >25% more URLs being crawled by Googlebot: Source: Peak Ace AG [Chart: crawled URLs (0 to 400,000) against avg. Lighthouse performance score (0 to 80), Nov–Apr]
  51. pa.ag @peakaceag 62 Don't forget 301s when changing your structure

    If the category name is "automatically" included in the URL slug, redirect to the "new" name; when deleting a category, always have a redirect in place. The following applies: if an (old) URL was ever linked externally (and that link still exists), the internal redirect (e.g. old category name > new category name) is now required forever.
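    A minimal sketch of such a permanent redirect in nginx syntax; the slugs are hypothetical:

        # Old category-based product URL permanently redirects to the new slug
        location = /gins/international-gins/monkey-47 {
            return 301 /gin/monkey-47;
        }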
  52. pa.ag @peakaceag 63 Bulk test these things: redirects & other

    headers HTTP status codes (errors, redirects, etc.) at scale, for free: httpstatus.io Check it out: https://httpstatus.io/
  53. pa.ag @peakaceag 64 Fix those redirect chains, especially on large

    sites… …as multiple requests waste valuable performance and crawl budget!
  54. pa.ag @peakaceag 65 Don't be lazy: ensure code hygiene! Remove

    internally linked redirects from templates and adjust them to “direct” linking:
  55. pa.ag @peakaceag 66 Also fix (internally) linked error pages (e.g.

    404)! Adjust internal links in the code and check alternative references (canonical, sitemap, etc.); for URLs with traffic, external links or rankings => redirect. Quality signal!?
  56. What if I have no time to write my own

    titles (or descriptions)?
  57. pa.ag @peakaceag 68 At least try to use simple templates!

    Google usually autogenerates the worst snippets; the same standard fallback page title directly qualifies the affected URLs as duplicates:
  58. pa.ag @peakaceag 70 For tracking, whenever possible use # instead

    of ? Run GA tracking with fragments instead of GET parameters, or automatically remove parameters via hitCallback (after the page view has been measured): Source: https://pa.ag/2TuJMk5 If – for whatever reason – you need to use URL parameters, don't forget to implement canonical tags, and always test in GSC to make sure Google actually respects them.
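    A hedged sketch of the hitCallback variant, assuming Universal Analytics (analytics.js) is already loaded on the page; the parameter-stripping logic is illustrative:

        <script>
          // Measure the page view first, then clean the address bar so the
          // parameterised URL can no longer be copied, linked or crawled.
          ga('send', 'pageview', {
            hitCallback: function () {
              history.replaceState(null, '', location.pathname);
            }
          });
        </script>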
  59. pa.ag @peakaceag 72 Also: do not use your own parameters

    for tracking! Did I mention that parameters actually cause all sorts of problems, constantly?
  60. pa.ag @peakaceag 74 URL parameter settings in Google Search Console

    GSC also allows you to manually configure URL parameters and their effects; please note that this is "only" available for Google.
  61. pa.ag @peakaceag 81 Prevent crawling & indexing: POST-Req. & noindex

    Prevent crawling and indexing of search results: a SERP within a SERP usually leads to a bad user experience and bad signals – Google sees it the same way: [SERP screenshot: About 663,000 results (0.92 seconds)]
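    Both safeguards from this slide as a sketch – the search form submits via POST (so no crawlable /search?q=... URLs arise), and the result template carries a noindex as a fallback; the /search endpoint is illustrative:

        <form method="post" action="/search">
          <input type="text" name="q">
          <button type="submit">Search</button>
        </form>

        <!-- On the search result template -->
        <meta name="robots" content="noindex">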
  62. pa.ag @peakaceag 83 Personalisation: good or bad - and what

    to consider? Consider variable internal linking modules, such as "last viewed" or "you might also like this article", in the link graph: use the non-personalised standard view for Googlebot; personalisation as a "layer on top" is unproblematic from an SEO point of view.
  63. pa.ag @peakaceag 91 Things are much easier now: loading =

    lazy Performance benefits paired with SEO friendliness (and no JS required). Tip: this also works for <iframe src="…" loading="lazy">; most recent versions of Chrome, Firefox and Edge support it already:
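    The same native attribute on below-the-fold images, combined with the size attributes mentioned earlier for CLS; the file name is a placeholder:

        <!-- Lazy-load natively, no JS required; width/height reserve the space -->
        <img src="/img/product-42.jpg" loading="lazy" width="400" height="300" alt="Product 42">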