My talk "Tomorrow, the world" from SMX Munich 2022 covering 15 tips for taking your SEO international, including content localisation, hreflang implementation and best practices, domain setup, geo targeting options and much more.
e.g., a Swiss business with a German and French website
multi-regional: e.g., a website that explicitly targets users in different countries
… or both: e.g., a website might have different versions for DE & CH, and both DE & FR versions of the CH content
audience? You need to give search engines a few hints to let them know which audiences you're catering to. Here are a few pointers to explain this:
Credits: https://pa.ag/2CqYwvB
▪ Content analysis: In which language is the content written, and for which region? e.g., British English is different from American English.
▪ TLD hints: When you use a ccTLD (e.g., .de), search engines assume that you're targeting that country (e.g., Germany).
▪ Primary origin of links: If a website gets most of its links from a specific ccTLD, it's likely that it will be associated with the respective market.
▪ hreflang annotations: Are you using hreflang to communicate which audiences you want to target with which URL?
▪ Content-language: Bing, Baidu & others use the content-language meta tag for targeting hints instead of hreflang.
▪ Google & Bing config: Properly configured language and/or geo targeting in GSC as well as geo targeting in Bing WMT.
▪ Local/business listings: e.g., Google My Business listings and Bing's Places for Business.
▪ And others: This is not a comprehensive list. There are most certainly other signals too.
intent is the why behind a search query: why did the person make this search? Are they looking for information, to make a purchase, or for a specific website?
Informational
▪ "Jason Statham movies"
▪ "Berlin Paris distance"
▪ "what are carbs"
Navigational
▪ "peak ace address"
▪ "gmail"
▪ "instagram login"
Commercial
▪ "Dubai winter temperature"
▪ "haircut near me"
▪ "best webinar software"
Transactional
▪ "Audi rsq8 price"
▪ "champagne next day delivery"
▪ "BER CDG flights"
as well Back in 2007, Microsoft published a patent suggesting that ambiguous queries can be identified with up to 87% accuracy using supervised machine learning:
Source: https://pa.ag/2XHdZTt
“We propose a machine learning model based on search results to identify ambiguous queries. The best classifier achieves accuracy as high as 87%. By applying the classifier, we estimate that about 16% queries are ambiguous in the sampled logs.”
haven't figured this out yet: It’s of utmost importance right now to get intent mapping right; intent means relevance and therefore better rankings. Get this wrong, and you have no chance of ranking long term.
their TLDs Results are now localised regardless of which Google TLD you use. This means that a search made from Germany returns the same results on google.de and google.com.
Source: https://pa.ag/2JMk71d
“[…] the choice of country service will no longer be indicated by domain. Instead, by default, you’ll be served the country service that corresponds to your location. So if you live in Australia, you’ll automatically receive the country service for Australia, but when you travel to New Zealand, your results will switch automatically to the country service for New Zealand.”
in a different language. Converting content "word for word" is technically already translating. This process strives to make the translated text as true to the original as possible.
Key characteristics
▪ The language changes, but the words and the message stay the same.
▪ Language is translated and context is only given in editor's notes.

Translation vs. localisation vs. transcreation
Let's talk some definitions and characteristics, shall we?

Localisation
When you take translated content and edit it to reflect the culture of the target language, you are localising the content. This is key to online marketing, as user experience and readability, for example, are important factors that heavily impact the content format and length.
Key characteristics
▪ The words change, but the meaning behind those words stays the same.
▪ The language is also adapted in a culture-sensitive way.

Transcreation
Transcreation is possible when the author of the source text (in language A) works together with the author of the translated/localised text (in language B). Both authors work together to tell a story to obtain the same effect in different markets.
Key characteristics
▪ The content changes but the business goals stay the same.
▪ The content is developed anew in the local language.
words…
▪ Translation would convert Super Bowl into the German Superschüssel.
▪ Localisation would contextualise Super Bowl and probably use World Cup instead.
▪ Transcreation would build a new, country-specific campaign around the ideas of fun and sport.
On average, German content needs 10% to 35% more space than its English counterpart. You need to plan for that, e.g., when working with your design team!
English → German
▪ Relevant tips → Sachdienliche Hinweise
▪ Employee satisfaction → Mitarbeiterzufriedenheit
▪ Sick note → Arbeitsunfähigkeitsbescheinigung
▪ Financial services company → Finanzdienstleistungsunternehmen
global strategies don't need full adaptation, because keeping the original language adds value for the end-user. Think of all the (unpronounceable) IKEA names. ▪ The world is full of creative examples, especially if we look at major multinational businesses that set the standards of localisation. But the question sometimes is: is localising even worth it? ▪ Sometimes keeping the original language adds value for the end-user, as it adds flair to a product. Just think of all the unpronounceable names at IKEA. ▪ How many memes and jokes have been made around the Swedish company?
But maybe that doesn't really matter? ▪ The English adaptation of the Haribo slogan maintains the effect of the rhyme and the catchy playfulness of the German version. ▪ If translated back (“Kinder und Erwachsene lieben sie, die fröhliche Welt von Haribo”) the new word choice doesn’t convey the original slogan word for word. ▪ So, what makes a message stronger? An exact translation or a localisation of the slogan?
affect images just as much as the written content:
“An example I use a lot is the cover of FIFA, the video game. On the German cover there are players from German teams (say, Neuer for example) whereas on the Brazilian cover there are players from other teams.” Irene Morcilo San Jose, Peak Ace AG
▪ FIFA adopts a flexible approach where there's always one global game cover alongside the regional ones.
▪ Whether a global or regional cover is used in a market depends on internal research. A player like Lionel Messi, featured for 5 consecutive years, is well known in more than one country and doesn't need to be “localised”.
to insult a whole country In 2011, Puma launched a global marketing campaign that used the flag colours of each nation to honour diversity. Someone at Puma clearly forgot to do their research, as the campaign showed a significant lack of cultural understanding in the United Arab Emirates.
Source: https://pa.ag/3MswEGX
▪ In the UAE, anything associated with feet and the floor is considered dirty.
▪ UAE nationals were therefore not pleased to have their national flag turned into shoes.
three: One could go as far as to categorise potential tasks as follows:
Translation
▪ Useful for technical content: manuals, safety warnings, legal or medical documents, etc.
▪ Focus on the source, technical knowledge required
Localisation
▪ Useful for global online marketing in general, international SEO, UX elements, visuals etc.
▪ Balance between source & target language, technical knowledge with language support required
Transcreation
▪ Useful for global campaign ideation and concepts, ad copy, global processes, etc.
▪ Focus on the target language, deep knowledge of the new market culture & creative brief required
Note: This is only a generic categorisation. You should evaluate which skillsets you need on a project-by-project basis.
Source: Deutsche Welle
Here's an excerpt from a Deutsche Welle article, published in autumn 2021 for the German audience.
„Je kälter es draußen wird, desto mehr halten wir uns normalerweise drinnen auf. Und damit kann sich das Corona-Virus in geschlossenen Räumen wieder besser verbreiten. Laut Robert-Koch-Institut (RKI) war die Sieben-Tage-Inzidenz am Wochenende erstmals wieder dreistellig und stieg zu Wochenbeginn auf 110,1 Neuansteckungen je 100.000 Einwohner. Vor einer Woche hatte die Inzidenz noch 74,4 betragen.”
UK

Translation: The colder it gets outside, the more we usually stay indoors; and the more time we spend indoors, the more easily Corona Virus can spread. According to the Robert Koch Institute (RKI) the seven-day incidence was in triple figures this weekend and rose to 110.1 new infections per 100,000 inhabitants at the beginning of the week. Just a week before, the incidence had been 74.4.

Localisation: The colder it gets outside, the more we usually stay indoors; and the more time we spend indoors, the more easily COVID can spread. This is certainly something that has been seen in Germany. According to the German health authority, the Robert Koch Institute, the seven-day incidence was in triple figures this weekend and rose to 110.1 new infections per 100,000 inhabitants at the beginning of the week. Just a week before, the incidence had been 74.4. In England, however, rates have remained lower at 19.7 in spite of fears of the Delta Variant (or, as it was known in Germany, the British Variant) of the Corona Virus spreading across the channel.

Transcreation: The colder it gets outside, the more we usually stay indoors; and the more time we spend indoors, the more easily COVID can spread. The UK is no exception to this. According to the UK government, the case rate rose to 56,000 new infections at the weekend. Just a week before, cases had been stable at 36,000. A clear increase in cases is here along with the colder weather.

US

Translation: The colder it gets outside, the more we usually stay indoors; and the more time we spend indoors, the more easily Corona Virus can spread. According to the Robert Koch Institute (RKI) the seven-day incidence was in triple figures this weekend and rose to 110.1 new infections per 100,000 inhabitants at the beginning of the week. Just a week before, the incidence had been 74.4.

Localisation: The colder it gets outside, the more we usually stay indoors; and the more time we spend indoors, the more easily the Coronavirus can spread. This is certainly something that has been seen in Europe. According to the German health authority, the Robert Koch Institute, the seven-day incidence in Germany was in triple figures this weekend and rose to 110.1 new infections per 100,000 inhabitants at the beginning of the week. Just a week before, the incidence had been 74.4. In the USA, however, rates have remained at the lowest they’ve been since July, with only moderate increases in recent days.

Transcreation: The colder it gets outside, the more we usually stay indoors; and the more time we spend indoors, the more easily the Coronavirus can spread. The United States is no exception to this. According to the CDC, the 7 Day Average Moving Case Rate has risen by 10,000 cases since the beginning of November. However, this is still 80,000 less than what we saw at the height of summer 2021. Whether we will reach the same all-time high of 202,000 that we saw in January 2021 this winter therefore remains to be seen.

Notes
▪ UK: COVID vs. US: Coronavirus
▪ The UK version would talk about the "Delta variant" as this started to rise throughout Europe; the US version would talk about general infection rates being low.
two types of top-level domains: generic TLDs (gTLD) and country code TLDs (ccTLD); ccTLDs have a fixed geo target (e.g., .de = Germany).
▪ gTLD: .com, .net, .org
▪ ccTLD: .de, .fr, .co.uk
Here are some of the most common domain setup options from an SEO point of view:
▪ multiple ccTLDs. Example: peakace.de + peakace.fr
▪ gTLDs in combination with subdomains. Example: de.peakace.com
▪ gTLDs in combination with subfolders. Example: peakace.com/fr/
▪ gTLDs with a combination of subdomains and subfolders. Example: es.peakace.com/es-mx/
variants: One strong global gTLD for all languages or separate (cc)TLDs for every country.

Variant: Separate domains for every country (either ccTLDs only, or a mix of both)
Example: amazon.com, amazon.co.uk, amazon.de
Comments:
▪ Each domain is managed separately
▪ More difficult to establish a brand in new markets (starting from zero)
▪ Strong geotargeting signal for Google (for ccTLDs), GSC possible for gTLDs
▪ Often not possible for every market (domain availability)

Variant: Multiple subdomains under a single global domain (gTLD)
Example: en.wikipedia.org, es.wikipedia.org, de.wikipedia.org, fr.wikipedia.org
Comments:
▪ Medium effort to set up and manage
▪ Subdomains will be treated separately by Google
▪ No benefit from domain authority
▪ Not great for CTR

Variant: Subfolders on a single global domain (gTLD)
Example: netflix.com/de, netflix.com/mx, netflix.com/it
Comments:
▪ Easier to expand to new markets
▪ Massive benefit from domain authority / external links
▪ Depending on CMS, easy to set up and manage
▪ Violating Google's guidelines could impact all versions (very rarely though)

Variant: URL parameters
Example: instagram.com/?hl=en, instagram.com/?hl=de, instagram.com/?hl=es
Comments:
▪ Geotargeting in GSC is not possible
▪ Google doesn't recommend it
it!) Keep in mind: Google crawls primarily from IP addresses in the US.
Source: https://pa.ag/3foCesW
“[…] The client’s website didn’t have a section for the US, so instead all traffic from the US was redirected to a landing page explaining that their service was not available in the US. So that page was the only page that Google ever saw. […] All the other pages on domain.com were basically invisible to Google.”
[Diagram: domain.com, domain.com/gb, domain.com/fr and domain.com/de, with three arrows labelled "redirect"]
Language (required): ISO 639-1, country (optional): ISO 3166-1 Alpha 2
By using hreflang attributes, you can target all English-speaking users (hreflang="en"), but of course, if you have a dedicated version for English-speaking people living in Canada, you can target them by using hreflang="en-ca".
An example of an hreflang attribute:
<link rel="alternate" href="https://example.com/ca" hreflang="en-ca"/>
▪ Language (en): required
▪ Country (ca): not required
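To make the "language required, country optional" rule tangible, here is a minimal Python sketch of a format check. The function name is_valid_hreflang and the two tiny code sets are my own assumptions for illustration; a real check would load the complete ISO 639-1 and ISO 3166-1 Alpha 2 lists, and it ignores less common forms such as script subtags.

```python
import re

# Illustrative samples only (assumption): a real check would load the full
# ISO 639-1 language list and ISO 3166-1 Alpha 2 country list.
LANGUAGES = {"de", "en", "es", "fr", "it"}
COUNTRIES = {"AT", "CA", "CH", "DE", "FR", "GB", "US"}

# Two-letter language, optionally followed by "-" and a two-letter country.
HREFLANG_PATTERN = re.compile(r"([a-z]{2})(?:-([a-z]{2}))?", re.IGNORECASE)


def is_valid_hreflang(value: str) -> bool:
    """Return True if value is x-default, "language" or "language-country"."""
    if value.lower() == "x-default":
        return True
    match = HREFLANG_PATTERN.fullmatch(value)
    if match is None:
        return False
    language, country = match.group(1).lower(), match.group(2)
    if language not in LANGUAGES:
        return False  # covers country-only values such as "GB"
    return country is None or country.upper() in COUNTRIES


if __name__ == "__main__":
    for candidate in ["en", "en-ca", "GB", "en-uk", "x-default"]:
        print(candidate, is_valid_hreflang(candidate))
```

The check is case-insensitive on purpose, and a country code on its own is rejected because the first part is always read as a language.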
localised versions If you have multiple versions of a page for different languages or regions, let Google know about the variants:
▪ Each language version must list itself as well as all other language versions
▪ If two pages don't both point to each other, the tags will be ignored!
▪ Alternate URLs must be fully-qualified
  ▪ e.g., do not forget the correct protocols (http/https)
▪ Implement a language as well as a country code to target various languages in one country (e.g., "fr-ch" & "de-ch")
▪ Consider adding an x-default fallback for other unmatched versions
  ▪ <link rel="alternate" href="http://example.com/" hreflang="x-default" />
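The first two rules (self-reference and return tags) are easy to test once you have crawled the annotations. A minimal Python sketch, assuming you already hold the data as a dictionary that maps each page URL to its {hreflang: target URL} pairs; the function name and the example URLs are hypothetical:

```python
def find_hreflang_issues(annotations):
    """Check self-references and return tags.

    annotations maps each page URL to a dict of {hreflang: target URL}
    as found on that page (assumed input format for this sketch).
    """
    issues = []
    for page, links in annotations.items():
        # Rule 1: every version must list itself.
        if page not in links.values():
            issues.append(f"{page}: missing self-reference")
        # Rule 2: every target must point back to this page.
        for hreflang, target in links.items():
            if target == page:
                continue
            if page not in annotations.get(target, {}).values():
                issues.append(f"{page}: no return tag from {target} ({hreflang})")
    return issues


if __name__ == "__main__":
    cluster = {
        "https://example.com/fr-ch/": {
            "fr-ch": "https://example.com/fr-ch/",
            "de-ch": "https://example.com/de-ch/",
        },
        # The German version forgot to link back to the French one.
        "https://example.com/de-ch/": {
            "de-ch": "https://example.com/de-ch/",
        },
    }
    for issue in find_hreflang_issues(cluster):
        print(issue)
```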
Each has its advantages and disadvantages, so it's a matter of choosing what is best for you and your setup:
HTML
- Vulnerable to errors when using many different TLDs
- Changes made to HTML code can lead to errors in parsing/processing
+ Straightforward implementation in <head>
XML sitemap
+ Allows implementation of changes quickly (centralised)
+ Controllable & CMS-independent setup
- Initial setup is slightly more complex than other methods
Server header
- Complex implementation (server config)
+ Non-HTML files (PDFs) can be integrated
- High maintenance effort (server config)
to implement and easy to maintain, provided you don't have too many setups/pairings and that they don’t get too complex!
Examples:
<link rel="alternate" hreflang="de-AT" href="https://www.domain.at/series/example-page.html"/>
<link rel="alternate" hreflang="de" href="https://www.domain.de/series/example-page.html"/>
<link rel="alternate" hreflang="en-GB" href="https://www.domain.co.uk/series/example-page.html"/>
Disadvantages:
▪ Code bloat: We do not recommend implementing more than 10 hreflang annotations in the HTML <head>. There is a risk that the code will bloat, which is especially bad for slower mobile connections.
▪ Maintenance: Updates need to be rolled out to every individual website at the same time, otherwise hreflang pairings will be broken and can’t work.
▪ Slow recrawling of URLs at a deeper page level often causes pairings to stay broken for longer.
can also be implemented using server headers, but it's harder to both monitor and maintain. However, it pays off for non-HTML files such as PDFs.
Examples:
Link: <https://www.domain.at/series/example-page.html>; rel="alternate"; hreflang="de-AT"
Link: <https://www.domain.de/series/example-page.html>; rel="alternate"; hreflang="de"
Link: <https://www.domain.co.uk/series/example-page.html>; rel="alternate"; hreflang="en-GB"
Disadvantages:
▪ It is more difficult to monitor changes and errors, because the annotations are "invisible" to the end user
▪ The implementation is very complex and the maintenance effort is high, because directives need to be applied at server level, e.g., using Apache .htaccess / nginx conf (which often requires DevOps to be involved)
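If the header value is generated in application code rather than written by hand in the server config, it can be built from the same mapping you use elsewhere. A minimal Python sketch; build_link_header is a hypothetical helper, and it joins all alternates into one comma-separated Link header value instead of sending several Link headers:

```python
def build_link_header(alternates):
    """Build the value of a single HTTP Link header.

    alternates maps hreflang values to absolute URLs (assumed input format).
    Multiple links are joined with commas, which is equivalent to sending
    one Link header per alternate.
    """
    parts = [
        f'<{url}>; rel="alternate"; hreflang="{hreflang}"'
        for hreflang, url in alternates.items()
    ]
    return ", ".join(parts)


if __name__ == "__main__":
    pdf_alternates = {
        "de-AT": "https://www.domain.at/series/example-page.html",
        "de": "https://www.domain.de/series/example-page.html",
        "en-GB": "https://www.domain.co.uk/series/example-page.html",
    }
    print("Link: " + build_link_header(pdf_alternates))
```

How the value is attached to the response (web server config, CDN or application) depends on your stack and is not covered here.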
most practical solution for large-scale setups would be to implement hreflang via XML sitemaps:
Example: sitemap AT
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <url>
    <loc>https://www.domain.at/series/example-page.html</loc>
    <!-- self-reference for the AT page itself -->
    <xhtml:link rel="alternate" hreflang="de-AT" href="https://www.domain.at/series/example-page.html"/>
    <xhtml:link rel="alternate" hreflang="en-GB" href="https://www.domain.co.uk/series/example-page.html"/>
    <xhtml:link rel="alternate" hreflang="de-CH" href="https://www.domain.ch/series/example-page.html"/>
    <xhtml:link rel="alternate" hreflang="de" href="https://www.domain.de/series/example-page.html"/>
  </url>
</urlset>
Advantages of such a setup:
▪ No impact on actual website code/size, meaning there is no impact on web performance
▪ Entirely independent from individual websites; can be set up, run and maintained by a dedicated team
▪ Overall, it is easier to implement and update whilst allowing for faster refresh frequency than individual HTML pages (this means that pairings are usually intact)
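Sitemaps like the one above are usually generated, not hand-written. A minimal Python sketch using only the standard library; build_hreflang_sitemap and the cluster format are my assumptions, and the sketch writes every member of a cluster into one sitemap, whereas a per-domain sitemap (such as "sitemap AT") would only include the URLs of that domain:

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XHTML_NS = "http://www.w3.org/1999/xhtml"


def build_hreflang_sitemap(clusters):
    """Return sitemap XML for a list of {hreflang: URL} clusters.

    Every URL in a cluster gets its own <url> entry listing all members
    of the cluster, including itself, so pairings are reciprocal by design.
    """
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("xhtml", XHTML_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for cluster in clusters:
        for url in cluster.values():
            url_el = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
            ET.SubElement(url_el, f"{{{SITEMAP_NS}}}loc").text = url
            for hreflang, href in cluster.items():
                ET.SubElement(
                    url_el,
                    f"{{{XHTML_NS}}}link",
                    {"rel": "alternate", "hreflang": hreflang, "href": href},
                )
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    example_cluster = {
        "de-AT": "https://www.domain.at/series/example-page.html",
        "de-CH": "https://www.domain.ch/series/example-page.html",
        "de": "https://www.domain.de/series/example-page.html",
        "en-GB": "https://www.domain.co.uk/series/example-page.html",
    }
    print(build_hreflang_sitemap([example_cluster]))
```

Because each entry is derived from the same cluster, a missing return tag or self-reference cannot creep in through manual edits.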
setup No impact on actual web infrastructure: Generating massive sitemaps with millions of URLs won't affect servers responsible for delivering the website.
[Diagram: a central sitemap server, referenced by the DE, UK and FR robots.txt files]
Each TLD's robots.txt file points to its sitemap on the sitemap server.
gives you everything you need, and at a very reasonable price. This makes it a serious alternative to building a custom solution – I am a huge fan! Source: https://www.hreflangbuilder.com/ Features ▪ Localised URL mapping ▪ Complex URL mapping ▪ XML file hosting ▪ Error checking ▪ Automated updates ▪ Missing page identification ▪ Generating XML sitemaps
The x-default specifies where a user should be sent if none of the languages specified in your other hreflang links match their browser settings.
Sometimes the x-default has been included by accident and the page is not a suitable fallback for the rest of the world. English is a good language to use as your fallback option, since so many people speak it, whereas German is not always a great choice (or only works for the DACH region), since fewer people speak it globally.
Source: https://pa.ag/2W9WGpO
The website has content that targets users around the world as follows:
▪ http://example.com/en-gb: For English-speaking users in the UK
▪ http://example.com/en-us: For English-speaking users in the USA
▪ http://example.com/en-au: For English-speaking users in Australia
▪ http://example.com/: The homepage shows users a country selector and is the default page for users worldwide
In this case you can annotate this cluster of pages using HTML like this:
<link rel="alternate" href="http://example.com/en-gb" hreflang="en-gb" />
<link rel="alternate" href="http://example.com/en-us" hreflang="en-us" />
<link rel="alternate" href="http://example.com/en-au" hreflang="en-au" />
<link rel="alternate" href="http://example.com/" hreflang="x-default" />
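To illustrate the fallback idea, here is a deliberately simplified Python model of how a matching URL could be picked for a user's locale. This is a conceptual sketch only and not how any search engine actually resolves hreflang; pick_url and the matching order (exact match, then language only, then x-default) are my assumptions:

```python
def pick_url(annotations, user_locale):
    """Pick a URL for user_locale from {hreflang: URL} annotations.

    Simplified model: exact language-country match first, then a
    language-only match, then the x-default fallback (None if absent).
    """
    normalised = {key.lower(): url for key, url in annotations.items()}
    locale = user_locale.lower()
    if locale in normalised:
        return normalised[locale]
    language = locale.split("-")[0]
    if language in normalised:
        return normalised[language]
    return normalised.get("x-default")


if __name__ == "__main__":
    cluster = {
        "en-gb": "http://example.com/en-gb",
        "en-us": "http://example.com/en-us",
        "en-au": "http://example.com/en-au",
        "x-default": "http://example.com/",
    }
    for locale in ["en-GB", "en-CA", "de-DE"]:
        print(locale, "->", pick_url(cluster, locale))
```

In this model a Canadian or German user ends up on the country selector, because the cluster has neither a generic "en" version nor any German version.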
that can go wrong Sitebulb will check these issues against your hreflang implementation: Source: https://pa.ag/3KmVdDi ▪ Has invalid incoming hreflang annotations ▪ Has invalid outgoing hreflang annotations ▪ Has outgoing hreflang annotations to noindex URLs ▪ Noindex URL has incoming hreflang ▪ Has outgoing hreflang annotations to broken URLs ▪ Has outgoing hreflang annotations to canonicalised URLs ▪ Canonicalised URL has incoming hreflang ▪ Has outgoing hreflang annotations to disallowed URLs ▪ Disallowed URL has incoming hreflang ▪ Has conflicting incoming hreflang annotations ▪ Has conflicting outgoing hreflang annotations ▪ Has multiple self-referencing hreflang annotations ▪ Has outgoing hreflang annotation to multiple URLs ▪ Has outgoing hreflang annotations using relative URLs ▪ Invalid HTML lang attribute ▪ Mismatched hreflang and HTML lang declarations ▪ Missing hreflang annotations ▪ Missing reciprocal hreflang (no return-tag) ▪ Has outgoing hreflang annotations to redirecting URLs ▪ Has unsupported or misconfigured hreflang ▪ Has hreflang annotations using multiple methods ▪ Missing canonical URL ▪ Missing HTML lang attribute ▪ Has hreflang annotations without HTML lang ▪ Hreflang annotation also x-default ▪ Missing self-reference hreflang annotation
hreflang attribute | What you think it is | What it actually is
es-pa | Spanish in Paraguay | Spanish in Panama
kr-kr | Korean in Korea | Kanuri in Korea
cz-cz | Czech in Czech Republic | –
cr-cr | Croatian in Croatia | Cree in Costa Rica
EN-IR | English in Ireland | English in Iran
fr-Mo | French in Monaco | French in Macau
ne-NE | Nepali in Nepal | Nepali in Niger
It doesn't really matter whether you use uppercase or lowercase for the hreflang attribute value. It might make it easier to read, but for Google, it's all the same.
you what the different country codes should look like. Pay attention to the fact that tags must be implemented reciprocally for each URL. Source: https://www.sistrix.de/hreflang-guide/hreflang-generator/ & https://www.aleydasolis.com/english/international-seo-tools/hreflang-tags-generator/
subdomains Using mobile subdomains means that the hreflang must be implemented in the same way as it is for desktop: reciprocally and 1 to 1.
[Diagram: the desktop pages (English, French, German) are connected to each other via hreflang, and so are the mobile pages (English, French, German); each desktop/mobile pair is connected via alternate media and rel=canonical]
through “edge servers“ When we ignore DNS, databases etc. for a minute, this is what it would look like:
First request, ever:
▪ Request: peakace.js
▪ peakace.js is not cached on edge server yet
▪ Request: peakace.js (to the origin server)
▪ Response: peakace.js delivered from origin server
▪ peakace.js gets cached on edge server
through “edge servers“ When we ignore DNS, databases etc. for a minute, this is what it would look like:
Second request (independent of user):
▪ Request: peakace.js
▪ peakace.js is cached on edge server
▪ peakace.js delivered from edge server
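The two request flows can be modelled in a few lines. This is a toy Python sketch of the caching behaviour only, with no TTLs, cache invalidation or real networking; the EdgeServer class and the dictionary standing in for the origin server are assumptions made for illustration:

```python
class EdgeServer:
    """Toy model of an edge server sitting in front of an origin server."""

    def __init__(self, origin_files):
        self.origin_files = origin_files  # stands in for the origin server
        self.cache = {}
        self.origin_requests = 0

    def get(self, path):
        """Return (content, source), where source is "edge" or "origin"."""
        if path in self.cache:
            # Second and later requests: served straight from the edge.
            return self.cache[path], "edge"
        # First request ever: fetch from origin, then cache on the edge.
        self.origin_requests += 1
        content = self.origin_files[path]
        self.cache[path] = content
        return content, "origin"


if __name__ == "__main__":
    edge = EdgeServer({"peakace.js": "console.log('hello');"})
    print(edge.get("peakace.js")[1])  # first request
    print(edge.get("peakace.js")[1])  # second request, any user
    print(edge.origin_requests)
```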
If something seems off, these are the areas we always check first:
▪ You are too impatient: Google needs to (re-)crawl the entire pairing of URLs to understand your implementation
▪ Non-existent hreflang values: they need to be in ISO 639-1 for language & (optionally) ISO 3166-1 Alpha 2 for region
▪ Irrelevant hreflang mapping: e.g., "de-DE" has been mapped to an English URL
▪ Non-existent URLs: broken/malformed URLs, missing protocols, destination returns 4xx/5xx, etc.
▪ Only using the country code: you can target a language on its own, but you can't target a country without a language (e.g., hreflang="GB" doesn’t work!)
▪ Canonicalised elsewhere: an hreflang tag pointing to a URL that canonicalises (or redirects) elsewhere
▪ No-return hreflang tag: the hreflang tag pointing back to the origin is missing
▪ Missing self-referencing hreflang tag: this is often overlooked, especially in larger hreflang groups
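The "canonicalised elsewhere" check is one that is easy to automate once crawl data is available. A minimal Python sketch, assuming you have the hreflang targets of a page and a lookup of each URL's canonical from your crawler; the function name, both input formats and the example URLs are hypothetical:

```python
def find_canonical_conflicts(hreflang_targets, canonicals):
    """List hreflang targets whose canonical points to a different URL.

    hreflang_targets: {hreflang: target URL}
    canonicals: {URL: canonical URL}; URLs missing from this lookup are
    treated as self-canonical (an assumption of this sketch).
    """
    conflicts = []
    for hreflang, url in hreflang_targets.items():
        canonical = canonicals.get(url, url)
        if canonical != url:
            conflicts.append((hreflang, url, canonical))
    return conflicts


if __name__ == "__main__":
    targets = {
        "de": "https://www.domain.de/page.html?ref=nav",
        "en-GB": "https://www.domain.co.uk/page.html",
    }
    crawl_canonicals = {
        "https://www.domain.de/page.html?ref=nav": "https://www.domain.de/page.html",
        "https://www.domain.co.uk/page.html": "https://www.domain.co.uk/page.html",
    }
    for conflict in find_canonical_conflicts(targets, crawl_canonicals):
        print(conflict)
```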