Slide 1

Slide 1 text

Cracking The Code: Secrets Of Technical SEO
By Rad Paluszak

Slide 2

Slide 2 text

The Search Initiative
Rad Paluszak: Cracking The Code

Introduction

Contents
• Machine Learning From The SEO Perspective
• Hummingbird
• RankBrain
• Natural Language Processing
• Algorithmic Content Analysis
• Link Assessment
• Super-Technical Goodies ☺
• Ontological Page Structure
• Log Analysis
• How to Recover From The “Medic Update”?

Slide 3

Slide 3 text

Rad Paluszak, Director of SEO
• “SEO” birthday: 2010 (Caffeine update)
• Web developer “at heart”
• Algorithms <3
• Machine Learning <3
• Data Mining <3
• “Technical SEO Artist”
📧 [email protected]

Slide 4

Slide 4 text

Machine Learning From The SEO Perspective

Machine Learning
Machine learning is a method of data analysis that automates analytical model building. It is a branch of artificial intelligence based on the idea that systems can learn from data, identify patterns and make decisions with minimal human intervention.

Slide 5

Slide 5 text

Deep Learning
Deep learning is part of a broader family of machine learning methods based on learning data representations (e.g. ontology), as opposed to task-specific algorithms.
A deep neural network (DNN) is an artificial neural network (ANN) with multiple layers between the input and output layers. The DNN finds the correct mathematical manipulation to turn the input into the output, whether it be a linear relationship or a non-linear relationship. The network moves through the layers calculating the probability of each output.
Image: By Glosser.ca [CC BY-SA 3.0 (https://creativecommons.org/licenses/by-sa/3.0)], from Wikimedia Commons https://commons.wikimedia.org/wiki/File:Colored_neural_network.svg
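To make "moving through the layers calculating the probability of each output" concrete, here is a minimal sketch of a forward pass in plain Python. The layer sizes, the ReLU activation and the softmax output are assumptions chosen for illustration; the slide does not name a specific architecture, and the weights in the usage lines are made up.

```python
import math

def dense(inputs, weights, biases):
    """One fully connected layer: a weighted sum per output neuron plus a bias."""
    return [sum(x * w for x, w in zip(inputs, row)) + b
            for row, b in zip(weights, biases)]

def relu(values):
    """Non-linear activation (an assumption; other activations work too)."""
    return [max(0.0, v) for v in values]

def softmax(values):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def forward(inputs, layers):
    """layers is a list of (weights, biases); ReLU between layers, softmax last."""
    activations = inputs
    for weights, biases in layers[:-1]:
        activations = relu(dense(activations, weights, biases))
    weights, biases = layers[-1]
    return softmax(dense(activations, weights, biases))

# Hypothetical, hand-picked weights: 2 inputs -> 2 hidden neurons -> 2 outputs.
example_layers = [
    ([[0.5, -0.2], [0.1, 0.8]], [0.0, 0.1]),
    ([[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0]),
]
print(forward([1.0, 2.0], example_layers))
```

In a real network the weights are learned from data rather than written by hand; the sketch only shows how an input becomes a probability per output.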

Slide 6

Slide 6 text

Hummingbird
Codename given to a significant algorithm change in Google Search in 2013. Its name was derived from the speed and accuracy of the hummingbird.
• Conversational Search
• Natural Language Processing
• Query Intent
• Semantic Model Analysis
• New Ranking Signals
• Importance of Authority
• Long-tail Focused (RLY?)

Slide 7

Slide 7 text

No content

Slide 8

Slide 8 text

No content

Slide 9

Slide 9 text

No content

Slide 10

Slide 10 text

No content

Slide 11

Slide 11 text

Get. Challenge Google. Assistant. to understand…. Ahrefs CEO

Slide 12

Slide 12 text

No content

Slide 13

Slide 13 text

No content

Slide 14

Slide 14 text

No content

Slide 15

Slide 15 text

RankBrain
RankBrain is a machine-learning artificial intelligence system whose use Google confirmed on 26 October 2015. It helps Google process search results and return more relevant ones to users.
If RankBrain sees a word or phrase it isn’t familiar with, it can guess which words or phrases might have a similar meaning and filter the results accordingly, making it more effective at handling never-before-seen search queries or keywords.

Slide 16

Slide 16 text

RankBrain
Understands the similarity of queries based on multidimensional vector space analysis and the proximity of one query to another.
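The idea of "proximity in a vector space" can be illustrated with cosine similarity. This is only a toy sketch: it assumes a simple bag-of-words vector per query, whereas a system like RankBrain would rely on learned representations that are not public. The function names are hypothetical.

```python
import math
from collections import Counter

def query_vector(query: str) -> Counter:
    """Toy bag-of-words vector (token -> count); a stand-in for a learned embedding."""
    return Counter(query.lower().split())

def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse vectors: 1.0 means same direction."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Queries that share more terms end up closer together.
close = cosine_similarity(query_vector("best electric bike"),
                          query_vector("best electric scooter"))
far = cosine_similarity(query_vector("best electric bike"),
                        query_vector("machu picchu trek"))
print(close, far)
```

With word counts, two queries are "close" only when they share words; the appeal of learned vectors is that queries with similar meaning but different wording can still land near each other.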

Slide 17

Slide 17 text

Diagram stages (in the order extracted from the slide):
• Results Post-processing
• Relevance Matching
• Intent Analysis
• NLP
• Query Parsing
• User Query
• Results
• Satisfaction Analysis (CTR, BR)

Patents:
• User-context-based search engine: https://patents.google.com/patent/US9449105B1/en
• System and method for providing search query refinements: https://patents.google.com/patent/US20050055341A1/en
• Predicting Site Quality: https://patents.google.com/patent/US9767157B2/en

Slide 18

Slide 18 text

• Language Use Analysis
• Improved NLP & Understanding
• Sentiment & Relevance Learning
• Intent Distinction
• Engagement Prediction
• Trend Awareness
• Purpose Discovery
• Intelligent Classification
• Deep Learning

Slide 19

Slide 19 text

Intent Interpretation

Slide 20

Slide 20 text

Query Matching Purpose Instant Answer Latent Intent Interpretation Semantic Structure Freshness Engagement Related Topics User Intent Content Depth Keyword Verticals

Slide 21

Slide 21 text

Natural Language Processing
• Machine Translation
• Information Retrieval
• Sentiment Analysis
• Information Extraction
• Question Answering

Slide 22

Slide 22 text

Natural Language Processing
Apache OpenNLP
Image: CC BY-SA 3.0, https://en.wikipedia.org/w/index.php?curid=11556338
Tag legend:
• S: Simple declarative clause
• NP: Noun phrase
• VP: Verb phrase
• DT: Determiner
• JJ: Adjective
• NN: Noun, singular or mass
• VBZ: Verb, 3rd person singular present
• VBN: Verb, past participle
• TO: to
Reference: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.9.8216&rep=rep1&type=pdf

Slide 23

Slide 23 text

Algorithmic Content Analysis
https://natural-language-understanding-demo.ng.bluemix.net/

Slide 24

Slide 24 text

KNOWLEDGE BOMB
“Google is assessing the quality of your content, UX and purpose. (Remember Page Layout Algorithm?)”

Slide 25

Slide 25 text

https://patents.google.com/patent/US20070033168A1/en
• Huge Banner at The Top Of The Page
• Desktop Notifications Prompt

Slide 26

Slide 26 text

TF-IDF, WDF*IDF, WTF
Mathematically, TF-IDF is the product of term frequency (TF), how often a keyword appears on a page, and inverse document frequency (IDF), which down-weights terms that appear in many documents of a larger reference set.
WDF*IDF (Within Document Frequency * Inverse Document Frequency) is an analysis method for determining the keywords and terms that sustainably increase the relevance of published texts. The result is the weighted relative frequency of a term in a document, compared with all other web documents that contain the analysed keyword.
WTF: I have no idea what I’m doing ☺
SEO PowerSuite
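As a worked illustration of the TF-IDF product described above, here is a minimal sketch in plain Python. It assumes the common textbook variant (relative term frequency times the natural log of N divided by document frequency); SEO tools may use different weightings, and the three tiny "pages" are made up.

```python
import math
from collections import Counter

def tf_idf(docs: list[list[str]]) -> list[dict[str, float]]:
    """Score every term of every tokenised document with tf * idf."""
    n_docs = len(docs)
    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(doc))

    scores = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        scores.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores

# Hypothetical mini-corpus: three very short "pages".
pages = [
    "paint sprayer review".split(),
    "paint brush".split(),
    "paint roller review".split(),
]
for page_scores in tf_idf(pages):
    print(page_scores)
```

A term that appears on every page ("paint" here) scores zero, while a term unique to one page ("sprayer") scores highest. That is the intuition behind comparing your page's terms against the pages already ranking.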

Slide 27

Slide 27 text

KNOWLEDGE BOMB
“Simple TF-IDF tweaks can work very well for your 80/20 rule!”

Slide 28

Slide 28 text

TF-IDF, WDF*IDF, WTF
BEFORE / AFTER

Slide 29

Slide 29 text

TF-IDF, WDF*IDF, WTF

Slide 30

Slide 30 text

KNOWLEDGE BOMB
“Google always tries to predict your site structure and assess what’s worth crawling and what is not.”

Slide 31

Slide 31 text

ML In Crawling & Crawl Budget Management

Slide 32

Slide 32 text

ML In Crawling & Crawl Budget Management
“I didn’t learn f*ck-all about the site structure.”
Googlebot image credits: http://www.thesempost.com/blocking-googlebot-with-bad-bot-scripts-wordfence/ https://www.seroundtable.com/google-crawl-report-problem-19894.html
Scheduler for search engine crawler: https://patents.google.com/patent/US7725452B1/en

Slide 33

Slide 33 text

KNOWLEDGE BOMB
“The Manual Penalty process starts with a pre-selection of suspicious candidates detected by machine learning algorithms.”

Slide 34

Slide 34 text

ML In Link Assessment
• Manual Penalty
• Manual Verification
• Classification
• Pre-Screening
• Machine Learning

Slide 35

Slide 35 text

ML In Link Assessment

Slide 36

Slide 36 text

Chaos Theory
• Branch of mathematics that deals with systems that appear to be orderly but, in fact, harbor chaotic behaviors.
• Chaos theory applies to complex, dynamic systems highly sensitive to initial conditions.
• The deterministic nature of these systems does not make them predictable.
• Chaos paradoxically leads to formal structure and order.

• 1,653 launches
• 9,800 live traffic experiments
• 18,015 side-by-side experiments
• 130,336 search quality tests

Slide 37

Slide 37 text

Super-Technical Goodies

Ontological Page Structure

Ontology
A model for describing the world that consists of a set of types, properties, and relationship types. There is also generally an expectation that the features of the model in an ontology should closely resemble the real world (related to the object).
• Relational Classification
• Logical Categorisation
• Semantic Domain AKA Universe
Source: https://blog.grakn.ai/what-is-an-ontology-c5baac4a2f6c

Slide 38

Slide 38 text

KNOWLEDGE BOMB
“RankBrain uses all sorts of ontologies (multidimensional vector spaces) to learn the Web. = Google MUST understand your site structure.”

Slide 39

Slide 39 text

Ontological Page Structure
01. Not Good ☺ Flat, dead, WTF?
02. Looks OK, but… Visible clusters, some core pages hard to crawl.
03. Looks weird, but… All (trust me) core pages within 1 click from … EVERYWHERE. Great structure. The “hanging” pages are only sponsored posts.
04. Looks Good (?) Yes ☺
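A claim like "core pages within 1 click" can be checked from a crawl export by computing click depth with a breadth-first search over the internal link graph. This is a minimal sketch: the `links` dictionary and the URLs in it are hypothetical, and in practice the graph would come from a crawler export.

```python
from collections import deque

def click_depths(links: dict[str, list[str]], start: str) -> dict[str, int]:
    """Minimum number of clicks needed to reach each page from `start` (BFS)."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:  # first visit is the shortest path in BFS
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths

# Hypothetical internal link graph: page -> pages it links to.
site_links = {
    "/": ["/reviews/", "/guides/"],
    "/reviews/": ["/reviews/under-1000/", "/"],
    "/reviews/under-1000/": ["/reviews/under-1000/brompton-electric/"],
    "/guides/": ["/"],
    "/orphan/": ["/"],  # links out, but nothing links to it
}
depths = click_depths(site_links, "/")
unreachable = set(site_links) - set(depths)
print(depths, unreachable)
```

Pages with a large depth, or pages missing from the result altogether, are the ones that are "hard to crawl" in the sense used on the slide.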

Slide 40

Slide 40 text

(OBVIOUS) KNOWLEDGE BOMB
“The easiest way to show Google your site’s ontological nature is to use a silo-based structure. DO NOT SPAM IN THE URLs!!!!”

Slide 41

Slide 41 text

Silo Structure Results
• Increased MOM Traffic: month-on-month comparison once the whole process finished.
• Improved Crawlability: Google activity in terms of monthly events logged.
+11.52%, +8.7%

• In this case we used canonicals, not 301s, to limit fluctuations (In Google We Trust…)
• ~60 pages: entire cluster size
• 6 days to index 8 core pages under new URLs
• ~2 weeks to transfer the 8 core pages to the new URLs
• ~5.5 weeks to move the whole cluster over
• ~3% traffic fluctuation in the meantime

Slide 42

Slide 42 text

Bad Examples
• /business/business-management/
• /air-sprayers/handheld-air-sprayers/
• /air-sprayers/handheld-air-sprayers/graco-16Y385-paint-sprayer/
• /trecks/machu-picchu-trecks/ausangate-trek/
• /ebike-reviews/best-electric-bikes/ebikes-under-1000/brompton-electric-review/

Good Examples
• /business/management/
• /air-sprayers/handheld/
• /air-sprayers/handheld/graco-16Y385-review/
• /trecks/machu-picchu/ausangate-trail/
• /reviews/under-1000/brompton-electric/
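What most of the bad examples have in common is that a word from a parent folder is repeated in a child folder. A small check for that pattern might look like the sketch below. The helper name is hypothetical and the rule (exact word repeated across path segments) is an assumption; it will not catch near-duplicates such as "ebike" and "ebikes".

```python
def repeated_path_words(path: str) -> set[str]:
    """Return words that appear in more than one segment of a URL path."""
    segments = [segment for segment in path.strip("/").split("/") if segment]
    seen: set[str] = set()
    repeated: set[str] = set()
    for segment in segments:
        words = set(segment.lower().split("-"))
        repeated |= words & seen  # words already used by a parent segment
        seen |= words
    return repeated

for url_path in [
    "/business/business-management/",
    "/business/management/",
    "/air-sprayers/handheld-air-sprayers/",
    "/air-sprayers/handheld/graco-16Y385-review/",
]:
    print(url_path, repeated_path_words(url_path))
```

An empty set means no segment repeats a word from a segment above it, which matches the "good" URLs on the slide.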

Slide 43

Slide 43 text

Server Log Analysis
Screaming Frog Log Analyser is your friend ☺
• Understanding Crawl Budget
• Confirming Devaluation
• Finding undiscoverable issues
• Analysing Google Crawl Habits
Googlebot image credits: http://www.thesempost.com/blocking-googlebot-with-bad-bot-scripts-wordfence/ https://www.seroundtable.com/google-crawl-report-problem-19894.html

Slide 44

Slide 44 text

Server Log Analysis
• Healthy Site
• Devalued Site

Slide 45

Slide 45 text

Server Log Analysis
• Direct URL: 11%
• Query String: 89%
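A split like the one above can be reproduced from raw access logs without a dedicated tool. The sketch below assumes the split refers to Googlebot requests and that the server writes the common "combined" log format; the sample lines are made up, and matching on the user-agent string alone does not prove a request really came from Google.

```python
import re
from collections import Counter

# Captures the requested URL and the user agent from a combined-format log line.
LOG_PATTERN = re.compile(
    r'"(?:GET|HEAD|POST) (?P<url>\S+) [^"]*" \d{3} .*"(?P<agent>[^"]*)"$'
)

def googlebot_url_split(lines) -> Counter:
    """Count Googlebot requests to plain URLs versus URLs with a query string."""
    counts = Counter()
    for line in lines:
        match = LOG_PATTERN.search(line.strip())
        if not match or "Googlebot" not in match.group("agent"):
            continue
        kind = "query_string" if "?" in match.group("url") else "direct"
        counts[kind] += 1
    return counts

# Made-up sample lines for illustration only.
GOOGLEBOT = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
sample_lines = [
    f'66.249.66.1 - - [01/Oct/2018:10:00:00 +0000] "GET /reviews/ HTTP/1.1" 200 5120 "-" "{GOOGLEBOT}"',
    f'66.249.66.1 - - [01/Oct/2018:10:00:05 +0000] "GET /search?q=bike HTTP/1.1" 200 2048 "-" "{GOOGLEBOT}"',
    f'66.249.66.1 - - [01/Oct/2018:10:00:09 +0000] "GET /reviews/?page=2 HTTP/1.1" 200 4096 "-" "{GOOGLEBOT}"',
    '203.0.113.7 - - [01/Oct/2018:10:00:12 +0000] "GET /reviews/?page=3 HTTP/1.1" 200 4096 "-" "Mozilla/5.0 (Windows NT 10.0)"',
]
split = googlebot_url_split(sample_lines)
print(split)
```

When most crawler hits go to parameterised URLs, crawl budget is being spent on variants of pages rather than on the pages themselves.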

Slide 46

Slide 46 text

KNOWLEDGE BOMB
“Google was still crawling URLs from a version of the site redesigned 4 years ago.”

Slide 47

Slide 47 text

Server Log Analysis

Slide 48

Slide 48 text

Server Log Analysis

Slide 49

Slide 49 text

“Medic” Case Study #1
• Patient: Ambitious Affiliate Site
• Issue: -794 positions down
• Diagnosis: Algorithmic Content Devaluation
• Recovery Time: From 31.07-08.08 to 22.09-08.10
• Final Result: +1,097 positions up

Slide 50

Slide 50 text

KNOWLEDGE BOMB
“Write CONCISE content where user intent requires short answers! Do not blindly go by 1k+ words per page.”

Slide 51

Slide 51 text

“Medic” Case Study #2
• Patient: Whitelabel Adventure Travel Site
• Issue: -187 positions down (mainly top 5)
• Diagnosis: Content (NOT KEYWORD!) Cannibalisation
• Recovery Time: From 31.07-11.08 to 28.09-07.10
• Final Result: +382 positions up

Slide 52

Slide 52 text

“Medic” Case Study #2
• Patient: Whitelabel Adventure Travel Site
• Issue: -187 positions down (mainly top 5)
• Diagnosis: Content (NOT KEYWORD!) Cannibalisation
• Recovery Time: From 31.07-11.08 to 28.09-07.10
• Final Result: +382 positions up

Slide 53

Slide 53 text

KNOWLEDGE BOMBS
1. “Ontological Content Structure is just as important as the site structure.”
2. “Content Cannibalisation is usually more dangerous than keyword cannibalisation.”
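One rough way to spot content (rather than keyword) cannibalisation is to measure how much text two pages share. The sketch below uses Jaccard similarity over three-word shingles; this is an illustrative heuristic with hypothetical page texts, not a description of how Google measures overlap, and any threshold for "too similar" would need to be chosen per site.

```python
def shingles(text: str, size: int = 3) -> set[tuple[str, ...]]:
    """All runs of `size` consecutive words in the text, lowercased."""
    words = text.lower().split()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}

def jaccard(a: set, b: set) -> float:
    """Shared shingles divided by all distinct shingles (0.0 to 1.0)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

# Hypothetical page texts.
page_a = "the ausangate trek is a high altitude route near cusco"
page_b = "the ausangate trek is a high altitude route for experienced hikers"
page_c = "how to pack light for a weekend city break"

print(jaccard(shingles(page_a), shingles(page_b)))
print(jaccard(shingles(page_a), shingles(page_c)))
```

Pairs of pages with a high score are candidates for merging, rewriting or removal, in line with the advice to avoid repeating the same material in every article.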

Slide 54

Slide 54 text

Ranking in the post-“Medic” world
1. Build Personas that are not easy to track down. Why the f*ck would you call it Jimmy Hendrix?
2. Ensure good, tree-like navigation. Menu (drop-downs, mega-menu, etc.) is your friend.
3. E-A-T is not just author bio and social media. Google also needs to “confirm” whatever you’re saying in external sources!
4. Do not write fluff content. Google understands the context, intent, etc.
5. Don’t stuff ads and affiliate links in every possible hole. Medic and the updates that followed are strictly focused on quality measures.
6. Plan for latent intent. Cover your back whenever possible.
7. Avoid content cannibalisation. No need to have the same sh*t in every article.
8. Get rid of unnecessary pages. I’ve recently ranked more sites by removing content than by creating it.
9. Look at your index management. Housekeeping is always a good thing!
10. Save crawl budget. Google ain’t gonna waste it on yo stinky site!

Slide 55

Slide 55 text

KNOWLEDGE BOMBS
“With SICK onsite foundations you need fewer links.”

Slide 56

Slide 56 text

No content

Slide 57

Slide 57 text

https://www.youtube.com/watch?v=QXf95_EKS6E

Slide 58

Slide 58 text

The Search Initiative
• Matt Diggity: Founder & Head Of Insights
• Rafid Nassir: Commercial Director
• Will Bagnall: Co-founder & CEO
• Rad Paluszak: Director of SEO

Slide 59

Slide 59 text

No content