Slide 1

Slide 1 text

Getting to Grips with Python & Machine Learning for SEO Ruth Everett // DeepCrawl https://www.slideshare.net/RuthEverett1 @rvtheverett

Slide 2

Slide 2 text

Ruth Everett Technical SEO Analyst @rvtheverett Getting to Grips with Python & Machine Learning for SEO @rvtheverett @DeepCrawl

Slide 3

Slide 3 text

@rvtheverett @deepcrawl #BrightonSEO Allow: /dogs Allow: /SEO Allow: /python My coding partner in crime

Slide 4

Slide 4 text

PROBLEM SEOs are busy @rvtheverett #BrightonSEO

Slide 5

Slide 5 text

SOLUTION Automation #BrightonSEO @rvtheverett

Slide 6

Slide 6 text

@rvtheverett #BrightonSEO Enter Data Analysis & Automation with Python

Slide 7

Slide 7 text

Getting Started with Python What We’ll Cover How Python can help with Technical SEO An Introduction to Machine Learning for SEO @rvtheverett #BrightonSEO

Slide 8

Slide 8 text

#BrightonSEO GETTING STARTED WITH PYTHON @rvtheverett

Slide 9

Slide 9 text

Before @rvtheverett #BrightonSEO

Slide 10

Slide 10 text

Now @rvtheverett #BrightonSEO

Slide 11

Slide 11 text

WHAT IS PYTHON? Code written in the terminal @rvtheverett #BrightonSEO Results generated Open-source interactive programming language Interpreted line by line

Slide 12

Slide 12 text

COMPANIES USING PYTHON @rvtheverett #BrightonSEO

Slide 13

Slide 13 text

COMPANIES USING PYTHON "Python has been an important part of Google since the beginning, and remains so as the system grows and evolves. Today dozens of Google engineers use Python, and we're looking for more people with skills in this language." @rvtheverett #BrightonSEO

Slide 14

Slide 14 text

COMPANIES USING PYTHON "Python is fast enough for our site and allows us to produce maintainable features in record times, with a minimum of developers" @rvtheverett #BrightonSEO

Slide 15

Slide 15 text

CODECADEMY @rvtheverett #BrightonSEO 20 week online course Mixture of theory and practical A range of projects to undertake Code console & terminal to play and test

Slide 16

Slide 16 text

DATACAMP @rvtheverett #BrightonSEO Wide range of skill tracks Interactive exercises Instant explanations Challenges and projects https://www.datacamp.com/learn/python/

Slide 17

Slide 17 text

SOLOLEARN @rvtheverett #BrightonSEO Free mobile app Learn Python on the go Over 200 practice questions Code Playground https://www.sololearn.com/Course/Python/

Slide 18

Slide 18 text

CODECOMBAT @rvtheverett #BrightonSEO https://codecombat.com/

Slide 19

Slide 19 text

USING PYTHON Mac - Terminal Windows - Command Line @rvtheverett #BrightonSEO

Slide 20

Slide 20 text

USING PYTHON @rvtheverett #BrightonSEO Google Colab

Slide 21

Slide 21 text

USING PYTHON @rvtheverett #BrightonSEO Jupyter Notebook

Slide 22

Slide 22 text

PYTHON LIBRARIES @rvtheverett #BrightonSEO Data extraction & analysis Scientific Computing Natural Language Processing Machine Learning

Slide 23

Slide 23 text

@rvtheverett #BrightonSEO HOW PYTHON CAN HELP WITH TECHNICAL SEO

Slide 24

Slide 24 text

WHY SHOULD WE CARE? @rvtheverett #BrightonSEO Data extraction and analysis to solve complex problems Future-proofing your job Efficiency and time-saving Automating repetitive tasks https://www.ranksense.com/empowering-a-new-generation-of-seos-with-python/

Slide 25

Slide 25 text

WHY SHOULD WE CARE? @rvtheverett #BrightonSEO Spend 5 hours a week using Excel

Slide 26

Slide 26 text

WHY SHOULD WE CARE? @rvtheverett #BrightonSEO Spend 5 hours a week using Excel That's 20 hours a month

Slide 27

Slide 27 text

WHY SHOULD WE CARE? @rvtheverett #BrightonSEO Spend 5 hours a week using Excel That's 20 hours a month Over 200 hours a year

Slide 28

Slide 28 text

WHY SHOULD WE CARE? @rvtheverett #BrightonSEO Imagine what we could achieve if we spent this time on other important tasks (that can’t be automated)

Slide 29

Slide 29 text

WHY SHOULD WE CARE? @rvtheverett @DeepCrawl Redirect Relevancy

Slide 30

Slide 30 text

WHY SHOULD WE CARE? @rvtheverett Pivot Tables @DeepCrawl

Slide 31

Slide 31 text

@rvtheverett #BrightonSEO WHY IS PYTHON GROWING IN POPULARITY IN THE SEO SPACE? Make data driven decisions Allowing us to focus on other important optimisation efforts Confidence in recommendations Provide concrete insights Better understand data

Slide 32

Slide 32 text

AUTOMATING WITH PYTHON @rvtheverett #BrightonSEO Parameter Finder 404 Checker Internal Linking Analysis Image Optimisation Website Scraping Keyword Research
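
As an illustration of the "Parameter Finder" idea, here is a minimal sketch using only the Python standard library. It counts how many crawled URLs carry each query-string parameter. The function name `find_parameters` and the example URLs are hypothetical and are not taken from the talk.

```python
from collections import Counter
from urllib.parse import urlsplit, parse_qsl


def find_parameters(urls):
    """Count how many URLs each query-string parameter appears in."""
    counts = Counter()
    for url in urls:
        query = urlsplit(url).query
        # keep_blank_values=True so that "?sort=" still counts as a parameter
        names = {name for name, _ in parse_qsl(query, keep_blank_values=True)}
        counts.update(names)
    return counts


# Hypothetical crawl export, for illustration only
crawled_urls = [
    "https://example.com/shoes?colour=red&size=9",
    "https://example.com/shoes?colour=blue",
    "https://example.com/hats?utm_source=newsletter",
    "https://example.com/about",
]

parameter_counts = find_parameters(crawled_urls)
for name, count in parameter_counts.most_common():
    print(f"{name}: {count} URL(s)")
```

In practice the URL list would come from a crawl export rather than a hard-coded list.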

Slide 33

Slide 33 text

@rvtheverett #BrightonSEO CHALLENGE - MISSING ALT TEXT SOLUTION - IMAGE CAPTIONING WITH PYTHIA

Slide 34

Slide 34 text

IMAGE CAPTIONING WITH PYTHIA @rvtheverett #BrightonSEO Pythia Modular Framework https://paperswithcode.com/paper/bottom-up-and-top-down-attention-for-image https://learnpythia.readthedocs.io/en/latest/

Slide 35

Slide 35 text

@rvtheverett #BrightonSEO IMAGE CAPTIONING WITH PYTHIA Google Colab Link

Slide 36

Slide 36 text

@rvtheverett #BrightonSEO IMAGE CAPTIONING WITH PYTHIA Google Colab Link

Slide 37

Slide 37 text

@rvtheverett #BrightonSEO IMAGE CAPTIONING WITH PYTHIA

Slide 38

Slide 38 text

@rvtheverett #BrightonSEO IMAGE CAPTIONING WITH PYTHIA

Slide 39

Slide 39 text

It’s not perfect though! @rvtheverett #BrightonSEO IMAGE CAPTIONING WITH PYTHIA

Slide 40

Slide 40 text

@rvtheverett #BrightonSEO CHALLENGE - LARGE IMAGE FILE SIZES SOLUTION - OPTIMISE IMAGES

Slide 41

Slide 41 text

OPTIMISE IMAGES WITH PILLOW @rvtheverett #BrightonSEO Pure Python using the Pillow library This script does optimise images destructively optimize-images filename.jpg Optimise a single image optimize-images ./ Optimise a folder with multiple images Github Link
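
The `optimize-images` commands on this slide belong to the linked script. To give a rough idea of the kind of Pillow calls such a tool can build on, here is a minimal sketch that assumes Pillow is installed. It is not the script's actual implementation: the function name `optimise_jpeg`, the quality setting and the maximum width are all assumptions chosen for illustration. Unlike the script, it returns new bytes rather than overwriting the original file.

```python
from io import BytesIO

from PIL import Image


def optimise_jpeg(data, quality=80, max_width=1600):
    """Return a smaller JPEG built from the given image bytes (non-destructive)."""
    img = Image.open(BytesIO(data))
    # thumbnail() only ever shrinks and keeps the aspect ratio
    img.thumbnail((max_width, img.height))
    if img.mode not in ("RGB", "L"):
        img = img.convert("RGB")  # JPEG cannot store alpha or palette modes
    out = BytesIO()
    img.save(out, format="JPEG", quality=quality, optimize=True, progressive=True)
    return out.getvalue()


# Hypothetical input: a plain 2000x1000 image created in memory
original = BytesIO()
Image.new("RGB", (2000, 1000), (200, 30, 30)).save(original, format="JPEG", quality=100)

optimised = optimise_jpeg(original.getvalue())
print(len(original.getvalue()), "->", len(optimised), "bytes")
```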

Slide 42

Slide 42 text

OPTIMISE IMAGES WITH PILLOW @rvtheverett #BrightonSEO

Slide 43

Slide 43 text

OPTIMISE IMAGES WITH PILLOW @rvtheverett #BrightonSEO

Slide 44

Slide 44 text

OPTIMISE IMAGES WITH PILLOW @rvtheverett #BrightonSEO

Slide 45

Slide 45 text

OPTIMISE IMAGES WITH PILLOW @rvtheverett #BrightonSEO Original Optimised

Slide 46

Slide 46 text

@rvtheverett #BrightonSEO UNDERSTANDING PAGERANK

Slide 47

Slide 47 text

UNDERSTANDING PAGERANK @rvtheverett @DeepCrawl https://colab.research.google.com/drive/1zQ8VFcNmwVLKEMwJ3lhTginPoSC5TdpB

Slide 48

Slide 48 text

@rvtheverett @DeepCrawl https://colab.research.google.com/drive/1zQ8VFcNmwVLKEMwJ3lhTginPoSC5TdpB UNDERSTANDING PAGERANK
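
For readers who want to see the idea behind PageRank without opening the notebook, here is a minimal pure-Python sketch of the usual power-iteration approach on a tiny internal link graph. It is not the code from the linked Colab. The graph, the damping factor of 0.85 and the iteration count are assumptions for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Approximate PageRank for a dict of {page: [pages it links to]}."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1 / n for page in pages}
    for _ in range(iterations):
        # Pages with no outlinks share their rank evenly across the site
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1 - damping) / n + damping * dangling / n for p in pages}
        for source, targets in links.items():
            if targets:
                share = rank[source] / len(targets)
                for target in targets:
                    new_rank[target] += damping * share
        rank = new_rank
    return rank


# Hypothetical three-page site
internal_links = {
    "home": ["about", "blog"],
    "about": ["home"],
    "blog": ["home", "about"],
}

ranks = pagerank(internal_links)
for page, score in sorted(ranks.items(), key=lambda item: item[1], reverse=True):
    print(f"{page}: {score:.3f}")
```

The page that receives the most link equity from the others ends up with the highest score, which is why internal linking structure matters.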

Slide 49

Slide 49 text

@rvtheverett #BrightonSEO No coding knowledge required!

Slide 50

Slide 50 text

OTHER POSSIBILITIES @rvtheverett #BrightonSEO Log File analysis Validate hreflang Identify duplicate URLs Perform competitor analysis Automate page speed audits
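
Log file analysis is one of the easier items on this list to prototype. Here is a minimal sketch that counts requests whose user agent claims to be Googlebot, grouped by URL and status code. It assumes the server writes the common "combined" log format. The sample lines are hypothetical, and a user-agent string can be spoofed, so a real audit would also verify the requesting IP.

```python
import re
from collections import Counter

# Combined log format: ip ident user [time] "request" status bytes "referer" "agent"
LOG_PATTERN = re.compile(
    r'\S+ \S+ \S+ \[[^\]]+\] "(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def googlebot_hits(lines):
    """Count (path, status) pairs for requests whose user agent mentions Googlebot."""
    hits = Counter()
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match and "Googlebot" in match["agent"]:
            hits[(match["path"], match["status"])] += 1
    return hits


GOOGLEBOT = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

# Hypothetical log lines, for illustration only
sample_log = [
    f'192.0.2.1 - - [10/Apr/2020:10:00:00 +0000] "GET /products HTTP/1.1" 200 5120 "-" "{GOOGLEBOT}"',
    f'192.0.2.1 - - [10/Apr/2020:10:00:05 +0000] "GET /old-page HTTP/1.1" 404 312 "-" "{GOOGLEBOT}"',
    '192.0.2.7 - - [10/Apr/2020:10:00:09 +0000] "GET /products HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (Windows NT 10.0)"',
    f'192.0.2.1 - - [10/Apr/2020:10:01:00 +0000] "GET /products HTTP/1.1" 200 5120 "-" "{GOOGLEBOT}"',
    "not a log line",
]

hits = googlebot_hits(sample_log)
for (path, status), count in hits.most_common():
    print(path, status, count)
```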

Slide 51

Slide 51 text

@rvtheverett #BrightonSEO Think about what you can automate!

Slide 52

Slide 52 text

@rvtheverett #BrightonSEO PAGESPEED API WITH PYTHON

Slide 53

Slide 53 text

@rvtheverett #BrightonSEO PAGESPEED API WITH PYTHON https://colab.research.google.com/drive/1Oe1VTocg21KIVDqROXSt15H6CoO905D0
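
As a rough idea of what calling the PageSpeed Insights API from Python involves, here is a minimal sketch using only the standard library. It is not the code from the linked Colab. It assumes version 5 of the API and reads the Lighthouse performance score from the JSON response. The helper names and the sample response are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

PSI_ENDPOINT = "https://www.googleapis.com/pagespeedonline/v5/runPagespeed"


def build_psi_url(page_url, strategy="mobile", api_key=None):
    """Build the request URL for one page; the API key is optional for light use."""
    params = {"url": page_url, "strategy": strategy}
    if api_key:
        params["key"] = api_key
    return f"{PSI_ENDPOINT}?{urlencode(params)}"


def performance_score(response):
    """Convert the 0-1 Lighthouse performance score into a 0-100 figure."""
    score = response["lighthouseResult"]["categories"]["performance"]["score"]
    return round(score * 100)


def fetch_performance_score(page_url, **kwargs):
    """Call the live API (needs network access) and return the score."""
    with urlopen(build_psi_url(page_url, **kwargs)) as reply:
        return performance_score(json.load(reply))


# Hypothetical, heavily trimmed response used to show the parsing step offline
sample_response = {"lighthouseResult": {"categories": {"performance": {"score": 0.92}}}}
print(build_psi_url("https://example.com/"))
print(performance_score(sample_response))
```

Looping `fetch_performance_score` over a list of URLs is one way to automate a page speed audit.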

Slide 54

Slide 54 text

PYTRENDS @rvtheverett #BrightonSEO

Slide 55

Slide 55 text

PYTRENDS @rvtheverett #BrightonSEO

Slide 56

Slide 56 text

OTHER FUN PYTHON PROJECTS @rvtheverett #BrightonSEO Create a bot using Python, Telegram and RandomDog API https://www.practicepython.org/ https://realpython.com/pygame-a-primer/ https://inventwithpython.com/pygame/

Slide 57

Slide 57 text

@rvtheverett #BrightonSEO AN INTRODUCTION TO MACHINE LEARNING FOR SEO

Slide 58

Slide 58 text

WHAT IS MACHINE LEARNING? @rvtheverett #BrightonSEO “Machine learning is an application of artificial intelligence (AI) that provides systems the ability to automatically learn and improve from experience without being explicitly programmed.” https://www.expertsystem.com/machine-learning-definition/

Slide 59

Slide 59 text

POWERING MACHINE LEARNING @rvtheverett #BrightonSEO https://www.expertsystem.com/machine-learning-definition/ Run a script to train the computer, using a dataset

Slide 60

Slide 60 text

POWERING MACHINE LEARNING @rvtheverett #BrightonSEO https://www.expertsystem.com/machine-learning-definition/ Run a script to train the computer, using a dataset Summarise & Visualise the dataset

Slide 61

Slide 61 text

POWERING MACHINE LEARNING @rvtheverett #BrightonSEO https://www.expertsystem.com/machine-learning-definition/ Run a script to train the computer, using a dataset Summarise & Visualise the dataset Evaluate the algorithms

Slide 62

Slide 62 text

POWERING MACHINE LEARNING @rvtheverett #BrightonSEO https://www.expertsystem.com/machine-learning-definition/ Run a script to train the computer, using a dataset Summarise & Visualise the dataset Evaluate the algorithms Make Predictions
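
The four steps above (train on a dataset, summarise it, evaluate, predict) can be sketched in a few lines of plain Python. This is a toy nearest-centroid classifier, not a method from the talk. The page data, the two features and the labels are all made up for illustration, and a real project would more likely use a library such as scikit-learn and scale its features.

```python
from math import dist
from statistics import mean

# Hypothetical labelled pages: (word count, internal links) -> traffic band
train = [
    ((300, 2), "low"), ((450, 3), "low"), ((500, 1), "low"),
    ((1500, 12), "high"), ((1800, 15), "high"), ((2100, 10), "high"),
]
test = [((400, 2), "low"), ((1700, 11), "high")]


def summarise(rows):
    """Steps 1 and 2: learn the mean of each feature per label (the 'centroid')."""
    by_label = {}
    for features, label in rows:
        by_label.setdefault(label, []).append(features)
    return {
        label: tuple(mean(column) for column in zip(*group))
        for label, group in by_label.items()
    }


def predict(centroids, features):
    """Step 4: pick the label whose centroid is closest to the new page."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


centroids = summarise(train)

# Step 3: evaluate on pages the model has not seen
correct = sum(predict(centroids, features) == label for features, label in test)
accuracy = correct / len(test)

print("centroids:", centroids)
print("accuracy:", accuracy)
print("prediction for a new page:", predict(centroids, (1200, 8)))
```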

Slide 63

Slide 63 text

REAL WORLD MACHINE LEARNING EXAMPLES @rvtheverett #BrightonSEO RankBrain NLP Computer Vision BERT

Slide 64

Slide 64 text

REAL WORLD MACHINE LEARNING EXAMPLES @rvtheverett #BrightonSEO Twitter Curated Timelines

Slide 65

Slide 65 text

REAL WORLD MACHINE LEARNING EXAMPLES @rvtheverett #BrightonSEO Facebook Chatbots https://ipullrank.com/machine-learning-guide/how-to-set-up-a-chatbot/

Slide 66

Slide 66 text

REAL WORLD MACHINE LEARNING EXAMPLES @rvtheverett #BrightonSEO Personalised Recommendations https://medium.com/netflix-techblog/artwork-personalization-c589f074ad76

Slide 67

Slide 67 text

REAL WORLD MACHINE LEARNING EXAMPLES @rvtheverett #BrightonSEO Personalised Recommendations https://medium.com/netflix-techblog/artwork-personalization-c589f074ad76

Slide 68

Slide 68 text

@rvtheverett #BrightonSEO DATA IS THE FUEL FOR MACHINE LEARNING

Slide 69

Slide 69 text

SUPERVISED LEARNING @rvtheverett #BrightonSEO

Slide 70

Slide 70 text

SUPERVISED LEARNING @rvtheverett #BrightonSEO

Slide 71

Slide 71 text

SUPERVISED LEARNING

Slide 72

Slide 72 text

SUPERVISED LEARNING

Slide 73

Slide 73 text

UNSUPERVISED LEARNING @rvtheverett #BrightonSEO

Slide 74

Slide 74 text

UNSUPERVISED LEARNING @rvtheverett #BrightonSEO

Slide 75

Slide 75 text

UNSUPERVISED LEARNING

Slide 76

Slide 76 text

MACHINE LEARNING SIMPLIFIED @rvtheverett #BrightonSEO - Ethem Alpaydin Machine learning will help us make sense of an increasingly complex world. Already we are exposed to more data than what our sensors can cope with or our brains can process.

Slide 77

Slide 77 text

SEO POSSIBILITIES WITH MACHINE LEARNING @rvtheverett #BrightonSEO Evaluating Content Quality Log File Analysis Predictive analysis Title Tag Optimisation User Engagement Insights Audio Transcribing

Slide 78

Slide 78 text

@rvtheverett #BrightonSEO PREDICTIVE PREFETCHING

Slide 79

Slide 79 text

PREDICTIVE PREFETCHING @rvtheverett #BrightonSEO https://guess-js.github.io/docs Automate the process of predictive prefetching

Slide 80

Slide 80 text

PREDICTIVE PREFETCHING @rvtheverett #BrightonSEO https://guess-js.github.io/docs Predict the next page a user is likely to visit and prefetch these pages.

Slide 81

Slide 81 text

PREDICTIVE PREFETCHING @rvtheverett #BrightonSEO https://guess-js.github.io/docs Predict the next page a user is likely to visit and prefetch these pages. Predict the next piece of content (article, product, video) a user is likely to want to view and adjust or filter the user experience to account for this.

Slide 82

Slide 82 text

PREDICTIVE PREFETCHING @rvtheverett #BrightonSEO https://guess-js.github.io/docs Predict the next page a user is likely to visit and prefetch these pages. Predict the next piece of content (article, product, video) a user is likely to want to view and adjust or filter the user experience to account for this. Predict the types of widgets an individual user is likely to interact with more (e.g. games) and use this data to tailor a more custom experience.
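
To make "predict the next page a user is likely to visit" concrete, here is a minimal sketch of one simple way to do it: count page-to-page transitions in past sessions and pick the most frequent follow-up. Guess.js is a JavaScript project and this is not its API. The session data and function names are hypothetical.

```python
from collections import Counter, defaultdict


def build_transitions(sessions):
    """Count how often each page is followed by each other page."""
    transitions = defaultdict(Counter)
    for pages in sessions:
        for current_page, next_page in zip(pages, pages[1:]):
            transitions[current_page][next_page] += 1
    return transitions


def predict_next(transitions, page, top_n=1):
    """Return the most likely next pages as (page, probability) pairs."""
    followers = transitions.get(page)
    if not followers:
        return []
    total = sum(followers.values())
    return [(target, count / total) for target, count in followers.most_common(top_n)]


# Hypothetical navigation paths, for example exported from analytics
sessions = [
    ["/", "/products", "/products/shoes", "/checkout"],
    ["/", "/products", "/products/hats"],
    ["/", "/blog", "/products"],
    ["/products", "/products/shoes"],
]

transitions = build_transitions(sessions)
print(predict_next(transitions, "/products"))
print(predict_next(transitions, "/checkout"))
```

Pages returned with a high enough probability are the candidates a site could prefetch.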

Slide 83

Slide 83 text

@rvtheverett #BrightonSEO INTERNAL LINKING

Slide 84

Slide 84 text

INTERNAL LINKING @rvtheverett #BrightonSEO Crawl to identify broken internal links Algorithm to suggest the most accurate replacement page Replace broken internal links
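
The slide does not say which algorithm suggests the replacement page. As one simple stand-in, here is a sketch that matches each broken URL to the most similar live URL using string similarity from the standard library's `difflib`. The URLs, the 0.6 similarity cutoff and the function name are assumptions for illustration; a production approach might compare page content instead.

```python
from difflib import get_close_matches


def suggest_replacement(broken_url, live_urls, cutoff=0.6):
    """Return the live URL most similar to the broken one, or None if nothing is close."""
    matches = get_close_matches(broken_url, live_urls, n=1, cutoff=cutoff)
    return matches[0] if matches else None


# Hypothetical crawl data
live_urls = [
    "/blog/python-for-seo",
    "/blog/machine-learning-seo",
    "/products/shoes",
]
broken_links = ["/blog/python-for-seo-guide", "/qqq"]

suggestions = {url: suggest_replacement(url, live_urls) for url in broken_links}
for broken, replacement in suggestions.items():
    print(broken, "->", replacement)
```

Suggestions like these are best reviewed by a person before any links are replaced.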

Slide 85

Slide 85 text

INTERNAL LINKING @rvtheverett #BrightonSEO

Slide 86

Slide 86 text

@rvtheverett #BrightonSEO CONTENT QUALITY

Slide 87

Slide 87 text

CONTENT QUALITY @rvtheverett #BrightonSEO Search Volume Uniqueness Freshness Internal Links Word Count Search Traffic Heading Tags Time on page Bounce Rate Conversion Rate Model generates insights on the factors that are most important.

Slide 88

Slide 88 text

CONTENT QUALITY @rvtheverett #BrightonSEO Important content factors Machine Learning Model Content Quality Score

Slide 89

Slide 89 text

@rvtheverett #BrightonSEO USER EXPERIENCE

Slide 90

Slide 90 text

USER EXPERIENCE @rvtheverett #BrightonSEO https://github.com/mgechev/guess-next Sentiment analysis - Instagram bullying language

Slide 91

Slide 91 text

USER EXPERIENCE @rvtheverett #BrightonSEO https://github.com/mgechev/guess-next Image cropping - Twitter

Slide 92

Slide 92 text

USER EXPERIENCE @rvtheverett #BrightonSEO https://github.com/mgechev/guess-next Image cropping - Twitter

Slide 93

Slide 93 text

USER EXPERIENCE @rvtheverett #BrightonSEO https://github.com/mgechev/guess-next Image cropping - Twitter

Slide 94

Slide 94 text

USER EXPERIENCE @rvtheverett #BrightonSEO https://github.com/mgechev/guess-next Image cropping - Twitter

Slide 95

Slide 95 text

USER EXPERIENCE @rvtheverett #BrightonSEO https://github.com/mgechev/guess-next Computer Vision

Slide 96

Slide 96 text

USER EXPERIENCE @rvtheverett #BrightonSEO https://github.com/mgechev/guess-next Computer Vision - Making images accessible

Slide 97

Slide 97 text

USER EXPERIENCE @rvtheverett #BrightonSEO https://github.com/mgechev/guess-next Chatbots - Helping users find the most useful content

Slide 98

Slide 98 text

USER EXPERIENCE @rvtheverett #BrightonSEO https://github.com/mgechev/guess-next Chatbots - Helping users find the most useful content Remember trust is important - let users know if they are talking to a bot rather than a human

Slide 99

Slide 99 text

@rvtheverett #BrightonSEO NATURAL LANGUAGE PROCESSING

Slide 100

Slide 100 text

MACHINE LEARNING TOOLS @rvtheverett #BrightonSEO Google’s NLP Model Natural Language uses machine learning to reveal the structure and meaning of text. Analyses text to understand the sentiment, as well as extract key information. https://cloud.google.com/natural-language/
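
As a rough sketch of what a sentiment request to the Cloud Natural Language REST API looks like from Python, the snippet below builds the JSON body for the `documents:analyzeSentiment` method and interprets a response. It does not send anything, since a real call needs a Google Cloud project and credentials. The sample response and the 0.25 thresholds used to label the score are assumptions for illustration.

```python
import json

ANALYZE_SENTIMENT_URL = "https://language.googleapis.com/v1/documents:analyzeSentiment"


def build_sentiment_request(text):
    """Build the JSON body for a plain-text sentiment request."""
    return {
        "document": {"type": "PLAIN_TEXT", "content": text},
        "encodingType": "UTF8",
    }


def describe_sentiment(response, threshold=0.25):
    """Label the document score (which ranges from -1 to 1) using an assumed threshold."""
    score = response["documentSentiment"]["score"]
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"


body = build_sentiment_request("The new site is fast and really easy to use.")
print(json.dumps(body, indent=2))

# Hypothetical response, trimmed to the fields used here
sample_response = {"documentSentiment": {"score": 0.8, "magnitude": 0.8}}
print(describe_sentiment(sample_response))
```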

Slide 101

Slide 101 text

MACHINE LEARNING TOOLS @rvtheverett #BrightonSEO https://cloud.google.com/natural-language/

Slide 102

Slide 102 text

MACHINE LEARNING TOOLS @rvtheverett #BrightonSEO https://cloud.google.com/natural-language/

Slide 103

Slide 103 text

MACHINE LEARNING TOOLS @rvtheverett #BrightonSEO https://cloud.google.com/natural-language/

Slide 104

Slide 104 text

MACHINE LEARNING TOOLS @rvtheverett #BrightonSEO https://github.com/BritneyMuller/colab-notebooks @BritneyMuller

Slide 105

Slide 105 text

MACHINE LEARNING TOOLS @rvtheverett #BrightonSEO https://github.com/BritneyMuller/colab-notebooks Entity Salience

Slide 106

Slide 106 text

MACHINE LEARNING TOOLS @rvtheverett #BrightonSEO https://github.com/BritneyMuller/colab-notebooks Entity Categorisation

Slide 107

Slide 107 text

@rvtheverett #BrightonSEO https://codelabs.developers.google.com/codelabs/tensorflow-for-poets/#0 IMAGE CATEGORISATION

Slide 108

Slide 108 text

TENSORFLOW FOR POETS @rvtheverett #BrightonSEO https://codelabs.developers.google.com/codelabs/tensorflow-for-poets/#0 Retrain an already trained model using transfer learning for a similar problem. Train a simple classifier to classify images of flowers.

Slide 109

Slide 109 text

TENSORFLOW FOR POETS @rvtheverett #BrightonSEO https://codelabs.developers.google.com/codelabs/tensorflow-for-poets/#0

Slide 110

Slide 110 text

TENSORFLOW FOR POETS @rvtheverett #BrightonSEO https://codelabs.developers.google.com/codelabs/tensorflow-for-poets/#0

Slide 111

Slide 111 text

@rvtheverett #BrightonSEO THE FUTURE OF SEO Understand and solve problems faster

Slide 112

Slide 112 text

@rvtheverett #BrightonSEO THE FUTURE OF SEO Make data driven decisions

Slide 113

Slide 113 text

@rvtheverett #BrightonSEO THE FUTURE OF SEO Focus on other important optimisation activities

Slide 114

Slide 114 text

@rvtheverett #BrightonSEO THE FUTURE OF SEO Improve user experience

Slide 116

Slide 116 text

TALK TO YOUR DEVELOPERS

Slide 117

Slide 117 text

JOIN COMMUNITIES https://pyslackers.com/web

Slide 118

Slide 118 text

https://www.100daysofcode.com/ KEEP PRACTICING AND HAVE FUN

Slide 119

Slide 119 text

PEOPLE TO FOLLOW @britneymuller @hamletbatista @TylerReardon @DataChaz @dawnieando @jroakes @jessthebp @aysunakarsu @math_rachel

Slide 120

Slide 120 text

DEEPCRAWL PROFESSIONAL SERVICES @BermanHale @allophonousrex @rachelleighrva @NeilDesai @theJimmyB0b @Rick_BarK

Slide 121

Slide 121 text

KEY TAKEAWAYS @rvtheverett #BrightonSEO Python can help technical SEOs increase their efficiency. Being able to better understand data will lead to better decisions being made. Anyone can learn Python, with a little commitment. Have fun with it and see what you can create.

Slide 122

Slide 122 text

@rvtheverett #BrightonSEO

Slide 123

Slide 123 text

USEFUL RESOURCES @rvtheverett #BrightonSEO https://www.python.org/ https://www.searchenginejournal.com/python-seo-data-reference-guide/287927/ https://www.searchenginewatch.com/2019/02/06/using-python-to-recover-seo-site-traffic-part-one/ https://cs109.github.io/2015/ https://www.deepcrawl.com/blog/webinars/scaling-automated-quality-text-generation-for-enterprise-sites/ https://automatetheboringstuff.com/ https://towardsdatascience.com/beginners-guide-to-machine-learning-with-python-b9ff35bc9c51 https://www.searchenginejournal.com/python-technical-seo/330515 https://www.searchenginejournal.com/introduction-to-python-seo-spreadsheets/342779/ https://www.fullstackpython.com/ https://www.tensorflow.org/learn

Slide 124

Slide 124 text

THANK YOU #BrightonSEO Ruth Everett Technical SEO Analyst @rvtheverett // @deepcrawl