Navigating the AI Revolution - SMX London 2024

Slide 1

Slide 1 text

Navigating the AI Revolution Elevating Your SEO with a Touch of Magic Bastian Grimm | Peak Ace AG | @basgr 25th October 2024

Slide 2

Slide 2 text

We are on the brink of the most profound tech-transition in human history.

Slide 3

Slide 3 text

AI is changing everything: From healthcare to education and jobs – hardly any aspect of our everyday lives will remain unaffected by AI.

Slide 4

Slide 4 text

The innovations of today are built on the technological breakthroughs made in recent years

Slide 5

Slide 5 text

5 peakace.agency Vast amounts of data have been published online for decades – all of which are available as training material for LLMs. #1

Slide 6

Slide 6 text

#2 Progress in Deep Learning: Machines are now able to take over various human activities and make precise predictions.

Slide 7

Slide 7 text

#3 Resource Availability: Significant advances in cloud, infrastructure and consumer technology are making it easier and cheaper than ever to develop and deploy AI technology.

Slide 8

Slide 8 text

Those advancements will not only have an impact on AI development, but also on our work life

Slide 9

Slide 9 text

With generative AI, 30% of the hours worked today could be automated by 2030. Source: https://pa.ag/3FbhX8J

Slide 10

Slide 10 text

How people are using GenAI
Six top-level themes give an immediate sense of what generative AI is currently being used for:
• Technical Assistance & Troubleshooting (23%)
• Content Creation & Editing (22%)
• Personal & Professional Support (17%)
• Learning & Education (15%)
• Creativity & Recreation (13%)
• Research, Analysis & Decision Making (10%)
Source: https://pa.ag/4brddtI

Slide 11

Slide 11 text

For SEO, this is already the case.

Slide 12

Slide 12 text

I appreciate that you’re all extremely busy…

Slide 13

Slide 13 text

This you?

Slide 14

Slide 14 text

50,000+

Slide 15

Slide 15 text

70,000+

Slide 16

Slide 16 text

My goal for today’s session: Give you back some of your extremely valuable time!

Slide 17

Slide 17 text

What have I brought for you today?
▪ SEO Automation
▪ Internal Linking
▪ Redirects
▪ Custom GPTs

Slide 18

Slide 18 text

Before we dive in, let’s have a look at the key ingredients

Slide 19

Slide 19 text

What are Large Language Models (LLMs)?

Slide 20

Slide 20 text

Simply put: Large Language Models (LLMs) are AI systems trained on vast data sets (thus “large”) to understand, predict and generate data using transformer-based neural networks.

Slide 21

Slide 21 text

What are LLMs good at?

Slide 22

Slide 22 text

What are LLMs good at?
▪ Information Retrieval and Analysis: LLMs can sift through large volumes of text data to extract relevant information, summarise key points, and answer questions, making them valuable for research, data analysis, and decision-making support.
▪ Personalised Recommendations: LLMs can analyse user preferences and behaviour to provide personalised recommendations, such as articles or products, thus enhancing UX and engagement.
▪ Natural Language Processing: LLMs excel in understanding language, making them ideal for applications such as chat bots, language translation, sentiment analysis, and text summarisation.

Slide 23

Slide 23 text

What are LLMs NOT good at?

Slide 24

Slide 24 text

What are LLMs NOT good at?
▪ Understanding Context Beyond Training Data: LLMs may not perform well in situations requiring an understanding of context or knowledge beyond their original training data set.
▪ Making Ethical or Moral Judgments: LLMs lack the ability to make ethical or moral judgments and should not be used in situations where such considerations are crucial. Most LLMs’ decisions are also biased.
▪ Limited Understanding and Reasoning: LLMs can't form a chain of logical conclusions; instead, they follow probability rules. Even if the most common answer to a question is irrational or outright wrong, the model will still provide it.

Slide 25

Slide 25 text

Keep in mind: There are risks that need to be managed
(Obviously, this is true for both commercial and open-source models)
▪ Consent: Ensuring training data is gathered responsibly, in compliance with AI governance and regulations.
▪ Security: Security risks include data leaks or malicious use of LLMs by criminals.
▪ Bias: Happens when the data source is not diverse or representative enough.
▪ Hallucinations: Happen when the model produces fluent, plausible-sounding output that is factually wrong or not supported by its data.
Source: https://pa.ag/3Td5ucz

Slide 26

Slide 26 text

Will hallucinations ever disappear?
LLMs are designed to predict the next word – of course there will be cases where the model is wrong.
"It’s inherent in the mismatch between the technology and the proposed use cases," says Emily Bender, professor in the Department of Linguistics and director of the Computational Linguistics Laboratory at the University of Washington.
Source: https://pa.ag/3PqP0Mh

Slide 27

Slide 27 text

LLMs are not good at creating original content: LLMs don’t “write” anything. They generate text based on probabilities and the number of parameters used in their training, using content they've encountered before.

Slide 28

Slide 28 text

Enough theory, now let’s get cooking: In all the following cases, AI/ML and related technologies will be combined with other ingredients to elevate the final product's flavor and complexity.

Slide 29

Slide 29 text

No-Code SEO Automation #1 Core Ingredients:

Slide 30

Slide 30 text

Have you tried Make.com? Boost productivity across every area or team. Use Make to design powerful workflows without having to rely on developer resources. Source: https://pa.ag/3Usq4oI

Slide 31

Slide 31 text

Tons of pre-built modules: Scraping, parsing, reading/writing/storing data in different formats & sources, any-2-any API connections, etc.

Slide 32

Slide 32 text

More than "just" no-code workflow automation: Simply drag and drop apps to automate existing workflows or build new complex processes to save time, e.g., create short-form social media copy based on your WP posts using ChatGPT, then send it to LinkedIn, FB, etc. Source: https://pa.ag/3Usq4oI

Slide 33

Slide 33 text

WordPress is dominating the CMS market: 43.5% of all websites are using WP (based on W3Techs’ data). Source: https://pa.ag/3BQXCr1

Slide 34

Slide 34 text

Lots of pre-built functions for WP out of the box: Before you can use any of these, you need to install the WP plug-in and specify an auth-key. Source: https://pa.ag/3UgCdxO

Slide 35

Slide 35 text

"Make an API Call" The one that excites me the most?

Slide 36

Slide 36 text

Query (your) WordPress through its built-in API, e.g. fetching (pre-selected/filtered) posts from your website, or literally anything else you can think of: Source: https://pa.ag/4dLEO9L
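A minimal sketch of what such a call could look like outside of Make.com, assuming the standard WordPress REST route `/wp-json/wp/v2/posts` and its `per_page`, `categories` and `_fields` parameters. The site URL and the category ID are placeholders, and the helper name `build_posts_url` is made up for illustration:

```python
from typing import Optional
from urllib.parse import urlencode


def build_posts_url(site: str, per_page: int = 10, category_id: Optional[int] = None) -> str:
    """Build a WordPress REST API URL that lists posts (hypothetical helper)."""
    # _fields trims the response down to the attributes we care about.
    params = {"per_page": per_page, "_fields": "id,link,title"}
    if category_id is not None:
        # Optional filter: only posts from one category (ID is a placeholder).
        params["categories"] = category_id
    return f"{site.rstrip('/')}/wp-json/wp/v2/posts?{urlencode(params)}"


# example.com and category 12 are illustrative values only.
posts_url = build_posts_url("https://example.com/", per_page=5, category_id=12)
print(posts_url)
```

Pasting the resulting URL into the "Make an API Call" module (or a browser) should return the filtered posts as JSON, provided the site exposes its REST API.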

Slide 37

Slide 37 text

Run Make.com’s WordPress module (later, auto-schedule): You'll get all the details for a single post, from the title and content to the metadata and more:

Slide 38

Slide 38 text

You’ve got the data. Great, but now what?

Slide 39

Slide 39 text

Pass it to any other module you can possibly imagine. How about Google Search Console, just for fun?

Slide 40

Slide 40 text

Before being able to query GSC, you need to transform WordPress’ API response:

Slide 41

Slide 41 text

The built-in JSON parser is here to help! The WP API call returns a JSON response by default, which means we need to access the response body and get the link-attribute value (as string/text):
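For anyone who prefers to see the same step as code: a small sketch that parses a JSON body and pulls out each post's `link` value as a string. The sample body is invented and only mimics the shape of a WordPress posts response; `extract_links` is a hypothetical helper, not part of Make.com:

```python
import json
from typing import List

# Invented sample that mimics the shape of a WP posts response.
sample_body = (
    '[{"id": 1, "link": "https://example.com/hello-world/",'
    ' "title": {"rendered": "Hello world"}}]'
)


def extract_links(body: str) -> List[str]:
    """Parse the response body and return every post's link as text."""
    posts = json.loads(body)
    return [str(post["link"]) for post in posts if "link" in post]


links = extract_links(sample_body)
print(links)
```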

Slide 42

Slide 42 text

Live indexing check using GSC API: This took me almost no time to build.
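The slide does not show which GSC endpoint the scenario calls. As a rough sketch, assuming it is the URL Inspection endpoint of the Search Console API, the request body and the interpretation of the answer could look like this. The sample response is invented, authentication is left out entirely, and both helper names are hypothetical:

```python
import json

# Assumption: the URL Inspection endpoint of the Search Console API.
INSPECT_ENDPOINT = "https://searchconsole.googleapis.com/v1/urlInspection/index:inspect"


def build_inspection_body(page_url: str, property_url: str) -> str:
    """JSON body for one inspection request (OAuth handling not shown)."""
    return json.dumps({"inspectionUrl": page_url, "siteUrl": property_url})


def is_indexed(response: dict) -> bool:
    """Treat a PASS verdict in the index status as 'indexed'."""
    status = response.get("inspectionResult", {}).get("indexStatusResult", {})
    return status.get("verdict") == "PASS"


# Invented response, shaped like the documented result object.
sample_response = {
    "inspectionResult": {
        "indexStatusResult": {"verdict": "PASS", "coverageState": "Submitted and indexed"}
    }
}

body = build_inspection_body("https://example.com/hello-world/", "https://example.com/")
print(body, is_indexed(sample_response))
```

In Make.com the link value from the previous step would take the place of the hard-coded page URL.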

Slide 43

Slide 43 text

Endless possibilities: Generating alternative copywriting titles (per URL) based on suggestions from Google Search Console to improve CTR

Slide 44

Slide 44 text

Endless possibilities: Creating/suggesting FAQ sections for your existing content including structured data

Slide 45

Slide 45 text

Endless possibilities: Sentiment analysis using Google Cloud Natural Language, with insights from Gemini (Vertex AI) on how to improve content

Slide 46

Slide 46 text

You get the idea…

Slide 47

Slide 47 text

Internal Linking #2 Core Ingredients:

Slide 48

Slide 48 text

For any of the cases: Start small, validate initial ideas & concepts, then strategically scale up where feasible.

Slide 49

Slide 49 text

There’s often more than one solution to achieve the same goal

Slide 50

Slide 50 text

How well does your internal linking reflect your website’s key topics, and how can you find out?

Slide 51

Slide 51 text

For background: Analysing and understanding topical clusters is crucial for assessing relevance and thematic context of internal links.

Slide 52

Slide 52 text

Collect internal URL inventory with a crawling tool: Export to Excel (or Sheets) and filter out irrelevant links (e.g., to 404s, noindex, etc.). Once done, feed the data into Gephi and watch the magic unfold: Gephi is a powerful tool that transforms complex data into dynamic visual networks, making hidden connections instantly visible.

Slide 53

Slide 53 text

Gephi turns data into knowledge
Gephi delivers data-driven insights by visualising your website's internal linking, hierarchy, and topical structure, automatically calculating PageRank and Modularity metrics.
Gephi in a nutshell:
▪ Turn raw data into actionable insights with just a few clicks
▪ Uncover internal linking patterns and relationships with Gephi’s visual maps
▪ Pinpoint key nodes and clusters using PageRank and Modularity classes
Structure of most "regular" websites:
Source: Gephi

Slide 54

Slide 54 text

PageRank & Modularity illustrate authority and clusters
A brief overview of key concepts and their significance: While PageRank focuses on the importance of individual nodes (pages), Modularity helps identify groups of interconnected nodes (communities).
PageRank
▪ PageRank (in this case) is calculated within a single website
▪ The calculation is based on how pages are linked to each other
▪ Pages with many incoming links from other authoritative pages have a higher PageRank
Modularity
▪ Modularity measures how well a network decomposes into modular communities
▪ In the context of website analysis, communities represent groups of closely related pages within a site
▪ The objective is to minimise the number of clusters, while ensuring that each cluster contains only thematically related pages
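To make the PageRank half of this less abstract, here is a minimal sketch of an internal PageRank calculation using plain power iteration. The four-page link graph, the damping factor of 0.85 and the iteration count are illustrative assumptions; this is not Gephi's implementation, and Modularity is not covered:

```python
from typing import Dict, List


def pagerank(links: Dict[str, List[str]], damping: float = 0.85, iterations: int = 50) -> Dict[str, float]:
    """Power-iteration PageRank over an internal link graph (source -> targets)."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with the "random jump" share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's rank evenly across its outgoing links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Page without outgoing links: spread its rank over all pages.
                for other in pages:
                    new_rank[other] += damping * rank[page] / n
        rank = new_rank
    return rank


# Made-up mini site: every page links back to the homepage.
site = {
    "/": ["/depot/", "/etf/", "/kontakt/"],
    "/depot/": ["/", "/etf/"],
    "/etf/": ["/", "/depot/"],
    "/kontakt/": ["/"],
}
ranks = pagerank(site)
for url, value in sorted(ranks.items(), key=lambda item: -item[1]):
    print(url, round(value, 4))
```

The homepage collects links from all other pages and therefore ends up with the highest score, which is the "many incoming links" effect described above.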

Slide 55

Slide 55 text

Another solution: Using Python to calculate existing PageRank and simulate potential future changes from internal linking adjustments

Slide 56

Slide 56 text

PageRank recalculation prior to implementation
Johan von Hülsen's script lets you test optimisations by calculating PageRank for different scenarios, showing URL relevance changes before implementation.
Script: https://pa.ag/4e28PTG

URL | Internal PageRank | New Internal PageRank (index pages only) | Change in %
/kontakt/ | 0.01653 | 0.00076 | -95.4%
/presse/ | 0.00779 | - | -100%
/auszeichnungen/ | 0.00582 | - | -100%
/einlagensicherung/ | 0.00552 | 0.00797 | +44.3%
/depot/ | 0.00531 | 0.00797 | +50.0%
/fondsuebersicht/ | 0.00528 | 0.00795 | +50.5%
/etf/ | 0.00517 | 0.00797 | +54.1%
/aktien/ | 0.00511 | 0.00795 | +55.5%
/steuern/ | 0.00501 | 0.00915 | +82.6%

The main difference is that the script focuses on relevance and PageRank, while Gephi also accounts for thematic relationships through Modularity.
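The "Change in %" column is a plain relative change between the two PageRank values. A small sketch, using two rows from the slide as input; the function name is made up and this is not taken from the linked script:

```python
def percent_change(old: float, new: float) -> float:
    """Relative change from the old to the new PageRank, in percent."""
    return (new - old) / old * 100.0


# Two rows from the slide: /kontakt/ and /steuern/.
kontakt_change = round(percent_change(0.01653, 0.00076), 1)
steuern_change = round(percent_change(0.00501, 0.00915), 1)
print(kontakt_change, steuern_change)
```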

Slide 57

Slide 57 text

Manually optimising for better Modularity and PageRank distribution is time-consuming and error-prone.

Slide 58

Slide 58 text

Optimise PageRank and Modularity with ChatGPT
Use PageRank and Modularity data, provide ChatGPT with insights on which landing pages need better linking, and let AI recalculate for optimised results, streamlining the process for efficiency and speed.
▪ Export data from Gephi (PageRank, Modularity)
▪ ChatGPT optimises the data based on your requirements
▪ Visualise the new data

Slide 59

Slide 59 text

AI-powered internal linking: ChatGPT ‘knows’ Gephi data. Leverage ChatGPT to analyse and optimise internal linking by uploading PageRank and Modularity data directly from Gephi.

Slide 60

Slide 60 text

AI-powered internal linking: provide focus areas. For the next step, provide corresponding instructions to ChatGPT: which pages or areas should be focused on? Which should be removed? Where to focus for better interlinking? In our case: remove certain sections from linking entirely, include specific pages we consider important, and provide focus areas or core topics for more prominent overall linking.

Slide 61

Slide 61 text

AI-powered internal linking: download new dataset. Based on your instructions, ChatGPT optimises the existing data and completely recalculates the PageRank and Modularity for the URLs and topics.

Slide 62

Slide 62 text

Compare old vs new (utilising Gephi again): Watch as new PageRank and Modularity calculations reveal the optimised internal linking structure and significantly improved topical clustering between pages:

Slide 63

Slide 63 text

The ’AI Revolution’ in internal linking: faster, better results
By leveraging AI (ChatGPT) and smart automation, this process becomes faster, more efficient, and more accurate.
Key benefits include:
▪ Incredibly easy to use, suitable for junior- and mid-level SEOs
▪ Data-driven and far more accurate than “traditional guesswork”
▪ Significantly faster than manual methods, freeing up valuable time for other tasks
▪ Manually evaluating topical clusters in internal linking is nearly impossible, especially on large-scale websites
Sistrix visibility development after deployment:

Slide 64

Slide 64 text

Redirects #3 Core Ingredients:

Slide 65

Slide 65 text

AI-powered redirect mapping is much faster and more efficient than using spreadsheets

Slide 66

Slide 66 text

Embeddings and vector database = redirect win
Necessary steps for better automated redirects (and an improved customer journey):
▪ Extract main content of every (old) site/URL
▪ Generate embeddings
▪ Save together with metadata in vector database
▪ Semantic search in vector DB based on embeddings of old URLs

Slide 67

Slide 67 text

What are embeddings? Embeddings are numerical vectors representing words, capturing their meanings and relationships in a multidimensional space.

Slide 68

Slide 68 text

What are embeddings? You can convert any word into a vector and start calculating with them: "king" minus "man" plus "woman" equals "queen". Synonyms and more can also be found this way.
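A toy illustration of that arithmetic. The two-dimensional vectors below are hand-made for this example (axis 0 stands for "royalty", axis 1 for "maleness"); real embeddings come out of a trained model and have hundreds of dimensions:

```python
import math
from typing import Dict, List

# Hand-made toy vectors, NOT real model output.
vectors: Dict[str, List[float]] = {
    "king": [1.0, 1.0],
    "queen": [1.0, -1.0],
    "man": [0.0, 1.0],
    "woman": [0.0, -1.0],
    "apple": [-1.0, 0.0],
}


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity: 1.0 means 'pointing in the same direction'."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


# king - man + woman, computed component by component.
target = [k - m + w for k, m, w in zip(vectors["king"], vectors["man"], vectors["woman"])]

# Look for the closest word that was not part of the calculation.
candidates = {word: vec for word, vec in vectors.items() if word not in ("king", "man", "woman")}
nearest = max(candidates, key=lambda word: cosine(target, candidates[word]))
print(nearest)
```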

Slide 69

Slide 69 text

What about vector databases? A vector DB utilises data embeddings as index, facilitating fast and scalable searches among unstructured data points, enhancing efficiency in retrieving similar items or information.

Slide 70

Slide 70 text

Simply put: A vector DB allows you to find matches between anything and anything (e.g., use an image as a query to find similar pieces of text, video, other images, etc.).

Slide 71

Slide 71 text

Putting it all together: A quick, step-by-step overview

Slide 72

Slide 72 text

Extracting the main content of every old URL
▪ Extract: <title> + main content
▪ Combine: <title>, <h1>, <h2>s and first & last sentence of each paragraph
Combine everything: Content = Title + h1 + h2s + …
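A rough sketch of this extraction step using only Python's standard library. It keeps the text of `<title>`, `<h1>` and `<h2>` elements plus the first and last sentence of every `<p>`, in document order. The sentence splitting is deliberately naive, the class and function names are made up, and real pages will need more robust main-content detection:

```python
import re
from html.parser import HTMLParser


def first_and_last_sentence(paragraph: str) -> str:
    """Keep only the first and last sentence of a paragraph (naive split)."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", paragraph) if s]
    if len(sentences) <= 2:
        return " ".join(sentences)
    return f"{sentences[0]} {sentences[-1]}"


class ContentExtractor(HTMLParser):
    """Collects title, h1, h2 and shortened paragraph text."""

    TAGS = {"title", "h1", "h2", "p"}

    def __init__(self) -> None:
        super().__init__()
        self.current = None
        self.buffer = []
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.TAGS:
            self.current = tag
            self.buffer = []

    def handle_data(self, data):
        if self.current:
            self.buffer.append(data)

    def handle_endtag(self, tag):
        if tag == self.current:
            # Collapse whitespace before storing the text.
            text = " ".join("".join(self.buffer).split())
            if tag == "p":
                text = first_and_last_sentence(text)
            if text:
                self.parts.append(text)
            self.current = None


def extract_content(html: str) -> str:
    parser = ContentExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


sample_html = (
    "<html><head><title>Depot</title></head><body><h1>Open a depot</h1>"
    "<p>First one. Middle one. Last one.</p><h2>Fees</h2></body></html>"
)
content = extract_content(sample_html)
print(content)
```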

Slide 73

Slide 73 text

Generate embeddings and store in vector database
For each website URL:
▪ Transfer previously generated content to vector DB
▪ Generate embeddings (BERT, GloVe, FastText)
▪ Save embeddings in a vector DB incl. metadata (URL, title, etc.)
(Diagram: each content block is mapped to a vector, e.g. 0.03 … 0.19, -0.21 … 0.03, 0.08 … -0.15)

Slide 74

Slide 74 text

Search the vector database for the best semantic match
For every outdated page:
▪ Vector-based semantic search for KNN (k-nearest neighbour)
▪ Set 301 to NN URL
▪ No more weak redirects
▪ Play with certainty/temperature settings
{Get { Article ( nearVector: { limit: 1, content: { vector:[embedding], certainty: 0.8 } } ) { url } }}
(Diagram labels: Future 404, 0.31 … -0.41)
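What the vector DB does for you here can be sketched in a few lines: compare the embedding of the outdated page with every candidate and keep the nearest neighbour, but only if it clears a threshold. The vectors are invented, and `min_similarity` is a plain cosine cut-off standing in for the certainty setting in the query above (the two are not necessarily on the same scale):

```python
import math
from typing import Dict, List, Optional, Tuple


def cosine(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def best_redirect(
    old_vector: List[float],
    new_pages: Dict[str, List[float]],
    min_similarity: float = 0.8,
) -> Tuple[Optional[str], float]:
    """Return the nearest new URL, or None if nothing is similar enough."""
    best_url, best_score = None, -1.0
    for url, vector in new_pages.items():
        score = cosine(old_vector, vector)
        if score > best_score:
            best_url, best_score = url, score
    if best_score >= min_similarity:
        return best_url, best_score
    # Below the threshold: better no redirect than a weak one.
    return None, best_score


# Invented embeddings for one future 404 and two live pages.
future_404 = [0.31, -0.41, 0.10]
live_pages = {
    "/etf/": [0.30, -0.40, 0.12],
    "/kontakt/": [-0.20, 0.50, 0.30],
}
target_url, score = best_redirect(future_404, live_pages)
print(target_url, round(score, 3))
```

Raising or lowering `min_similarity` is the code equivalent of "playing with certainty": a stricter value produces fewer but safer 301 targets.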

Slide 75

Slide 75 text

Don’t want to do this all by hand? I know… efficiency and all!

Slide 76

Slide 76 text

Screaming Frog has a killer feature to do this out of the box
In my defence, I still believe it’s crucial to understand what's happening “behind the scenes”…
▪ Turn on JS Rendering, then head to Configuration > Custom > Custom JavaScript
▪ Select Add from Library > (ChatGPT) Extract embeddings […]
▪ Click on “JS” to open the code and add your OpenAI key
Source: https://pa.ag/40fdlu2

Slide 77

Slide 77 text

Down the rabbit hole…

Slide 78

Slide 78 text

State-of-the-art sentence transformers are the gold standard
The Levenshtein distance (basic fuzzy matching) provides an alternative, as we’re mainly dealing with small text snippets and minimal deviations between URL versions. The more substantial the changes between two versions, the higher the likelihood that you’ll reap significant benefits from leveraging sentence transformers.
Source: https://pa.ag/49RHG3y (h/t Will Nye for the data set)
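For reference, a compact sketch of the Levenshtein distance and how it could be used to pick the closest new URL for an old one. The URLs are made up, and `closest_url` is a hypothetical helper rather than anything from the linked source:

```python
from typing import List


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character edits to turn a into b."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


def closest_url(old_url: str, new_urls: List[str]) -> str:
    """Pick the new URL with the smallest edit distance to the old one."""
    return min(new_urls, key=lambda candidate: levenshtein(old_url, candidate))


match = closest_url("/etf-sparplan/", ["/etf/sparplan/", "/kontakt/"])
print(match, levenshtein("/etf-sparplan/", match))
```

This works well when old and new URLs differ by a character or two; once slugs are rewritten entirely, edit distance stops being meaningful and semantic matching takes over.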

Slide 79

Slide 79 text

Rule of thumb: Calculating similarity scores across multiple elements and selecting the best matches always works best.
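One way to read this rule of thumb as code: score each element (title, h1, URL) separately, combine the scores with weights, and keep the candidate with the best total. The weights and the sample pages are arbitrary assumptions, and `difflib.SequenceMatcher` is used here only as a simple stand-in for whichever similarity measure you prefer:

```python
from difflib import SequenceMatcher
from typing import Dict, List

# Arbitrary example weights; tune them against your own QA results.
WEIGHTS = {"title": 0.4, "h1": 0.4, "url": 0.2}


def similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def combined_score(old: Dict[str, str], new: Dict[str, str]) -> float:
    """Weighted sum of the per-element similarity scores."""
    return sum(
        weight * similarity(old.get(field, ""), new.get(field, ""))
        for field, weight in WEIGHTS.items()
    )


def best_match(old: Dict[str, str], candidates: List[Dict[str, str]]) -> Dict[str, str]:
    return max(candidates, key=lambda candidate: combined_score(old, candidate))


old_page = {"url": "/etf-sparplan/", "title": "Open an ETF savings plan", "h1": "ETF savings plan"}
new_pages = [
    {"url": "/etf/sparplan/", "title": "Open an ETF savings plan", "h1": "ETF savings plan"},
    {"url": "/kontakt/", "title": "Contact", "h1": "Contact us"},
]
winner = best_match(old_page, new_pages)
print(winner["url"], round(combined_score(old_page, winner), 3))
```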

Slide 80

Slide 80 text

Whatever you choose… you need solid QA afterwards!

Slide 81

Slide 81 text

Don’t forget about input quality: Garbage in = garbage out!

Slide 82

Slide 82 text

Facebook AI Similarity Search (FAISS): Analyse page contents and automatically create redirect maps based on two (old vs new) SF crawls.

Slide 83

Slide 83 text

Automated redirect matchmaker for site migrations
Fantastic script by Daniel Emery utilising two SF crawls (origin + destination.csv with titles, metas, URLs and headings) to perform a fast semantic search (using sentence transformers) and create a redirect map.
FAISS is an outstanding library designed for the fast retrieval of nearest neighbours in high-dimensional spaces. It enables quick semantic nearest neighbour searches even on a large scale.
Sources: https://pa.ag/4bWAgxy & https://pa.ag/3USteUJ

Slide 84

Slide 84 text

Significant time savings: Not 100% perfect, but ~90% accurate/sensible matches are perfectly realistic.

Slide 85

Slide 85 text

As with most things, it can boost efficiency, but it isn't a complete replacement for a human.

Slide 86

Slide 86 text

Be smart with your redirects: put them on the edge

Slide 87

Slide 87 text

Cloudflare Workers to execute redirects on CDN/edge level: I already spoke about using CF Workers for a variety of technical SEO tasks including redirects at the SMX Advanced in Berlin back in 2021. Looking to dive deeper? Make sure to grab a copy of the deck: Source: https://pa.ag/4bSxauE Pro tip: this rarely requires dev resources; either you can do it yourself, or use sys ops (less busy)

Slide 88

Slide 88 text

Core Ingredients: Custom GPTs for ChatGPT #4

Slide 89

Slide 89 text

What are Custom GPTs (for ChatGPT)? Custom GPTs are a way to create tailored, custom versions of ChatGPT that combine instructions, extra knowledge, and any combination of skills.

Slide 90

Slide 90 text

A Custom GPT in its simplest form: Using Peak Ace’s Structured Data GPT to debug and fix errors in JSON-LD mark-up

Slide 91

Slide 91 text

Noticed how I provided no instructions to fix the JSON? You need significantly fewer instructions (per prompt), such as a specific context, as you already provided these details when you created/trained/set up the Custom GPT:

Slide 92

Slide 92 text

Greatly increase your teams’ productivity with Custom GPTs (and pre-defined workflows)

Slide 93

Slide 93 text

Making GPTs smarter with external data: A Custom GPT can also be used to fetch additional information from a third-party data-source via API:

Slide 94

Slide 94 text

So, how can you build this yourself? Here’s a quick three-step guide on how to DIY it.

Slide 95

Slide 95 text

#1 Provide basic info to get started (name, description, …): Log in to ChatGPT > choose Explore GPTs > Create (you need ChatGPT Plus). Well-defined instructions are key, think prompting.

Slide 96

Slide 96 text

#2 Create an ‘Action’ to call a 3rd party API
Head to your API provider and grab your credentials. In our case this was the API Dashboard at DataForSEO.com.
To use with an action, you need to generate a base64-encoded version of your login credentials: btoa('APIemail:APIpass')
The annoying part: you need a Schema according to the OpenAPI spec. But no one reads docs anymore – we just leverage ChatGPT to do this:
Get the OpenAPI Schema for DataForSEO: https://pa.ag/3Pa7oZ3
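If there is no browser console at hand for `btoa`, the same base64 string can be produced with a few lines of Python. This sketch assumes the credentials are sent as a standard HTTP Basic `Authorization` header; `user` and `pass` are placeholders, and the helper name is made up:

```python
import base64
from typing import Dict


def basic_auth_header(login: str, password: str) -> Dict[str, str]:
    """Base64-encode 'login:password', the same result btoa() gives for ASCII input."""
    token = base64.b64encode(f"{login}:{password}".encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {token}"}


# Placeholder credentials, replace with your own API login.
header = basic_auth_header("user", "pass")
print(header)
```

Only the encoded token goes into the action's authentication settings. Keep in mind that base64 is an encoding, not encryption, so treat the string like the password itself.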

Slide 97

Slide 97 text

#3 Test and publish your GPT. Remember: APIs usually aren’t free, so make sure you only publish your new Custom GPT for yourself!

Slide 98

Slide 98 text

Customisation for other APIs is easy (e.g., Sistrix, etc.): Just reauthenticate (base64-encoded version of your login). You also need a new schema (again based on the OpenAPI spec).

Slide 99

Slide 99 text

Did you know you can link using pre-filled prompts?
You can also link directly to pre-filled prompts and execute them. This works for both Custom GPTs and GPT-4o models. Simply add the query string (using “q=xxx”) to the end of your ChatGPT URL.
▪ For any Custom GPT add: ?q=your+prompt+goes+here
▪ For the GPT-4o model: ?model=gpt-4o&q=your+prompt
Use directly in your Chrome browser
Source: https://pa.ag/crsum
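Building these links by hand gets tedious once prompts contain special characters. A small sketch that does the URL encoding for you; the base URL `https://chatgpt.com/` is an assumption (use the URL of your own Custom GPT instead), and the helper name is made up:

```python
from typing import Optional
from urllib.parse import urlencode


def prefilled_prompt_url(base_url: str, prompt: str, model: Optional[str] = None) -> str:
    """Append the q (and optionally model) query parameters to a ChatGPT URL."""
    params = {}
    if model:
        params["model"] = model
    # urlencode turns spaces into '+', matching the pattern on the slide.
    params["q"] = prompt
    return f"{base_url}?{urlencode(params)}"


# Assumed base URL; swap in your Custom GPT's URL where needed.
plain_link = prefilled_prompt_url("https://chatgpt.com/", "your prompt goes here")
model_link = prefilled_prompt_url("https://chatgpt.com/", "your prompt", model="gpt-4o")
print(plain_link)
print(model_link)
```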

Slide 100

Slide 100 text

When to use a Custom GPT?
In addition to seamless third-party data integration, here are my top three reasons why building and using Custom GPTs is highly beneficial.
▪ Long-term context: Custom GPTs are a powerful tool to ensure that instructions remain contextualised over long periods of time.
▪ Building workflows: Custom GPTs are ideal for creating workflows for individuals who may not know how to effectively design contextual prompt sequences.
▪ Sharing instructions: For sharing the same instructions across teams, without having to worry about specifying them (or how) at the prompt level.

Slide 101

Slide 101 text

Remember my promise to give you some of your time back? How about we get rid of the annoying "copy and paste" when using ChatGPT?

Slide 102

Slide 102 text

Let’s set up Make.com to listen to ChatGPT traffic
For this to work, we’ll need a Make.com Custom Webhook, an OpenAPI spec for said Webhook and a Custom GPT which acts as the frontend to forward your data:
▪ New Scenario > Custom Webhook > Create
▪ Use OpenAI’s ActionsGPT + the prompt below:
ActionsGPT: https://pa.ag/4eS196T - or copy the spec: https://pa.ag/3NBGp7D
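To give an idea of what such a spec contains, here is a sketch that assembles a minimal OpenAPI 3.1 document for a single POST endpoint and prints it as JSON. This is not the spec behind the link above: the server URL, path, `operationId` and the `response` field are all hypothetical placeholders, and your webhook address from Make.com goes in their place:

```python
import json


def webhook_openapi_spec(server_url: str, path: str) -> dict:
    """Minimal OpenAPI 3.1 document describing one POST webhook (illustrative)."""
    return {
        "openapi": "3.1.0",
        "info": {"title": "Make.com webhook", "version": "1.0.0"},
        "servers": [{"url": server_url}],
        "paths": {
            path: {
                "post": {
                    "operationId": "sendToMake",  # hypothetical name
                    "summary": "Forward the GPT's answer to a Make.com scenario",
                    "requestBody": {
                        "required": True,
                        "content": {
                            "application/json": {
                                "schema": {
                                    "type": "object",
                                    # 'response' is a made-up field name.
                                    "properties": {"response": {"type": "string"}},
                                    "required": ["response"],
                                }
                            }
                        },
                    },
                    "responses": {"200": {"description": "Accepted"}},
                }
            }
        },
    }


# Placeholder host and path, not a real Make.com webhook.
spec = webhook_openapi_spec("https://hook.example.com", "/your-webhook-id")
print(json.dumps(spec, indent=2))
```

The printed JSON is what you would paste into the Custom GPT's action editor, after swapping in the real webhook URL.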

Slide 103

Slide 103 text

You all know the drill: setting up a new Custom GPT

Slide 104

Slide 104 text

Before we run it, let’s decide where to send the data. Let’s store it in Google Sheets and add a new row for each response that ChatGPT produces:

Slide 105

Slide 105 text

Let’s give this a try, shall we? Ask your Custom GPT anything, which will then send the data to your sheet automatically:

Slide 106

Slide 106 text

If you need to dive deeper, check the data flow in Make.com: Each module provides its own specific output including operational details and the actual data going through:

Slide 107

Slide 107 text

From now on: no more copy-pasting needed!

Slide 108

Slide 108 text

At Peak Ace, we use Custom GPTs everywhere: For individual tasks and client teams working on specific projects – all aimed at driving efficiency and streamlining processes. A few examples are listed below, though there are many more...

Slide 109

Slide 109 text

Thank you! Ciao, London.