My talk from SMX London 2024 titled "Navigating the AI Revolution" puts an emphasis on automating and scaling real-world (technical) SEO problems and tasks. Enjoy!
give an immediate sense of what generative AI is currently being used for (Source: https://pa.ag/4brddtI):
• Technical Assistance & Troubleshooting (23%)
• Content Creation & Editing (22%)
• Personal & Professional Support (17%)
• Learning & Education (15%)
• Creativity & Recreation (13%)
• Research, Analysis & Decision Making (10%)
What are LLMs good at?
▪ Data Analysis & Summarisation: LLMs can process large volumes of text data to extract relevant information, summarise key points, and answer questions, making them valuable for research, data analysis, and decision-making support.
▪ Personalised Recommendations: LLMs can analyse user preferences and behaviour to provide personalised recommendations, such as articles or products, thus enhancing UX and engagement.
▪ Natural Language Processing: LLMs excel at understanding language, making them ideal for applications such as chatbots, language translation, sentiment analysis, and text summarisation.
What are LLMs NOT good at?
▪ Knowledge Beyond Training Data: LLMs don't perform well in situations requiring an understanding of context or knowledge beyond their original training data set.
▪ Making Ethical or Moral Judgments: LLMs lack the ability to make ethical or moral judgments and should not be used in situations where such considerations are crucial. Most LLMs' decisions are also biased.
▪ Limited Understanding and Reasoning: LLMs can't form a chain of logical conclusions; instead, they follow probability rules. Even if the most common answer to a question is irrational or outright wrong, the model will still provide it.
to be managed (Obviously, this is true for both commercial and open-source models). Source: https://pa.ag/3Td5ucz
▪ Consent: Ensuring training data is gathered responsibly, in compliance with AI governance and regulations.
▪ Security: Security risks include data leaks or the malicious use of LLMs by criminals.
▪ Bias: Happens when the data source is not diverse or representative enough.
▪ Hallucinations: When a model confidently generates plausible-sounding but false or fabricated output.
mismatch between the technology and the proposed use cases," says Emily Bender, professor in the Department of Linguistics and director of the Computational Linguistics Laboratory at the University of Washington. Source: https://pa.ag/3PqP0Mh LLMs are designed to predict the next word – of course there will be cases where the model is wrong.
LLMs don’t “write” anything. They generate text based on probabilities encoded in their parameters during training, drawing on content they've encountered before.
and drop apps to automate existing workflows or build new, complex processes to save time: e.g., create short-form social media copy from your WP posts using ChatGPT, then send it to LinkedIn, FB, etc. Source: https://pa.ag/3Usq4oI
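If you'd rather script this flow than click it together, a minimal sketch might look like the following (the WordPress URL, post ID, and model name are placeholder assumptions; posting to LinkedIn/FB would still go through their respective APIs or your scheduler):

```python
import requests
from openai import OpenAI

# Hypothetical WordPress site and post ID - the WP REST API exposes posts at /wp-json/wp/v2/posts/<id>
post = requests.get("https://example.com/wp-json/wp/v2/posts/123").json()

client = OpenAI()  # expects OPENAI_API_KEY in your environment
completion = client.chat.completions.create(
    model="gpt-4o",  # example model name
    messages=[
        {"role": "system", "content": "You write short-form social media copy."},
        {
            "role": "user",
            "content": f"Write a LinkedIn teaser for this post:\n\n"
                       f"{post['title']['rendered']}\n\n{post['excerpt']['rendered']}",
        },
    ],
)
print(completion.choices[0].message.content)  # hand this off to LinkedIn, FB, etc.
```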
Export to Excel (or Sheets) and filter out irrelevant links (e.g., to 404s, noindex, etc.). Once done, feed the data into Gephi and watch the magic unfold:
Gephi is a powerful tool that transforms complex data into dynamic visual networks, making hidden connections instantly visible.
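If you'd rather pre-filter programmatically than in Excel/Sheets, a small pandas sketch could handle it (column names assume a Screaming Frog "All Outlinks" export; adjust to whatever your crawler produces):

```python
import pandas as pd

# Assumed columns from a Screaming Frog "All Outlinks" export: Source, Destination, Status Code
df = pd.read_csv("all_outlinks.csv")

# Drop links to broken targets (404s etc.); if your export carries the target's
# indexability, filter out noindex destinations here as well
df = df[df["Status Code"] == 200]

# Gephi's import wizard expects an edge list with Source/Target columns
edges = df.rename(columns={"Destination": "Target"})[["Source", "Target"]]
edges.to_csv("edges.csv", index=False)
```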
insights by visualising your website's internal linking, hierarchy, and topical structure, automatically calculating PageRank and Modularity metrics (Source: Gephi).
Gephi in a nutshell:
▪ Turn raw data into actionable insights with just a few clicks
▪ Uncover internal linking patterns and relationships with Gephi's visual maps
▪ Pinpoint key nodes and clusters using PageRank and Modularity classes
Structure of most "regular" websites:
brief overview of key concepts and their significance: while PageRank focuses on the importance of individual nodes (pages), Modularity helps identify groups of interconnected nodes (communities).
PageRank
▪ PageRank (in this case) is calculated within a single website
▪ The calculation is based on how pages are linked to each other
▪ Pages with many incoming links from other authoritative pages have a higher PageRank
Modularity
▪ Modularity measures how well a network decomposes into modular communities
▪ In the context of website analysis, communities represent groups of closely related pages within a site
▪ The objective is to minimise the number of clusters while ensuring that each cluster contains only thematically related pages
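To make both metrics concrete, here is a minimal sketch of how you could compute them outside of Gephi with networkx (the edge list file is the placeholder from above; networkx's greedy modularity algorithm merely stands in for Gephi's Louvain-based implementation):

```python
import pandas as pd
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

# Placeholder edge list with Source/Target columns
edges = pd.read_csv("edges.csv")
G = nx.from_pandas_edgelist(edges, source="Source", target="Target", create_using=nx.DiGraph)

# PageRank: importance of individual pages, based on internal links only
pagerank = nx.pagerank(G)
for url, score in sorted(pagerank.items(), key=lambda kv: kv[1], reverse=True)[:10]:
    print(f"{url}: {score:.5f}")

# Modularity communities: groups of closely related pages
# (the greedy algorithm works on undirected graphs, hence the conversion)
for i, community in enumerate(greedy_modularity_communities(G.to_undirected())):
    print(f"Cluster {i}: {len(community)} pages")
```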
script lets you test optimisations by calculating PageRank for different scenarios, showing URL relevance changes before implementation. Script: https://pa.ag/4e28PTG

URL                  | Internal PageRank | New Internal PageRank (index pages only) | Change in %
/kontakt/            | 0.01653           | 0.00076                                  | -95.4%
/presse/             | 0.00779           | -                                        | -100%
/auszeichnungen/     | 0.00582           | -                                        | -100%
/einlagensicherung/  | 0.00552           | 0.00797                                  | +44.3%
/depot/              | 0.00531           | 0.00797                                  | +50.0%
/fondsuebersicht/    | 0.00528           | 0.00795                                  | +50.5%
/etf/                | 0.00517           | 0.00797                                  | +54.1%
/aktien/             | 0.00511           | 0.00795                                  | +55.5%
/steuern/            | 0.00501           | 0.00915                                  | +82.6%

The main difference is that the script focuses on relevance and PageRank, while Gephi also accounts for thematic relationships through Modularity.
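The linked script is the authoritative version; purely to illustrate the underlying what-if idea, a scenario recalculation could look like this (the list of non-index pages is a placeholder):

```python
import pandas as pd
import networkx as nx

edges = pd.read_csv("edges.csv")  # placeholder edge list with Source/Target columns
G = nx.from_pandas_edgelist(edges, source="Source", target="Target", create_using=nx.DiGraph)
before = nx.pagerank(G)

# Scenario: remove non-index pages (placeholder list) from the link graph entirely
H = G.copy()
H.remove_nodes_from(["/kontakt/", "/presse/", "/auszeichnungen/"])
after = nx.pagerank(H)

# Compare internal PageRank before vs. after, mirroring the table above
for url in sorted(after, key=after.get, reverse=True):
    change = (after[url] - before[url]) / before[url] * 100
    print(f"{url}: {before[url]:.5f} -> {after[url]:.5f} ({change:+.1f}%)")
```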
and Modularity data, provide ChatGPT with insights on which landing pages need better linking, and let AI recalculate for optimised results, streamlining the process for efficiency and speed.
Export data from Gephi (PageRank, Modularity) → ChatGPT optimises the data based on your requirements → Visualise the new data
next step, provide corresponding instructions to ChatGPT: which pages or areas should be focused on? Which should be removed? Where should interlinking be strengthened? In our case: remove certain sections from linking entirely, include specific pages we consider important, and provide focus areas or core topics for more prominent overall linking.
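Scripted instead of pasted into the ChatGPT UI, that step might look roughly like this (file name, model name, and the instruction text are placeholders mirroring the requirements above):

```python
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in your environment

# Placeholder: the PageRank/Modularity export from Gephi
gephi_data = open("gephi_export.csv").read()

# Placeholder instructions, mirroring the requirements described above
instructions = (
    "Optimise this internal linking data: remove the /presse/ section from "
    "linking entirely, keep /etf/ and /aktien/ prominently linked, and "
    "cluster links around our core investment topics."
)

response = client.chat.completions.create(
    model="gpt-4o",  # example model name
    messages=[{"role": "user", "content": f"{instructions}\n\n{gephi_data}"}],
)
print(response.choices[0].message.content)
```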
as new PageRank and Modularity calculations reveal the optimised internal linking structure and significantly improved topical clustering between pages:
results: by leveraging AI (ChatGPT) and smart automation, this process becomes faster, more efficient, and more accurate. Key benefits include:
▪ Incredibly easy to use, suitable for junior- and mid-level SEOs
▪ Data-driven and far more accurate than "traditional guesswork"
▪ Significantly faster than manual methods, freeing up valuable time for other tasks
▪ Manually evaluating topical clusters in internal linking is nearly impossible, especially on large-scale websites
Sistrix visibility development after deployment:
steps for better automated redirects (and an improved customer journey):
Extract the main content of every (old) site/URL → generate embeddings → save them together with metadata in a vector database → run a semantic search in the vector DB based on the embeddings of the old URLs
▪ Extract: <title> + main content
▪ Combine: <title>, <h1>, <h2>s, and the first & last sentence of each <p>
▪ Content = Title + h1 + h2s + …
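A minimal sketch of that extract-and-combine step, plus the embedding call (BeautifulSoup-based; the sentence splitting is deliberately naive, and the URL and embedding model name are example assumptions):

```python
import requests
from bs4 import BeautifulSoup
from openai import OpenAI

def page_to_text(url: str) -> str:
    soup = BeautifulSoup(requests.get(url).text, "html.parser")
    parts = [soup.title.get_text(strip=True) if soup.title else ""]
    parts += [h.get_text(" ", strip=True) for h in soup.find_all(["h1", "h2"])]
    for p in soup.find_all("p"):
        sentences = p.get_text(" ", strip=True).split(". ")  # naive sentence split
        parts += [sentences[0], sentences[-1]]                # first & last sentence
    return " ".join(filter(None, parts))

client = OpenAI()  # expects OPENAI_API_KEY in your environment
text = page_to_text("https://example.com/old-page/")  # placeholder URL
embedding = client.embeddings.create(
    model="text-embedding-3-small",  # example model
    input=text,
).data[0].embedding  # store this vector plus metadata in your vector DB
```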
match For every outdated page:
▪ Vector-based semantic search for the kNN (k-nearest neighbour)
▪ Set a 301 to the nearest-neighbour URL
▪ No more weak redirects
▪ Play with the certainty/temperature settings

```graphql
{
  Get {
    Article(
      nearVector: { vector: [embedding], certainty: 0.8 }
      limit: 1
    ) {
      url
    }
  }
}
```
out of the box. In my defence, I still believe it's crucial to understand what's happening "behind the scenes"… Source: https://pa.ag/40fdlu2
Turn on JS rendering, then head to Configuration > Custom > Custom JavaScript > Add from Library > (ChatGPT) Extract embeddings […] > click on "JS" to open the code and add your OpenAI key:
Levenshtein distance (basic fuzzy matching) provides an alternative, as we're mainly dealing with small text snippets and minimal deviations between URL versions. Source: https://pa.ag/49RHG3y
The more substantial the changes between two versions, the higher the likelihood that you'll reap significant benefits from leveraging sentence transformers. h/t Will Nye for the data set
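A quick baseline sketch using Python's built-in difflib (standing in for a dedicated Levenshtein library; the URL slugs are made-up placeholders):

```python
from difflib import SequenceMatcher

# Placeholder slugs from the old and new site versions
old_urls = ["/produkte/rote-schuhe/", "/ueber-uns/team/"]
new_urls = ["/produkte/schuhe-rot/", "/unternehmen/team/", "/kontakt/"]

def best_match(old: str, candidates: list[str]) -> tuple[str, float]:
    # Pick the candidate with the highest similarity ratio (0..1)
    scored = [(new, SequenceMatcher(None, old, new).ratio()) for new in candidates]
    return max(scored, key=lambda pair: pair[1])

for old in old_urls:
    target, score = best_match(old, new_urls)
    print(f"301: {old} -> {target} (similarity {score:.2f})")
```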
by Daniel Emery utilising two SF crawls (origin + destination.csv with titles, metas, URLs and headings) to perform a fast semantic search (using sentence transformers) and create a redirect map. Sources: https://pa.ag/4bWAgxy & https://pa.ag/3USteUJ
FAISS is an outstanding library designed for the fast retrieval of nearest neighbours in high-dimensional spaces. It enables quick semantic nearest neighbour searches even on a large scale.
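Daniel's script is linked above; reduced to its core idea, the FAISS + sentence transformers part looks roughly like this (file and column names are placeholder assumptions; in practice you'd embed the combined title/meta/headings text):

```python
import faiss
import numpy as np
import pandas as pd
from sentence_transformers import SentenceTransformer

origin = pd.read_csv("origin.csv")            # old URLs with titles, metas, headings
destination = pd.read_csv("destination.csv")  # new URLs with the same fields

model = SentenceTransformer("all-MiniLM-L6-v2")  # example model

# Embed the destination pages and index them (normalised embeddings +
# inner product = cosine similarity)
dest_vecs = model.encode(destination["Title"].tolist(), normalize_embeddings=True)
index = faiss.IndexFlatIP(dest_vecs.shape[1])
index.add(np.asarray(dest_vecs, dtype="float32"))

# Retrieve the semantically nearest destination for every origin URL
origin_vecs = model.encode(origin["Title"].tolist(), normalize_embeddings=True)
scores, ids = index.search(np.asarray(origin_vecs, dtype="float32"), 1)

redirect_map = pd.DataFrame({
    "old_url": origin["Address"],
    "new_url": destination["Address"].iloc[ids[:, 0]].values,
    "similarity": scores[:, 0],
})
redirect_map.to_csv("redirect_map.csv", index=False)
```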
I already spoke about using CF Workers for a variety of technical SEO tasks, including redirects, at SMX Advanced in Berlin back in 2021. Looking to dive deeper? Make sure to grab a copy of the deck: Source: https://pa.ag/4bSxauE
Pro tip: this rarely requires dev resources; either do it yourself, or ask sys ops (usually less busy).
the JSON? You need significantly fewer instructions per prompt (e.g., specifying context), as you already provided these details when you created/trained/set up the Custom GPT:
party API: head to your API provider and grab your credentials. In our case this was the API Dashboard at DataForSEO.com. Get the OpenAPI schema for DataForSEO: https://pa.ag/3Pa7oZ3
To use it with an action, you need to generate a base64-encoded version of your login credentials: btoa('APIemail:APIpass')
The annoying part: you need a schema that follows the OpenAPI spec. But no one reads docs anymore, so we just get ChatGPT to write it:
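The btoa() call above is the browser route; the Python equivalent, plus how the resulting token is typically sent as an HTTP Basic auth header (credentials and endpoint are placeholders/examples):

```python
import base64
import requests

# Placeholder credentials - the same pairing as btoa('APIemail:APIpass')
token = base64.b64encode(b"APIemail:APIpass").decode()

response = requests.post(
    "https://api.dataforseo.com/v3/serp/google/organic/live/regular",  # example endpoint
    headers={"Authorization": f"Basic {token}"},
    json=[{"keyword": "technical seo", "location_name": "United Kingdom"}],
)
print(response.status_code)
```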
prompts? You can also link directly to pre-filled prompts and execute them. This works for both Custom GPTs and GPT-4o models. Simply add the query string (using "q=xxx") to the end of your ChatGPT URL. Source: https://pa.ag/crsum
For any Custom GPT, add: ?q=your+prompt+goes+here
For the GPT-4o model: ?model=gpt-4o&q=your+prompt
Use directly in your Chrome browser
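To generate such links programmatically, a tiny sketch (the base URL reflects ChatGPT's web app; quote_plus produces the +-encoded spaces shown above):

```python
from urllib.parse import quote_plus

prompt = "Summarise the key technical SEO issues on example.com"  # placeholder prompt
url = f"https://chatgpt.com/?model=gpt-4o&q={quote_plus(prompt)}"
print(url)  # open in your browser to execute the pre-filled prompt
```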
Custom GPTs are a powerful tool to ensure that instructions remain contextualised over long periods of time. In addition to seamless third-party data integration, here are my top three reasons why building and using Custom GPTs is highly beneficial:
▪ Building workflows: Custom GPTs are ideal for creating workflows for individuals who may not know how to effectively design contextual prompt sequences.
▪ Sharing instructions: share the same instructions across teams, without having to worry about specifying them (or how) at the prompt level.
traffic For this to work, we'll need a make.com Custom Webhook, an OpenAPI spec for said webhook, and a Custom GPT which acts as the frontend to forward your data.
ActionsGPT: https://pa.ag/4eS196T - or copy the spec: https://pa.ag/3NBGp7D
New Scenario > Custom Webhook > Create
Use OpenAI's ActionsGPT + the prompt below:
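Once the webhook exists, you can sanity-check it before wiring up the Custom GPT; a minimal sketch (the webhook URL and payload fields are placeholders, since make.com generates a unique URL per webhook):

```python
import requests

# Placeholder: make.com issues a unique URL when you create the Custom Webhook
WEBHOOK_URL = "https://hook.eu1.make.com/your-unique-webhook-id"

# Placeholder payload: whatever fields your Custom GPT action forwards
payload = {"url": "https://example.com/page/", "task": "analyse traffic"}

response = requests.post(WEBHOOK_URL, json=payload)
print(response.status_code, response.text)  # make.com typically answers "Accepted"
```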
For individual tasks and client teams working on specific projects – all aimed at driving efficiency and streamlining processes. A few examples are listed below, though there are many more...