
Everything Google Lied to Us About

Michael King

March 06, 2024

Transcript

  4. Just Create Great Content, We’ll Figure it Out

    He said it so many times, I’m not going to cite a source.
  5. Oh, Here’s a Thread Where the Ads Team

    is Asking the Search Team to Juice Ads
  6. Using Expired Domains Doesn’t Give You an Advantage

    “…you can get that domain into Google; you just won’t get credit for any pre-existing links.” -Matt Cutts

    “So if the content was gone for a couple of years, probably we need to figure out what this site is, kind of essentially starting over fresh. So from that point of view I wouldn’t expect much in terms of kind of bonus because you had content there in the past. I would really assume you’re going to have to build that up again like any other site.” -John Mueller
  7. Search Engines Work Based on the Vector Space Model

    Documents and queries are plotted in multidimensional vector space. The closer a document vector is to a query vector, the more relevant it is.
  8. TF-IDF Vectors

    The vectors in the vector space model were built from TF-IDF. These were simplistic, based on the Bag-of-Words model, and did not do much to encapsulate meaning.
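To make the Bag-of-Words limitation concrete, here is a minimal TF-IDF sketch. The tiny corpus, whitespace tokenizer, and weighting formula are simplified assumptions for illustration, not how any search engine actually implements it:

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Build bag-of-words TF-IDF vectors: word order is ignored,
    so these capture term overlap, not meaning."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({t for doc in tokenized for t in doc})
    n = len(tokenized)
    # Document frequency: how many documents contain each term
    df = {t: sum(1 for doc in tokenized if t in doc) for t in vocab}
    vectors = []
    for doc in tokenized:
        tf = Counter(doc)
        # Term frequency scaled by inverse document frequency
        vectors.append([(tf[t] / len(doc)) * math.log(n / df[t]) for t in vocab])
    return vocab, vectors

vocab, vecs = tfidf_vectors([
    "google ranks relevant content",
    "relevant content ranks well",
])
```

Note that a term appearing in every document gets an IDF of zero, so it contributes nothing; the vector only encodes which distinctive words occur, never what they mean.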
  9. Relevance is a Function of Cosine Similarity

    When we talk about relevance here, the question is determined by how similar the vectors for documents and queries are. This is a quantitative measure, not the qualitative idea of relevance we typically think of.
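The relevance calculation described above can be sketched in a few lines; the toy query and document vectors here are made up for illustration:

```python
import math

def cosine_similarity(a, b):
    """Relevance as vector geometry: 1.0 means identical direction,
    0.0 means no overlap at all."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

query_vec = [1.0, 0.0, 1.0]  # hypothetical query embedding
doc_vec = [1.0, 1.0, 1.0]    # hypothetical document embedding
score = cosine_similarity(query_vec, doc_vec)
```

Because the measure only looks at the angle between vectors, it works the same whether the dimensions are TF-IDF term weights or dense embedding dimensions.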
  10. Very, very few SEO tools are offering analysis

    that aligns with how Google works today.
  11. Semantic Search is Fueled by High-Density Embeddings

    …just like large language models. A lot of what Google has always been trying to do is now much more real.
  12. This Allows for Mathematical Operations

    Comparisons of content and keywords become linear algebraic operations.
  13. I Wrote About How You Can Get These With Screaming Frog and OpenAI

    Vectorize your content as you crawl: https://ipullrank.com/vector-embeddings-is-all-you-need
  14. I Talked About How You Use Google Sheets to Do the Analysis

    Cosine Similarity is the measure of relevance: https://ipullrank.com/cosine-similarity-knn-in-google-sheets
  15. Google Has Been Using Models It Has Been Public About Since 2020

    This is why some of the search results feel so weird: a re-ranking of documents with a mix of lexical and semantic signals. https://arxiv.org/pdf/2010.01195.pdf
  16. The way SEOs build links doesn’t work as

    well as we think it does anymore.
  17. Dense Retrieval

    You remember “passage ranking?” This is built on the concept of dense retrieval, wherein more embeddings represent more of the query and the document to uncover deeper meaning.
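A toy sketch of the dense-retrieval idea, with a made-up bag-of-words embedder standing in for a real embedding model: the document is chunked into passages, each passage gets its own vector, and the best-scoring passage represents the document.

```python
import math

def chunk_passages(text, size=3):
    """Split a document into fixed-size word windows ("passages")."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def toy_embed(text, vocab):
    """Stand-in for a real embedding model: one dimension per vocabulary term."""
    tokens = text.lower().split()
    return [tokens.count(t) for t in vocab]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def best_passage(query, document, vocab):
    """Dense-retrieval idea: score every passage against the query and
    let the best passage represent the document."""
    q = toy_embed(query, vocab)
    scored = [(cosine(q, toy_embed(p, vocab)), p) for p in chunk_passages(document)]
    return max(scored)

vocab = ["expired", "domains", "links", "content"]  # illustrative vocabulary
doc = "fresh content wins today expired domains lose their old links"
score, passage = best_passage("expired domains", doc, vocab)
```

The point is that a document no longer has one vector: a page can rank for a query because a single passage inside it is highly relevant, even if the page as a whole is not.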
  18. You need to focus on building more relevant

    links rather than higher volumes of links.
  19. All of this is a huge problem because

    SEO software still operates on the lexical model.
  20. Indexing is Also Harder

    It’s not being talked about as much, but indexing has gotten a lot harder since the Helpful Content update. You’ll see a lot more pages in the “Discovered - currently not indexed” and “Crawled - currently not indexed” statuses than you did previously because the bar is higher for what Google deems worth capturing from the web.
  21. I Believe This is a Function of Information Gain

    Conceptually, as it relates to search engines, Information Gain is the measure of how much unique information a given document adds to the ranking set of documents. In other words, what are you talking about that your competitors are not?
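One rough way to operationalize “what are you talking about that your competitors are not?” is a set difference over terms. Real information-gain scoring would work on entities and topics rather than raw tokens, so treat this as a crude proxy:

```python
def information_gain_terms(your_text, competitor_texts):
    """Crude proxy for Information Gain: terms your document adds that
    none of the already-ranking competitor documents cover."""
    yours = set(your_text.lower().split())
    covered = set()
    for text in competitor_texts:
        covered |= set(text.lower().split())
    return sorted(yours - covered)

unique_terms = information_gain_terms(
    "pricing api limits benchmarks",            # hypothetical page copy
    ["pricing overview", "api overview"],       # hypothetical competitor copy
)
```

Whatever comes back empty is a signal the page only restates what the ranking set already says, which is exactly the content the higher indexing bar filters out.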
  22. In conclusion: “More content” is no longer inherently

    the most effective approach because there’s no guarantee of traffic from Google.
  23. I’m Leaving Y’all with Four Actions Today

    1. How to Prune Your Content
    2. How to Use LLMs
    3. How to Appear in LLM-based Search Engines
    4. How to Think About Relevance
  24. Aleyda Has a Process

    Aleyda’s workflow is a great place to work through whether your content should be pruned or not. https://www.aleydasolis.com/en/crawling-mondays/how-to-prune-your-website-content-in-an-seo-process-crawlingmondays-16th-episode/
  25. Content Decay

    The web is a rapidly changing organism. Google always wants the most relevant content, with the best user experience, and the most authority. Unless you stay on top of these measures, you will see traffic fall off over time. Measuring this content decay is as simple as comparing page performance period over period in analytics or GSC. Just knowing content has decayed is not enough to be strategic.
  26. Interpreting the Content Potential Rating

    80 - 100: High Priority for Optimization
    60 - 79: Moderate Priority for Optimization
    40 - 59: Selective Optimization
    20 - 39: Low Priority for Optimization
    0 - 19: Minimal Benefit from Optimization

    If you want quick and dirty, you can prune everything below a 40 that is not driving significant traffic.
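The rating bands translate directly into a lookup, sketched below. The traffic threshold for “significant” is an assumption you would set yourself, not part of the CPR definition:

```python
def cpr_priority(score):
    """Map a Content Potential Rating (0-100) to its priority band."""
    if score >= 80:
        return "High Priority for Optimization"
    if score >= 60:
        return "Moderate Priority for Optimization"
    if score >= 40:
        return "Selective Optimization"
    if score >= 20:
        return "Low Priority for Optimization"
    return "Minimal Benefit from Optimization"

def quick_and_dirty_prune(pages, traffic_threshold=100):
    """The quick-and-dirty rule: flag everything below a 40 that is not
    driving significant traffic (threshold here is an assumption)."""
    return [url for url, cpr, monthly_clicks in pages
            if cpr < 40 and monthly_clicks < traffic_threshold]

to_prune = quick_and_dirty_prune([
    ("/old-post", 30, 10),     # low CPR, low traffic: prune candidate
    ("/evergreen", 30, 500),   # low CPR but real traffic: keep
    ("/money-page", 70, 5),    # decent CPR: optimize instead
])
```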
  27. Combining CPR with pages that lost traffic helps

    you understand whether a page is worth optimizing.
  28. Step 1: Pull the Rankings Data from Semrush

    Organic Research > Positions > Export
  29. Step 2: Pull the Decaying Content from GSC

    Google Search Console is a great source to spot Content Decay by comparing the last three months year over year. Filter for those pages where the Click Difference is negative (smaller than 0), then export.
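A minimal sketch of that filter, using a made-up export with the three-month year-over-year click columns (real GSC exports will have different column names):

```python
import csv
import io

# Hypothetical GSC comparison export: last three months vs the
# same period a year earlier, per landing page.
export = """page,clicks_last_3mo,clicks_prior_year
/a,120,200
/b,40,30
/c,300,310
"""

rows = list(csv.DictReader(io.StringIO(export)))
for row in rows:
    # Click Difference: current period minus the year-ago period
    row["click_difference"] = (
        int(row["clicks_last_3mo"]) - int(row["clicks_prior_year"])
    )

# Content decay: keep only pages whose Click Difference is negative.
decaying = [row["page"] for row in rows if row["click_difference"] < 0]
```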
  30. The Output is a List of URLs Prioritized by Action

    Each URL is marked as Keep, Revise, Kill or Review based on the keyword opportunities available and the effort required to capitalize on them. Sorting the URLs marked as “Revise” by Aggregated SV and CPR will give you the best opportunities first.
  31. Get your copy of the Content Pruning Workbook

    https://ipullrank.com/cpr-sheet
  32. How to Kill Content

    Content may be valuable for channels outside of Organic Search. So, killing it is about changing Google’s experience of your website to improve its relevance and reinforce its topical clusters. The best approach is to noindex the pages themselves, nofollow the links pointing to them, and submit an XML sitemap of all the pages that have changed. This will yield the quickest recrawling and reconsideration of the content.
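The “XML sitemap of all the pages that have changed” step can be generated with the standard library; the URL here is a placeholder:

```python
from xml.etree.ElementTree import Element, SubElement, tostring

def changed_pages_sitemap(urls):
    """Build an XML sitemap containing only the pages that changed,
    to prompt faster recrawling and reconsideration."""
    ns = "http://www.sitemaps.org/schemas/sitemap/0.9"
    urlset = Element("urlset", xmlns=ns)
    for url in urls:
        loc = SubElement(SubElement(urlset, "url"), "loc")
        loc.text = url
    return tostring(urlset, encoding="unicode")

xml = changed_pages_sitemap(["https://example.com/pruned-page"])
```

Submitting a small, change-only sitemap (rather than the full one) is what focuses crawl attention on the pages you just noindexed or revised.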
  33. How to Revise Content

    Review content across the topic cluster
    Use co-occurring keywords and entities in your content
    Add unique perspectives that can’t be found on other ranking pages
    Answer common questions
    Answer the People Also Ask questions
    Restructure your content using headings relevant to the above
    Add relevant structured markup
    Expand on previous explanations
    Add authorship
    Update the dates
    Make sure the needs of your audiences are accounted for
    Add to an XML sitemap of only updated pages
  34. How to Review Content

    The sheet marks content that has a low content potential rating and a minimum of 500 in monthly search volume as “Review” because they may be long-tail opportunities that are valuable to the business. You should take a look at the content you have for that landing page and determine if you think the effort is worthwhile.
  35. It’s Not Difficult to Build with LlamaIndex

    # Imports are assumptions for a recent llama-index (>= 0.10) plus advertools
    import advertools as adv
    from llama_index.core import VectorStoreIndex
    from llama_index.core.query_engine import CitationQueryEngine

    sitemap_url = "[SITEMAP URL]"
    sitemap = adv.sitemap_to_df(sitemap_url)
    urls_to_crawl = sitemap['loc'].tolist()
    ...
    # Make an index from your documents
    index = VectorStoreIndex.from_documents(documents)
    # Set up your index for citations
    query_engine = CitationQueryEngine.from_args(
        index,
        # indicate how many document chunks it should return
        similarity_top_k=5,
        # here we can control how granular citation sources are; the default is 512
        citation_chunk_size=155,
    )
    response = query_engine.query("YOUR PROMPT HERE")
  36. Queries are Longer and the Featured Snippet is Bigger

    1. The query is more natural language and no longer Orwellian Newspeak. It can be much longer than the 32 words it has been limited to historically.
    2. The Featured Snippet has become the “AI snapshot,” which takes 3 results and builds a summary.
    3. Users can also ask follow-up questions in conversational mode.
  37. The Search Demand Curve will Shift

    With the change in the level of natural language query that Google can support, we’re going to see a lot fewer head terms and a lot more long-tail terms.
  38. The CTR Model Will Change

    With the search results being pushed down by the AI snapshot experience, what is considered #1 will change. We should also expect that any organic result will be clicked less and standard organic CTR will drop dramatically. However, this will likely yield query displacement.
  39. Rank Tracking Will Be More Complex

    As an industry, we’ll need to decide what is considered the #1 result. Based on this screenshot, positions 1-3 are now the citations for the AI snapshot and #4 is below it. However, the AI snapshot loads on the client side, so rank tracking tools will need to change their approach.
  40. Context Windows Will Yield More Personalized Results

    SGE maintains the context window of the previous search in the journey as the user goes through predefined follow-up questions. This will need to drive the composition of pages to ensure they remain in the consideration set for subsequent results.
  41. Ranking in Search Generative Experience is

    more about relevance than the other signals.
  42. Blocking LLMs is a Mistake

    Appearing in these places will be recognized as brand awareness opportunities very soon.
  43. What is the Mitigation for AI Overviews?

    1. Manage expectations on the impact
    2. Understand the keywords under threat
    3. Re-prioritize your focus to keywords that are not under threat
    4. Optimize the passages for the keywords you want to save
  44. We Can Also Show You Per Keyword How You Show Up
  45. Scroll to Text

    You can capture the copy used to inform the AI snapshots by scraping the Scroll to Text copy from the page.
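Scroll to Text links carry the cited passage in a `#:~:text=` fragment, so the quoted copy can be recovered from the URL itself. A minimal parser, with a made-up example URL:

```python
from urllib.parse import unquote

def scroll_to_text_snippet(url):
    """Extract the passage a Scroll to Text (text fragment) URL points at,
    i.e. the copy after a #:~:text= directive."""
    marker = "#:~:text="
    if marker not in url:
        return None
    fragment = url.split(marker, 1)[1]
    # Range directives look like start,end; decode each part.
    return [unquote(part) for part in fragment.split(",")]

snippet = scroll_to_text_snippet(
    "https://example.com/guide#:~:text=expired%20domains,no%20advantage"
)
```

This ignores the optional prefix-/suffix- syntax of the text fragment spec, but it is enough to collect which passages the snapshots are citing.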
  46. There’s a Nearly Linear Relationship Between Fraggle Relevance and AI Snapshot Appearance

    (Chart: relevance of the chunks to the keyword plotted against relevance to the AI Snapshot.)
  47. Embrace Structured Data

    There are three models gaining popularity:
    1. KG-enhanced LLMs - the language model uses a KG during pre-training and inference
    2. LLM-augmented KGs - LLMs do reasoning and completion on KG data
    3. Synergized LLMs + KGs - a multilayer system using both at the same time
    Source: Unifying Large Language Models and Knowledge Graphs: A Roadmap, https://arxiv.org/pdf/2306.08302.pdf
  48. They Share Their Prompts in Their Code

    The GEO team also shared the ChatGPT prompts that help them improve their visibility. You can augment them and put them to work right away. https://github.com/GEO-optim/GEO/blob/main/src/geo_functions.py
  49. Thank You | Q&A

    Mike King, Chief Executive Officer, @iPullRank
    [email protected]
    Get Your SGE Threat Report: https://ipullrank.com/sge-report
    Play with Raggle: https://www.raggle.net
    Download the Slides: https://speakerdeck.com/ipullrank