
Everything Google Lied to Us About

Michael King

March 06, 2024

Transcript

  4. Just Create Great Content, We’ll Figure it Out

    He said it so many times, I’m not going to cite a source.
  5. Oh, Here’s a Thread Where the Ads Team

    is Asking the Search Team to Juice Ads
  6. Using Expired Domains Doesn’t Give You an Advantage

    “…you can get that domain into Google; you just won’t get credit for any pre-existing links.” -Matt Cutts

    “So if the content was gone for a couple of years, probably we need to figure out what this site is, kind of essentially starting over fresh. So from that point of view I wouldn’t expect much in terms of kind of bonus because you had content there in the past. I would really assume you’re going to have to build that up again like any other site.” -John Mueller
  7. Search Engines Work Based on the Vector Space Model

    Documents and queries are plotted in multidimensional vector space. The closer a document vector is to a query vector, the more relevant it is.
  8. TF-IDF Vectors

    The vectors in the vector space model were built from TF-IDF. These were simplistic, based on the Bag-of-Words model, and did not do much to encapsulate meaning.
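To make the Bag-of-Words limitation concrete, here is a minimal TF-IDF sketch. The tiny corpus, whitespace tokenizer, and weighting formula are simplified assumptions for illustration, not how any search engine actually implements it:

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Build bag-of-words TF-IDF vectors: word order is ignored,
    so these capture term overlap, not meaning."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({t for doc in tokenized for t in doc})
    n = len(tokenized)
    # Document frequency: how many documents contain each term
    df = {t: sum(1 for doc in tokenized if t in doc) for t in vocab}
    vectors = []
    for doc in tokenized:
        tf = Counter(doc)
        # Term frequency scaled by inverse document frequency
        vectors.append([(tf[t] / len(doc)) * math.log(n / df[t]) for t in vocab])
    return vocab, vectors

vocab, vecs = tfidf_vectors([
    "google ranks relevant content",
    "relevant content ranks well",
])
```

Note that a term appearing in every document gets an IDF of zero, so it contributes nothing; the vector only encodes which distinctive words occur, never what they mean.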
  9. Relevance is a Function of Cosine Similarity

    When we talk about relevance here, the question is determined by how similar the vectors for documents and queries are. This is a quantitative measure, not the qualitative idea of relevance we typically think of.
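The relevance calculation described above can be sketched in a few lines; the toy query and document vectors here are made up for illustration:

```python
import math

def cosine_similarity(a, b):
    """Relevance as vector geometry: 1.0 means identical direction,
    0.0 means no overlap at all."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

query_vec = [1.0, 0.0, 1.0]  # hypothetical query embedding
doc_vec = [1.0, 1.0, 1.0]    # hypothetical document embedding
score = cosine_similarity(query_vec, doc_vec)
```

Because the measure only looks at the angle between vectors, it works the same whether the dimensions are TF-IDF term weights or dense embedding dimensions.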
  10. Very, very few SEO tools are offering analysis

    that aligns with how Google works today.
  11. Semantic Search is Fueled by High-Density Embeddings

    …just like large language models. A lot of what Google has always been trying to do is now much more real.
  12. This Allows for Mathematical Operations

    Comparisons of content and keywords become linear algebraic operations.
  13. I Wrote About How You Can Get These With Screaming Frog and OpenAI

    Vectorize your content as you crawl: https://ipullrank.com/vector-embeddings-is-all-you-need
  14. I Talked About How You Use Google Sheets to Do the Analysis

    Cosine Similarity is the measure of relevance: https://ipullrank.com/cosine-similarity-knn-in-google-sheets
  15. Google Has Been Using Models It Has Been Public About Since 2020

    This is why some of the search results feel so weird: a re-ranking of documents with a mix of lexical and semantic signals. https://arxiv.org/pdf/2010.01195.pdf
  16. The way SEOs build links doesn’t work as

    well as we think it does anymore.
  17. Dense Retrieval

    You remember “passage ranking?” This is built on the concept of dense retrieval, wherein more embeddings represent more of the query and the document to uncover deeper meaning.
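A toy sketch of the dense-retrieval idea, with a made-up bag-of-words embedder standing in for a real embedding model: the document is chunked into passages, each passage gets its own vector, and the best-scoring passage represents the document.

```python
import math

def chunk_passages(text, size=3):
    """Split a document into fixed-size word windows ("passages")."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def toy_embed(text, vocab):
    """Stand-in for a real embedding model: one dimension per vocabulary term."""
    tokens = text.lower().split()
    return [tokens.count(t) for t in vocab]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def best_passage(query, document, vocab):
    """Dense-retrieval idea: score every passage against the query and
    let the best passage represent the document."""
    q = toy_embed(query, vocab)
    scored = [(cosine(q, toy_embed(p, vocab)), p) for p in chunk_passages(document)]
    return max(scored)

vocab = ["expired", "domains", "links", "content"]  # illustrative vocabulary
doc = "fresh content wins today expired domains lose their old links"
score, passage = best_passage("expired domains", doc, vocab)
```

The point is that a document no longer has one vector: a page can rank for a query because a single passage inside it is highly relevant, even if the page as a whole is not.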
  18. You need to focus on building more relevant

    links rather than higher volumes of links.
  19. All of this is a huge problem because

    SEO software still operates on the lexical model.
  20. Indexing is Also Harder

    It’s not being talked about as much, but indexing has gotten a lot harder since the Helpful Content update. You’ll see a lot more pages in the “Discovered - currently not indexed” and “Crawled - currently not indexed” statuses than you did previously because the bar is higher for what Google deems worth capturing from the web.
  21. I Believe This is a Function of Information Gain

    Conceptually, as it relates to search engines, Information Gain is the measure of how much unique information a given document adds to the ranking set of documents. In other words, what are you talking about that your competitors are not?
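One rough way to operationalize “what are you talking about that your competitors are not?” is a set difference over terms. Real information-gain scoring would work on entities and topics rather than raw tokens, so treat this as a crude proxy:

```python
def information_gain_terms(your_text, competitor_texts):
    """Crude proxy for Information Gain: terms your document adds that
    none of the already-ranking competitor documents cover."""
    yours = set(your_text.lower().split())
    covered = set()
    for text in competitor_texts:
        covered |= set(text.lower().split())
    return sorted(yours - covered)

unique_terms = information_gain_terms(
    "pricing api limits benchmarks",            # hypothetical page copy
    ["pricing overview", "api overview"],       # hypothetical competitor copy
)
```

Whatever comes back empty is a signal the page only restates what the ranking set already says, which is exactly the content the higher indexing bar filters out.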
  22. In conclusion: “More content” is no longer inherently

    the most effective approach because there’s no guarantee of traffic from Google.
  23. I’m Leaving Y’all with Four Actions Today

    1. How to Prune Your Content
    2. How to Use LLMs
    3. How to Appear in LLM-based Search Engines
    4. How to Think About Relevance
  24. Aleyda Has a Process

    Aleyda’s workflow is a great place to work through whether your content should be pruned or not. https://www.aleydasolis.com/en/crawling-mondays/how-to-prune-your-website-content-in-an-seo-process-crawlingmondays-16th-episode/
  25. Content Decay

    The web is a rapidly changing organism. Google always wants the most relevant content, with the best user experience, and the most authority. Unless you stay on top of these measures, you will see traffic fall off over time. Measuring this content decay is as simple as comparing page performance period over period in analytics or GSC. Just knowing content has decayed is not enough to be strategic.
  26. Interpreting the Content Potential Rating

    80 - 100: High Priority for Optimization
    60 - 79: Moderate Priority for Optimization
    40 - 59: Selective Optimization
    20 - 39: Low Priority for Optimization
    0 - 19: Minimal Benefit from Optimization

    If you want quick and dirty, you can prune everything below a 40 that is not driving significant traffic.
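The rating bands translate directly into a lookup, sketched below. The traffic threshold for “significant” is an assumption you would set yourself, not part of the CPR definition:

```python
def cpr_priority(score):
    """Map a Content Potential Rating (0-100) to its priority band."""
    if score >= 80:
        return "High Priority for Optimization"
    if score >= 60:
        return "Moderate Priority for Optimization"
    if score >= 40:
        return "Selective Optimization"
    if score >= 20:
        return "Low Priority for Optimization"
    return "Minimal Benefit from Optimization"

def quick_and_dirty_prune(pages, traffic_threshold=100):
    """The quick-and-dirty rule: flag everything below a 40 that is not
    driving significant traffic (threshold here is an assumption)."""
    return [url for url, cpr, monthly_clicks in pages
            if cpr < 40 and monthly_clicks < traffic_threshold]

to_prune = quick_and_dirty_prune([
    ("/old-post", 30, 10),     # low CPR, low traffic: prune candidate
    ("/evergreen", 30, 500),   # low CPR but real traffic: keep
    ("/money-page", 70, 5),    # decent CPR: optimize instead
])
```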
  27. Combining CPR with pages that lost traffic helps

    you understand whether a page is worth optimizing.
  28. Step 1: Pull the Rankings Data from Semrush

    Organic Research > Positions > Export
  29. Step 2: Pull the Decaying Content from GSC

    Google Search Console is a great source to spot Content Decay by comparing the last three months year over year. Filter for those pages where the Click Difference is negative (smaller than 0), then export.
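A minimal sketch of that filter, using a made-up export with the three-month year-over-year click columns (real GSC exports will have different column names):

```python
import csv
import io

# Hypothetical GSC comparison export: last three months vs the
# same period a year earlier, per landing page.
export = """page,clicks_last_3mo,clicks_prior_year
/a,120,200
/b,40,30
/c,300,310
"""

rows = list(csv.DictReader(io.StringIO(export)))
for row in rows:
    # Click Difference: current period minus the year-ago period
    row["click_difference"] = (
        int(row["clicks_last_3mo"]) - int(row["clicks_prior_year"])
    )

# Content decay: keep only pages whose Click Difference is negative.
decaying = [row["page"] for row in rows if row["click_difference"] < 0]
```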
  30. The Output is a List of URLs Prioritized by Action

    Each URL is marked as Keep, Revise, Kill or Review based on the keyword opportunities available and the effort required to capitalize on them. Sorting the URLs marked as “Revise” by Aggregated SV and CPR will give you the best opportunities first.
  31. Get your copy of the Content Pruning Workbook

    https://ipullrank.com/cpr-sheet
  32. How to Kill Content

    Content may be valuable for channels outside of Organic Search. So, killing it is about changing Google’s experience of your website to improve its relevance and reinforce its topical clusters. The best approach is to noindex the pages themselves, nofollow the links pointing to them, and submit an XML sitemap of all the pages that have changed. This will yield the quickest recrawling and reconsideration of the content.
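The “XML sitemap of all the pages that have changed” step can be generated with the standard library; the URL here is a placeholder:

```python
from xml.etree.ElementTree import Element, SubElement, tostring

def changed_pages_sitemap(urls):
    """Build an XML sitemap containing only the pages that changed,
    to prompt faster recrawling and reconsideration."""
    ns = "http://www.sitemaps.org/schemas/sitemap/0.9"
    urlset = Element("urlset", xmlns=ns)
    for url in urls:
        loc = SubElement(SubElement(urlset, "url"), "loc")
        loc.text = url
    return tostring(urlset, encoding="unicode")

xml = changed_pages_sitemap(["https://example.com/pruned-page"])
```

Submitting a small, change-only sitemap (rather than the full one) is what focuses crawl attention on the pages you just noindexed or revised.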
  33. How to Revise Content

    Review content across the topic cluster
    Use co-occurring keywords and entities in your content
    Add unique perspectives that can’t be found on other ranking pages
    Answer common questions
    Answer the People Also Ask questions
    Restructure your content using headings relevant to the above
    Add relevant structured markup
    Expand on previous explanations
    Add authorship
    Update the dates
    Make sure the needs of your audiences are accounted for
    Add to an XML sitemap of only updated pages
  34. How to Review Content

    The sheet marks content that has a low content potential rating and a minimum of 500 in monthly search volume as “Review” because they may be long-tail opportunities that are valuable to the business. You should take a look at the content you have for that landing page and determine if you think the effort is worthwhile.
  35. It’s Not Difficult to Build with LlamaIndex

    # Imports are assumptions for a recent llama-index (>= 0.10) plus advertools
    import advertools as adv
    from llama_index.core import VectorStoreIndex
    from llama_index.core.query_engine import CitationQueryEngine

    sitemap_url = "[SITEMAP URL]"
    sitemap = adv.sitemap_to_df(sitemap_url)
    urls_to_crawl = sitemap['loc'].tolist()
    ...
    # Make an index from your documents
    index = VectorStoreIndex.from_documents(documents)
    # Set up your index for citations
    query_engine = CitationQueryEngine.from_args(
        index,
        # indicate how many document chunks it should return
        similarity_top_k=5,
        # here we can control how granular citation sources are; the default is 512
        citation_chunk_size=155,
    )
    response = query_engine.query("YOUR PROMPT HERE")
  36. Queries are Longer and the Featured Snippet is Bigger

    1. The query is more natural language and no longer Orwellian Newspeak. It can be much longer than the 32 words it has been limited to historically.
    2. The Featured Snippet has become the “AI snapshot,” which takes 3 results and builds a summary.
    3. Users can also ask follow-up questions in conversational mode.
  37. The Search Demand Curve will Shift

    With the change in the level of natural language query that Google can support, we’re going to see a lot fewer head terms and a lot more long-tail terms.
  38. The CTR Model Will Change

    With the search results being pushed down by the AI snapshot experience, what is considered #1 will change. We should also expect that any organic result will be clicked less and standard organic CTR will drop dramatically. However, this will likely yield query displacement.
  39. Rank Tracking Will Be More Complex

    As an industry, we’ll need to decide what is considered the #1 result. Based on this screenshot, positions 1-3 are now the citations for the AI snapshot and #4 is below it. However, the AI snapshot loads on the client side, so rank tracking tools will need to change their approach.
  40. Context Windows Will Yield More Personalized Results

    SGE maintains the context window of the previous search in the journey as the user goes through predefined follow-up questions. This will need to drive the composition of pages to ensure they remain in the consideration set for subsequent results.
  41. Ranking in Search Generative Experience is

    more about relevance than the other signals.
  42. Blocking LLMs is a Mistake

    Appearing in these places will be recognized as brand awareness opportunities very soon.
  43. What is the Mitigation for AI Overviews?

    1. Manage expectations on the impact
    2. Understand the keywords under threat
    3. Re-prioritize your focus to keywords that are not under threat
    4. Optimize the passages for the keywords you want to save
  44. We Can Also Show You Per Keyword How You Show Up
  45. Scroll to Text

    You can capture the copy used to inform the AI snapshots by scraping the Scroll to Text copy from the page.
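Scroll to Text links carry the cited passage in a `#:~:text=` fragment, so the quoted copy can be recovered from the URL itself. A minimal parser, with a made-up example URL:

```python
from urllib.parse import unquote

def scroll_to_text_snippet(url):
    """Extract the passage a Scroll to Text (text fragment) URL points at,
    i.e. the copy after a #:~:text= directive."""
    marker = "#:~:text="
    if marker not in url:
        return None
    fragment = url.split(marker, 1)[1]
    # Range directives look like start,end; decode each part.
    return [unquote(part) for part in fragment.split(",")]

snippet = scroll_to_text_snippet(
    "https://example.com/guide#:~:text=expired%20domains,no%20advantage"
)
```

This ignores the optional prefix-/suffix- syntax of the text fragment spec, but it is enough to collect which passages the snapshots are citing.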
  46. There’s a Nearly Linear Relationship Between Fraggle Relevance and AI Snapshot Appearance

    (Chart: relevance of the chunks to the keyword plotted against relevance to the AI Snapshot.)
  47. Embrace Structured Data

    There are three models gaining popularity:
    1. KG-enhanced LLMs - the language model uses a KG during pre-training and inference
    2. LLM-augmented KGs - LLMs do reasoning and completion on KG data
    3. Synergized LLMs + KGs - a multilayer system using both at the same time
    Source: Unifying Large Language Models and Knowledge Graphs: A Roadmap, https://arxiv.org/pdf/2306.08302.pdf
  48. They Share Their Prompts in Their Code

    The GEO team also shared the ChatGPT prompts that help them improve their visibility. You can augment them and put them to work right away. https://github.com/GEO-optim/GEO/blob/main/src/geo_functions.py
  49. Thank You | Q&A

    Mike King, Chief Executive Officer, @iPullRank
    [email protected]
    Get Your SGE Threat Report: https://ipullrank.com/sge-report
    Play with Raggle: https://www.raggle.net
    Download the Slides: https://speakerdeck.com/ipullrank