All You Can EEAT: Navigating SEO in a Generative AI World

Michael King

September 28, 2023

Transcript

  4. I don’t think you came here for me to tell you to make great content and use real authors.
  5. You Can Google EEAT Best Practices (or just look at TheZebra.com)

    Write high quality content on subjects you actually know things about. Have an author bio and page with links out to other places you write about similar subjects and links to your social media. Make sure those sources highlight your expertise. Who you write for, where you studied, books you’ve written, etc. Get links from and give links to authoritative sources and people on similar subjects. Make sure the user experience on the sites you write on is great too.
  6. I think you came here because you want to know the nature of the threats ahead.
  7. This is from a post from the Google Cloud team discussing how their Search product works, although Google has been telling on itself for years.
  8. These threats will impact you as you look to do content marketing and SEO
  9. The TikTok Threat will Mean More Visual Content Ranking

    To compete with the visual content channels, Google is surfacing more visual content in the SERPs and adding more features that allow users to get exactly where they want to go. This will threaten standard Organic positions for web content.
  10. Short Form Video is About to Get More Competitive

    The video and image real estate in Google is going to become even more competitive since marketers recognize short form video as high ROI and the primary way to reach Gen Z.
  11. Publish Your Short Form Video on your site

    A primary mistake that content marketers make is only publishing their short form videos on a channel like TikTok, Instagram, or YouTube. You should also publish them on your site using tools like Wistia and marking them up so they can appear in the SERPs.
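
The "marking them up" step usually means schema.org VideoObject structured data embedded in the page. The sketch below builds that JSON-LD in Python. The video title, URLs, date, and duration are made-up placeholders, and the set of properties shown is a reasonable starting point rather than a confirmed list of what Google requires.

```python
import json


def video_jsonld(name, description, thumbnail_url, upload_date, content_url, duration):
    """Build schema.org VideoObject markup as a JSON-LD string."""
    data = {
        "@context": "https://schema.org",
        "@type": "VideoObject",
        "name": name,
        "description": description,
        "thumbnailUrl": [thumbnail_url],
        "uploadDate": upload_date,   # ISO 8601 date
        "contentUrl": content_url,   # direct URL of the video file
        "duration": duration,        # ISO 8601 duration, e.g. PT45S = 45 seconds
    }
    return json.dumps(data, indent=2)


# Hypothetical example values; replace with your own video's details.
markup = video_jsonld(
    name="How to tie a bowline in 45 seconds",
    description="A short demonstration of the bowline knot.",
    thumbnail_url="https://www.example.com/thumbs/bowline.jpg",
    upload_date="2023-09-28",
    content_url="https://www.example.com/videos/bowline.mp4",
    duration="PT45S",
)

# Wrap the JSON-LD in the script tag that goes into the page's HTML.
script_tag = f'<script type="application/ld+json">\n{markup}\n</script>'
print(script_tag)
```

The printed script tag goes on the page that hosts the video, alongside the embedded player.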
  12. Ad Sales Being Down Means more Ads

    What’s up with all this whitespace? What’s up with this featured snippet?
  13. The real estate will get smaller, so your content must be that much more effective when it shows up in the SERPs.
  14. The Last Time People Said Search Quality Was Bad we Got Panda and Penguin

    Panda fundamentally changed Organic Search. You could no longer create “SEO content” and rank. The SEO community then embraced content marketing knowing that it’s the only way forward with creating content that yields utility. Penguin did the same for links. Google’s Helpful Content update could be the new sheriff in town.
  15. If your prompt is just one sentence, don’t be surprised when you get garbage back.
  16. Now we have AutoGPT that can do a series of tasks without prompts.
  17. Doug Kessler Warned us Back in 2013

    Marketers are about to ramp up the content marketing deluge. https://blog.hubspot.com/blog/tabid/6307/bid/34080/Why-Marketers-Need-to-Rise-Above-the-Deluge-of-Crappy-Content.aspx
  18. OpenAI Can’t Even Reliably Detect It

    Sure, there are a variety of tools out there that “detect” generative AI content. However, they are all unreliable in that they can yield both false negatives and false positives. Even OpenAI, the people who built the best generative AI tools, could only correctly flag 26% of AI-written text with their own classifier.
  19. There are Reports that Some Sites Using Generative AI Have Been Crushed

    These are sites that don’t edit the content prior to publishing, so they deserve it.
  20. The Helpful Content Update is Finally Showing its Teeth

    Google has been working on getting the Helpful Content classifier right. The early iterations had limited impact, but now sites are getting smacked left and right. We’re also seeing the threshold for crawling and indexing pages has been raised.
  21. I Believe This is a Function of Information Gain

    Conceptually, as it relates to search engines, Information Gain is the measure of how much unique information a given document adds to the ranking set of documents. In other words, what are you talking about that your competitors are not?
  22. Google’s Information Gain Patent

    Google’s patent indicates that they are specifically scoring for documents that feature net new information over other documents on the same topic.
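
To make the idea of "net new information" concrete, here is a deliberately crude sketch: it scores a candidate document by the share of its terms that do not appear anywhere in the documents already ranking. This is an illustrative proxy only, not the method described in Google's patent, and the function names and sample texts are invented for the example.

```python
import re


def terms(text):
    """Lowercase the text and return its set of alphanumeric terms."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def novelty_score(candidate, ranking_set):
    """Share of the candidate's terms that no document in ranking_set uses."""
    seen = set()
    for doc in ranking_set:
        seen |= terms(doc)
    candidate_terms = terms(candidate)
    if not candidate_terms:
        return 0.0
    return len(candidate_terms - seen) / len(candidate_terms)


# Made-up documents standing in for what already ranks on a topic.
ranking_set = [
    "best running shoes for flat feet",
    "running shoes for flat feet reviewed",
]
copycat = "best running shoes for flat feet reviewed"
original = "running shoes for flat feet: lab test of midsole wear"

print(novelty_score(copycat, ranking_set))   # every term already covered
print(novelty_score(original, ranking_set))  # half of the terms are new
```

A real system would compare meaning rather than raw terms, but the question it asks is the same: what does this page say that the others do not?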
  23. So Many People Are Just Creating Copycat Content

    WHAT GENERATIVE AI MEANS FOR GOOGLE SEARCH
  24. If you want to survive what’s coming, you’ll need to deliver stronger content than everyone else.
  25. How When Authorship Markup Tops out at 3%?

    And this markup does not always specify the author!
  26. How I Actually Started to Believe in E-TEA (or How our Understanding of Search is out of date)
  27. At a Base Level, This is What all Search Engines Do

    Fundamentally, this is the basis of how search engines function. Google has developed many layers on top of this, but this is the core of what they all do.
  28. We know this, but there is a single set of innovations that sped Google past the SEO community.
  29. Lexical Search vs Semantic Search are the Two Primary Search Models

    What we as the SEO community do not have a strong enough handle on is that most of what Google’s doing is on the semantic side and that has all improved dramatically over the last 10 years based on machine learning.
  30. Vector Space Model

    Documents and queries are plotted in multidimensional vector space. The closer a document vector is to a query vector, the more relevant it is.
  31. This Allows for Mathematical Operations

    Comparisons of content and keywords become linear algebraic operations.
  32. Relevance is a Function of Cosine Similarity

    When we talk about relevance, the question is how similar the document and query vectors are. This is a quantitative measure, not the qualitative idea of how we typically think of relevance.
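
Cosine similarity is the dot product of two vectors divided by the product of their lengths, so it measures the angle between them and ignores their magnitude. A minimal sketch with made-up three-dimensional vectors (real embeddings have hundreds of dimensions):

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_a * norm_b)


# Toy vectors, invented for illustration.
query = [1.0, 1.0, 0.0]
doc_a = [2.0, 2.0, 0.0]  # same direction as the query, just longer
doc_b = [0.0, 0.0, 3.0]  # shares no dimension with the query

print(cosine_similarity(query, doc_a))  # close to 1: highly relevant
print(cosine_similarity(query, doc_b))  # 0: not relevant
```

Note that doc_a scores as a near-perfect match even though it is twice as long as the query vector. Only direction matters.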
  33. TF-IDF Vectors

    The vectors in the vector space model were built from TF-IDF. These were simplistic based on the Bag-of-Words model and they did not do much to encapsulate meaning.
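
A small sketch shows why Bag-of-Words vectors miss meaning. It uses one common TF-IDF variant (term frequency normalized by document length, times the log of inverse document frequency); production systems use different weightings, and the three sample sentences are invented.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocabulary, vectors) with one TF-IDF vector per document."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({term for tokens in tokenized for term in tokens})
    n_docs = len(docs)
    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        length = len(tokens) or 1  # guard against empty documents
        vectors.append(
            [(counts[term] / length) * math.log(n_docs / doc_freq[term]) for term in vocab]
        )
    return vocab, vectors


docs = ["dog bites man", "man bites dog", "dog eats food"]
vocab, vectors = tfidf_vectors(docs)

print(vocab)
# Word order is discarded, so two sentences with opposite meanings
# end up with exactly the same vector.
print(vectors[0] == vectors[1])
```

"dog bites man" and "man bites dog" are indistinguishable to this model, which is the gap embeddings were built to close.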
  34. Word2Vec Gave Us Embeddings

    Word2Vec was an innovation led by Tomas Mikolov and Jeff Dean that yielded an improvement in natural language understanding by using neural networks to compute word vectors. These were better at capturing meaning. Many follow-on innovations like Sentence2Vec and Doc2Vec would follow.
  35. Dense Retrieval

    You remember “passage ranking?” This is built on the concept of dense retrieval wherein there are more embeddings representing more of the query and the document to uncover deeper meaning.
  36. Introducing Google’s Version of Dense Retrieval

    Google introduces the idea of “aspect embeddings”, which are a series of embeddings that represent the full elements of both the query and the document and give stronger access to deeper information.
  37. Dense Representations for Entities

    Google has improved its entity resolution using embeddings giving them stronger access to information in documents.
  38. Website Representation Vectors

    Just as there are representations of pages as embeddings, there are vectors representing websites and Google has recently made improvements in understanding when content is not relevant to a given site.
  39. Author Vectors

    Similarly, Google has Author Vectors wherein they are able to identify an author and the subject matter that they discuss. This allows them to fingerprint an author and their expertise.
  40. So, really E-TEA is a function of information associated with vector representations of websites and authors.
  41. As a content marketer, you need to treat your byline like the asset that it is.
  42. Embeddings keep getting better at capturing meaning while SEO tools still operate on the Lexical Search model
  43. At I/O Google Announced a Dramatic Change to Search

    The experimental “Search Generative Experience” brings generative AI to the SERPs and significantly changes Google’s UX.
  44. Queries are Longer and the Featured Snippet is Bigger

    1. The query is more natural language and no longer Orwellian Newspeak. It can be much longer than the 32 words it has historically been limited to.
    2. The Featured Snippet has become the “AI snapshot”, which takes 3 results and builds a summary.
    3. Users can also ask follow-up questions in conversational mode.
  45. Sundar is All In.

    In Sundar’s recent press run he keeps saying how Google will be doubling down on SGE. So it’s going to be a thing moving forward.
  46. The Search Demand Curve will Shift

    With the change in the level of natural language query that Google can support, we’re going to see far fewer head terms and many more long-tail terms. (Chart: head terms going down, long tail going up.)
  47. The CTR Model Will Change

    With the search results being pushed down by the AI snapshot experience, what is considered #1 will change. We should also expect that any organic result will be clicked less and that standard organic click-through rates will drop dramatically. However, this will likely yield query displacement.
  48. Rank Tracking Will Be More Complex

    As an industry, we’ll need to decide what is considered the #1 result. Based on this screenshot, positions 1-3 are now the citations for the AI snapshot and #4 is below it. However, the AI snapshot loads on the client side, so rank tracking tools will need to change their approach.
  49. Context Windows Will Yield More Personalized Results

    SGE maintains the context window of the previous search in the journey as the user goes through predefined follow-up questions. This will need to drive the composition of pages to ensure they remain in the consideration set for subsequent results.
  50. We’ve seen this take up to 30 seconds to generate. Although, it’s a lot faster now.
  51. It’s an “experiment” so we don’t know much, but here’s what we can infer.
  52. This is Called “Retrieval Augmented Generation”

    Neeva (RIP), Bing, and now Google’s Search Generative Experience all pull documents based on search queries and feed them to a language model to generate a response.
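
The two halves of that loop, retrieve then generate, can be sketched without any external service. The example below ranks documents by simple term overlap as a stand-in for a real retriever, then assembles the prompt that would be handed to a language model. The function names, the sample documents, and the prompt wording are all assumptions for illustration, and the model call itself is left out.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric terms."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query, documents, top_k=2):
    """Return up to top_k documents that share the most terms with the query."""
    query_terms = tokenize(query)
    scored = [(len(query_terms & tokenize(doc)), i) for i, doc in enumerate(documents)]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [documents[i] for score, i in scored[:top_k] if score > 0]


def build_prompt(query, passages):
    """Assemble the retrieved passages and the question into one prompt."""
    sources = "\n".join(f"[{n}] {p}" for n, p in enumerate(passages, start=1))
    return (
        "Answer the question using only the numbered sources and cite them.\n\n"
        f"Sources:\n{sources}\n\nQuestion: {query}\nAnswer:"
    )


# Made-up corpus for the example.
documents = [
    "Panda targeted thin content in 2011.",
    "Penguin targeted manipulative links in 2012.",
    "Wistia hosts video for marketers.",
]
query = "What did Penguin target?"

passages = retrieve(query, documents)
prompt = build_prompt(query, passages)
print(prompt)  # this string is what would be sent to the language model
```

Whatever fails to make it into the retrieved passages cannot make it into the generated answer, which is why retrieval quality matters so much here.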
  53. Google’s Version of this is called Retrieval-Augmented Language Model Pre-Training (REALM) from 2020
  54. SGE is built from REALM + PaLM 2 and MUM

    MUM is the Multitask Unified Model that Google announced in 2021 as a way to do retrieval augmented generation. PaLM 2 is their latest state of the art large language model.
  55. If You Want More Technical Detail Check Out This Paper

    https://arxiv.org/pdf/2002.08909.pdf
  56. Search Engines Are Now OK with Not Being Right

    Researchers evaluated Bing Chat, NeevaAI, perplexity.ai, and YouChat: only 52% of statements are fully supported by citations, and only 75% of citations actually support their statements. https://arxiv.org/abs/2304.09848
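
Those two percentages measure different things: the first is the share of generated statements that their citations fully back up, the second is the share of citations that back up the statement they are attached to. The sketch below computes both from hand-labeled judgments. The data structure and the four sample statements are invented for illustration and are not taken from the paper.

```python
def citation_metrics(statements):
    """Return (share of statements fully supported, share of citations that support).

    Each statement is a dict with:
      "fully_supported": bool, whether its citations together back it up
      "citations": list of bool, one per citation, True if that citation supports it
    """
    total = len(statements)
    supported = sum(1 for s in statements if s["fully_supported"])
    all_citations = [c for s in statements for c in s["citations"]]
    statement_share = supported / total if total else 0.0
    citation_share = sum(all_citations) / len(all_citations) if all_citations else 0.0
    return statement_share, citation_share


# Made-up judgments for four generated statements.
judgments = [
    {"fully_supported": True, "citations": [True]},
    {"fully_supported": False, "citations": [True, False]},
    {"fully_supported": False, "citations": []},  # a claim with no citation at all
    {"fully_supported": True, "citations": [True]},
]

statement_share, citation_share = citation_metrics(judgments)
print(statement_share, citation_share)
```

In this toy set half of the statements are fully supported while three of the four citations hold up, the same shape of gap the study reports.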
  57. AvesAPI + Llama Index + ChatGPT = Raggle

    • AvesAPI: Rankings data
    • Llama Index: Vector index & operations
    • ChatGPT: Clearly you know what this does.
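  58. It’s pretty simple

        # Assumes a recent llama-index release; older releases import these from `llama_index` directly.
        from llama_index.core import VectorStoreIndex
        from llama_index.core.query_engine import CitationQueryEngine

        # `documents` (the loaded pages) and `query` (the search query) are assumed to be defined earlier.

        # Make an index from your documents
        index = VectorStoreIndex.from_documents(documents)

        # Set up your index for citations
        query_engine = CitationQueryEngine.from_args(
            index,
            # indicate how many document chunks it should return
            similarity_top_k=5,
            # here we can control how granular citation sources are, the default is 512
            citation_chunk_size=155,
        )

        response = query_engine.query("Answer the following query in 150 words: " + query)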
  59. Limitations of my POC

    • It doesn’t do follow up questions
    • It’s not responsive
    • It only does the informational snippet
  60. Dense Retrieval

    You remember “passage ranking?” This is built on the concept of dense retrieval wherein there are more embeddings representing more of the query and the document to uncover deeper meaning.
  61. It’s all about the chunks.

    So use Llama Index to determine your chunks and improve the similarity to the query.
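
To see what "optimizing the chunks" means in practice, the sketch below splits a page into fixed-size word chunks and ranks them against a query. It uses cosine similarity over raw term counts as a simple stand-in for the embedding similarity Llama Index would compute, so the scores are illustrative only. The chunk size, page text, and function names are assumptions for this example.

```python
import math
import re
from collections import Counter


def chunk_words(text, chunk_size):
    """Split text into consecutive chunks of chunk_size words."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]


def term_counts(text):
    """Count lowercase alphanumeric terms in the text."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def rank_chunks(query, text, chunk_size=8):
    """Return (score, chunk) pairs, best match first."""
    query_counts = term_counts(query)
    chunks = chunk_words(text, chunk_size)
    return sorted(((cosine(query_counts, term_counts(c)), c) for c in chunks), reverse=True)


# Made-up page copy: only the middle sentence actually answers the query.
page = (
    "Our store opened in 1999 and sells shoes. "
    "Flat feet need stability shoes with firm midsoles. "
    "Contact us for store hours and directions."
)
ranked = rank_chunks("best shoes for flat feet", page)
for score, chunk in ranked:
    print(round(score, 3), chunk)
```

The chunk that directly answers the query wins, and the boilerplate around it scores poorly. Tightening each passage around the question it answers is what raises that similarity.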
  62. I’ve Added A Chunk Explorer so You Can See Which Text was Used
  63. There’s a Lot of Synergy Between KGs and LLMs

    There are three models gaining popularity:
    1. KG-enhanced LLMs - Language Model uses KG during pre-training and inference
    2. LLM-augmented KGs - LLMs do reasoning and completion on KG data
    3. Synergized LLMs + KGs - Multilayer system using both at the same time

    Source: Unifying Large Language Models and Knowledge Graphs: A Roadmap https://arxiv.org/pdf/2306.08302.pdf
  64. Organizations are doing RAG with Knowledge Graphs

    • Anyone can feed their data into an LLM as a fine-tuning measure to improve the output.
    • People are currently using their knowledge graphs to support this.
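  65. The code is not much different

        import advertools as adv
        # Assumes a recent llama-index release; older releases import these from `llama_index` directly.
        from llama_index.core import VectorStoreIndex
        from llama_index.core.query_engine import CitationQueryEngine

        # Pull the list of URLs to crawl from the sitemap
        sitemap_url = "[SITEMAP URL]"
        sitemap = adv.sitemap_to_df(sitemap_url)
        urls_to_crawl = sitemap['loc'].tolist()

        ...  # elided on the slide; `documents` is assumed to be built from the crawled URLs here

        # Make an index from your documents
        index = VectorStoreIndex.from_documents(documents)

        # Set up your index for citations
        query_engine = CitationQueryEngine.from_args(
            index,
            # indicate how many document chunks it should return
            similarity_top_k=5,
            # here we can control how granular citation sources are, the default is 512
            citation_chunk_size=155,
        )

        response = query_engine.query("YOUR PROMPT HERE")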
  66. Fact Verification

    • Google has historically said they do not verify facts.
    • LLM + KG integrations make this a possibility, and Google needs to combat the wealth of content being produced with LLMs. So, it’s likely they will use this functionality.

    Source: Fact Checking in Knowledge Graphs by Logical Consistency
    Source: FactKG: Fact Verification via Reasoning on Knowledge Graphs
  67. Brands are Using Generative AI as a Force Multiplier

    • 52% of business leaders are currently using AI content generation tools to assist their content marketing efforts.
    • 64.7% of business leaders plan to use AI content generation tools to assist their content marketing efforts in 2023.

    Major brands are using tools like ChatGPT and Midjourney to scale their content marketing efforts. The brands that don’t leverage these tools are quickly falling behind. Source: Siege Media + Clearscope
  68. But… Brands Still Need Content Strategy to Capitalize On It

    Individuals are using tools like ChatGPT in isolation, but for an organization to capitalize on it there needs to be a generative AI content strategy that encourages governance and consistency of the content created.
  69. The Three Laws of Generative AI content

    1. Generative AI is not the end-all-be-all solution. It is not the replacement for a content strategy or your content team.
    2. Generative AI for content creation should be a force multiplier to be utilized to improve workflow and augment strategy.
    3. You should consider generative AI content for awareness efforts, but continue to leverage subject matter experts for lower funnel content.

    GENERATIVE AI OPPORTUNITIES & THREATS
  70. How We’re Helping Brands Capitalize on Generative AI

    Leveraging our extensive enterprise Content Strategy experience, we take an 8-step approach to make generative AI tools learn to speak in your brand voice and we build out solutions to bake the functionality into your toolkit.

    OUR GENERATIVE AI PROCESS

    Strategic Planning: We tailor our approach to your goals and existing content strategy.
    1. Review Client Goals and Content Strategy: We take a deep dive into how your Content Strategy currently operates to replicate and expand on it through AI.
    2. Identify AI integration points: We look for places in your existing processes and tools to integrate AI functionality.
    3. Prepare Generative AI Content Plan: We build out the content models, workflows, governance models, and toolkit for generative AI.

    Generative AI Delivery: We deliver vetted prompts and train your team on generative AI systems.
    4. Build Prompt Library: We develop a library of prompts to be used across your organization for various content use cases.
    5. Output QA: We run the prompts through a series of QA tests to ensure that content is always generated as expected.
    6. Optimize Outputs: We improve prompts that do not pass our QA tests.
    7. Knowledge Transfer: We deliver the prompts and training on how to use the new content systems.
    8. Maintenance: We update and optimize prompts as generative AI tools update and emerge.
  71. Don’t forget that ChatGPT is very much an unfinished product.
  72. Mike, that was a lot. What should I be doing?

    • Write with Information Gain in mind
    • Keep an eye on threats in the SERPs
    • Use structured data wherever possible
    • Use tools to understand how relevant Google thinks your content is
    • Build an actual content strategy around generative AI
    • Build a prompt library
    • Build custom indexes for stronger generative AI content creation
    • Treat your byline as the asset that it is
    • Be ready for search behavior to change
    • Optimize the chunks
  73. We’ve Been Using GPT Tech Since 2020
  74. Thank You | Q&A

    Mike King, Founder / CEO @iPullRank. Award Winning, #GirlDad.
    Get Your SGE Threat Report: https://ipullrank.com/sge-report
    Play with Raggle: https://www.raggle.net