All You Can EEAT: Navigating SEO in a Generative AI World

Michael King

September 28, 2023

Transcript

  2. Salutations!
    I’m Mike King
    (@iPullRank)

  6. I don’t think you came here for me to tell you to make great content and use real authors.

  7. You Can Google EEAT Best Practices (or just look at TheZebra.com)
    ● Write high-quality content on subjects you actually know things about.
    ● Have an author bio and page with links out to other places you write about similar subjects and links to your social media.
    ● Make sure those sources highlight your expertise: who you write for, where you studied, books you’ve written, etc.
    ● Get links from and give links to authoritative sources and people on similar subjects.
    ● Make sure the user experience on the sites you write on is great too.

  8. I think you came here because you want to know the nature of the threats ahead.

  9. After All, Organic Search is Likely Most of Your Traffic

  10. A Threat to Google is a Threat to You

  11. TikTok Supplanted Google

  12. ChatGPT was a code red for Google

  13. ChatGPT is now the Star Trek computer Google wants to be.

  14. Users believe Google Search quality is on a steep decline

  15. The media is amplifying this idea

  16. Google missed earnings earlier this year because ad sales were down

  17. The DOJ is coming for Google and secrets are coming out

  18. Although Google has been telling on itself for years
    This is from a post by the Google Cloud team discussing how their Search product works.

  19. These threats will impact you as you look to do content marketing and SEO

  20. The TikTok Threat will Mean More Visual Content Ranking
    To compete with the visual content channels, Google is surfacing more visual content in the SERPs and adding more features that allow users to get exactly where they want to go. This will threaten standard Organic positions for web content.

  21. Short Form Video is About to Get More Competitive
    The video and image real estate in Google is going to become even more competitive since marketers recognize short form video as high ROI and the primary way to reach Gen Z.

  22. Publish Your Short Form Video on your site
    A primary mistake that content marketers make is only publishing their short form videos on a channel like TikTok, Instagram, or YouTube. You should also publish them on your site using tools like Wistia and mark them up so they can appear in the SERPs.
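
One common way to mark up an on-site video is schema.org `VideoObject` structured data in JSON-LD. The sketch below builds that markup with Python's standard `json` module. The helper name `video_jsonld` and every value passed to it (title, URLs, date, duration) are hypothetical placeholders, and the exact properties Google expects should be checked against its current video structured data documentation.

```python
import json


def video_jsonld(name, description, thumbnail_url, upload_date, content_url, duration):
    """Build a schema.org VideoObject as a JSON-LD <script> tag (illustrative only)."""
    data = {
        "@context": "https://schema.org",
        "@type": "VideoObject",
        "name": name,
        "description": description,
        "thumbnailUrl": thumbnail_url,
        "uploadDate": upload_date,   # ISO 8601 date
        "contentUrl": content_url,
        "duration": duration,        # ISO 8601 duration, e.g. PT30S = 30 seconds
    }
    return '<script type="application/ld+json">\n' + json.dumps(data, indent=2) + "\n</script>"


# Hypothetical placeholder values, not taken from the deck.
markup = video_jsonld(
    name="How to read a log file in 30 seconds",
    description="A short walkthrough of a server log for SEO audits.",
    thumbnail_url="https://example.com/videos/log-file-thumb.jpg",
    upload_date="2023-09-28",
    content_url="https://example.com/videos/log-file.mp4",
    duration="PT30S",
)
print(markup)
```

The resulting tag would sit in the HTML of the page that hosts the video, alongside whatever embed code the video host provides.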

  23. Ad Sales Being Down Means more Ads
    What’s up with all this whitespace?
    What’s up with this featured snippet?

  24. The real estate will get smaller, so your content must be that much more effective when it shows up in the SERPs.

  25. The Last Time People Said Search Quality Was Bad we Got Panda and Penguin
    Panda fundamentally changed Organic Search. You could no longer create “SEO content” and rank. The SEO community then embraced content marketing, knowing that creating content that yields utility is the only way forward. Penguin did the same for links. Google’s Helpful Content update could be the new sheriff in town.

  26. The Threat of Generative AI

  28. 47% Are Increasing their Blog Content

  29. Marketers Have the Highest Adoption of Generative AI

  30. There’s a Growing List of Generative AI Tools

  31. If you’re using ChatGPT, you need AIPRM for prompt management.

  32. If your prompt is just one sentence, don’t be surprised when you get garbage back.

  33. Every tool on earth is integrating ChatGPT

  34. Now we have AutoGPT that can do a series of tasks without prompts.

  35. Doug Kessler Warned us Back in 2013
    Marketers are about to ramp up the content marketing deluge.
    https://blog.hubspot.com/blog/tabid/6307/bid/34080/Why-Marketers-Need-to-Rise-Above-the-Deluge-of-Crappy-Content.aspx

  36. Google Has Loosened Its Stance on Generated Content

  38. I don’t believe Google can reliably detect LLM content.

  39. OpenAI Can’t Even Reliably Detect It
    Sure, there are a variety of tools out there that “detect” generative AI content. However, they are all unreliable in that they can yield both false negatives and false positives. Even the people who built the best generative AI tools could correctly identify only 26% of AI-written text with their own classifier.

  40. But Google can use it as a signal among other signals.

  41. There are Reports that Some Sites Using Generative AI Have Been Crushed
    These are sites that don’t edit the content prior to publishing, so they deserve it.

  42. The Helpful Content Update is Finally Showing its Teeth
    Google has been working on getting the Helpful Content classifier right. The early iterations had limited impact, but now sites are getting smacked left and right. We’re also seeing that the threshold for crawling and indexing pages has been raised.

  43. A Lot of “Niche” Sites are Getting Smacked

  44. I Believe This is a Function of Information Gain
    Conceptually, as it relates to search engines, Information Gain is the measure of how much unique information a given document adds to the ranking set of documents. In other words, what are you talking about that your competitors are not?

  45. Google’s Information Gain Patent
    Google’s patent indicates that they are specifically scoring for documents that feature net new information over other documents on the same topic.
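
To make the idea of "net new information" concrete, here is a deliberately crude sketch: it scores a candidate page by the share of its words that none of the already-ranking pages use. This is an assumption-laden stand-in and not the method in Google's patent; the function names and the sample texts are hypothetical.

```python
import re


def terms(text):
    """Lowercase word set; a crude stand-in for real content features."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def novelty_score(candidate, ranking_docs):
    """Share of the candidate's terms that no already-ranking document uses."""
    seen = set()
    for doc in ranking_docs:
        seen |= terms(doc)
    candidate_terms = terms(candidate)
    if not candidate_terms:
        return 0.0
    return len(candidate_terms - seen) / len(candidate_terms)


# Made-up examples of pages that already rank for a topic.
ranking = [
    "enterprise seo needs technical audits",
    "technical audits help enterprise seo teams",
]
copycat = "technical audits help enterprise seo"
original = "enterprise seo teams forecast revenue with log file data"

print(novelty_score(copycat, ranking))   # nothing new relative to the ranking set
print(novelty_score(original, ranking))  # most of its terms are new
```

A real system would compare meaning rather than raw words, but the question it answers is the same one the slide asks: what does this page say that the ranking set does not?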

  46. So Many People Are Just Creating Copycat Content
    WHAT GENERATIVE AI MEANS FOR GOOGLE SEARCH

  48. If you want to survive what’s coming, you’ll need to deliver stronger content than everyone else.

  49. The only content you should be making

  50. Confession.

  51. I think EEAT is silly.

  52. I call it E-TEA instead.

  53. How, When Authorship Markup Tops Out at 3%?

  54. How, When Authorship Markup Tops Out at 3%?
    And this markup does not always specify the author!

  55. How I Actually Started to Believe in E-TEA (or How our Understanding of Search is out of date)

  56. At a Base Level, This is What all Search Engines Do
    Fundamentally, this is the basis of how search engines function. Google has developed many layers on top of this, but this is the core of what they all do.

  57. Google’s High-Level Pipeline Abstraction

  58. We know this, but there is a single set of innovations that sped Google past the SEO community.

  59. Lexical Search vs Semantic Search are the Two Primary Search Models
    What we as the SEO community do not have a strong enough handle on is that most of what Google’s doing is on the semantic side, and that has all improved dramatically over the last 10 years based on machine learning.

  60. Vector Space Model
    Documents and queries are plotted in multidimensional vector space. The closer a document vector is to a query vector, the more relevant it is.

  61. Words are Converted to Multi-dimensional Coordinates in Vector Space

  62. This Allows for Mathematical Operations
    Comparisons of content and keywords become linear algebraic operations.

  63. Relevance is a Function of Cosine Similarity
    When we talk about relevance, we mean how similar the vectors are between documents and queries. This is a quantitative measure, not the qualitative idea of how we typically think of relevance.
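
A minimal sketch of relevance as cosine similarity, assuming tiny hand-made 3-dimensional vectors in place of real embeddings. The page names and numbers are hypothetical; only the formula, the dot product divided by the product of the vector lengths, is standard.

```python
import math


def cosine_similarity(a, b):
    """cos(theta) = (a . b) / (|a| * |b|); values near 1.0 mean the same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy 3-dimensional vectors (made up for illustration).
query = [1.0, 1.0, 0.0]
documents = {
    "page_a": [2.0, 2.0, 0.0],   # same direction as the query
    "page_b": [1.0, 0.0, 1.0],   # partially overlapping
    "page_c": [0.0, 0.0, 3.0],   # orthogonal to the query
}

# Rank pages from most to least similar to the query vector.
ranked = sorted(
    documents,
    key=lambda name: cosine_similarity(query, documents[name]),
    reverse=True,
)
print(ranked)
```

Note that `page_a` scores highest even though its vector is twice as long as the query's: cosine similarity compares direction, not magnitude.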

  64. TF-IDF Vectors
    The vectors in the vector space model were built from TF-IDF. These were simplistic, based on the Bag-of-Words model, and they did not do much to encapsulate meaning.
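
The sketch below computes TF-IDF vectors by hand to show why they capture so little meaning. It assumes one plain variant of the weighting (term frequency times the natural log of inverse document frequency, no smoothing) and two made-up sentences; library implementations typically differ in the details.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return the vocabulary and one TF-IDF vector (list of floats) per document."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({term for tokens in tokenized for term in tokens})
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vector = []
        for term in vocab:
            tf = counts[term] / len(tokens)      # share of this document's words
            idf = math.log(n_docs / df[term])    # 0 when every document has the term
            vector.append(tf * idf)
        vectors.append(vector)
    return vocab, vectors


docs = ["the bank approved the loan", "the river bank flooded"]
vocab, vectors = tfidf_vectors(docs)
bank = vocab.index("bank")
loan = vocab.index("loan")
print(vectors[0][bank], vectors[0][loan])
```

"bank" appears in both documents, so its weight is zero in each, and nothing in the vector records that one sentence is about finance and the other about a river. Most entries in each vector are also zero, which is what makes these representations sparse.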

  65. Word2Vec Gave Us Embeddings
    Word2Vec was an innovation led by Tomas Mikolov and Jeff Dean that yielded an improvement in natural language understanding by using neural networks to compute word vectors. These were better at capturing meaning. Follow-on innovations like Sentence2Vec and Doc2Vec came later.

  66. We Went from Sparse Embeddings to Dense Embeddings

  67. Word2Vec Captured Relationship, but Not Context; BERT Captures Context

  68. BERT Yields Embeddings with Higher Dimensionality and Information Capture

  69. Source: https://cloud.google.com/blog/topics/developers-practitioners/find-anything-blazingly-fast-googles-vector-search-technology

  71. Dense Retrieval
    You remember “passage ranking?” This is built on the concept of dense retrieval wherein there are more embeddings representing more of the query and the document to uncover deeper meaning.

  72. Dense Retrieval is Scoring down to the Sentence Level
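
Scoring down to the sentence level can be sketched as: split the document into sentences, embed each one, and compare each to the query. In the sketch below, `toy_embed` is a hypothetical stand-in that uses word counts; a real dense retriever would call a neural embedding model at that step, and the sample document is made up.

```python
import math
import re
from collections import Counter


def toy_embed(text):
    """Hypothetical stand-in for a neural embedding model: word counts as a vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def best_passage(query, document):
    """Score every sentence against the query; return (score, sentence) for the best one."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]
    query_vec = toy_embed(query)
    scored = [(cosine(query_vec, toy_embed(s)), s) for s in sentences]
    return max(scored)


document = (
    "Our agency was founded in 2014. "
    "Enterprise SEO requires automation at scale. "
    "We also host a weekly webinar."
)
score, sentence = best_passage("enterprise seo automation", document)
print(round(score, 2), sentence)
```

The practical point is that one strong sentence can carry a page for a query even when the rest of the page is about something else.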

  73. Introducing Google’s Version of Dense Retrieval
    Google introduces the idea of “aspect embeddings,” which is a series of embeddings that represent the full elements of both the query and the document and give stronger access to deeper information.

  74. Dense Representations for Entities
    Google has improved its entity resolution using embeddings, giving them stronger access to information in documents.

  75. Embeddings = Google really understands content relevance now.

  76. Website Representation Vectors
    Just as there are representations of pages as embeddings, there are vectors representing websites and Google has recently made improvements in understanding when content is not relevant to a given site.

  77. Author Vectors
    Similarly, Google has Author Vectors wherein they are able to identify an author and the subject matter that they discuss. This allows them to fingerprint an author and their expertise.

  78. So, really E-TEA is a function of information associated with vector representations of websites and authors.
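
One simple way to picture a website vector is as the average of its page vectors, with a new page compared against that average. This is an illustrative assumption, not a description of how Google builds site or author vectors, and the 2-dimensional numbers below are made up. The same arithmetic would apply to an author's body of work.

```python
import math


def centroid(vectors):
    """Average the page vectors to get one vector for the whole site."""
    return [sum(dim) / len(vectors) for dim in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


# Toy page embeddings with made-up dimensions (seo, cooking).
site_pages = [[0.9, 0.1], [0.8, 0.2], [1.0, 0.0]]
site_vector = centroid(site_pages)

on_topic_page = [0.85, 0.15]
off_topic_page = [0.05, 0.95]

print(cosine(on_topic_page, site_vector))   # close to 1: fits the site
print(cosine(off_topic_page, site_vector))  # low: off-topic for this site
```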

  79. As a content marketer, you need to treat your byline like the asset that it is.

  80. Also, relevance isn’t qualitative to Google.

  81. Embeddings keep getting better at capturing meaning while SEO tools still operate on the Lexical Search model

  83. I Feel Like My Page is More Relevant to [Enterprise SEO]

  84. Relevance isn’t Qualitative to Google.

  87. See! My page is more relevant, but it’s not ranking as well.

  88. https://ipullrank.com/tools/orbitwise

  89. The threat of Google’s Search Generative Experience (SGE)

  90. At I/O Google Announced a Dramatic Change to Search
    The experimental “Search Generative Experience” brings generative AI to the SERPs and significantly changes Google’s UX.

  91. Queries are Longer and the Featured Snippet is Bigger
    1. The query is more natural language and no longer Orwellian Newspeak. It can be much longer than the 32 words it has historically been limited to.
    2. The Featured Snippet has become the “AI snapshot,” which takes 3 results and builds a summary.
    3. Users can also ask follow-up questions in conversational mode.

  92. Sundar is All In.
    In Sundar’s recent press run he keeps saying how Google will be doubling down on SGE. So it’s going to be a thing moving forward.

  93. 93
    The Search Demand Curve
    will Shift
    With the change in the level of natural
    language query that Google can
    support, we’re going to see a lot less
    head terms and a lot more long tail
    term.
    Going
    down
    Going
    up

    View Slide

  94. 94
    94
    The CTR Model Will
    Change
    With the search results being pushed
    down by the AI snapshot experience,
    what is considered #1 will change.
    We should also expect that any
    organic result will be clicked less and
    the standard organic will drop
    dramatically.
    However, this will likely yield query
    displacement.

    View Slide

  95. 95
    Rank Tracking Will Be
    More Complex
    As an industry, we’ll need to decide
    what is considered the #1 result.
    Based on this screenshot positions
    1-3 are now the citations for the AI
    snapshot and #4 is below it.
    However, the AI snapshot loads on
    the client side, so rank tracking tools
    will need to change their approach.

    View Slide

  96. 96
    96
    Context Windows Will
    Yield More Personalized
    Results
    SGE maintains the context window of
    the previous search in the journey as
    the user goes through predefined
    follow questions.
    This will need to drive the
    composition of pages to ensure they
    remain in the consideration set for
    subsequent results.

    View Slide

  97. 97
    97
    Ask to Trigger.

    View Slide

  98. 98
    98
    Auto-Trigger.

    View Slide

  99. We’ve seen this take up to 30 seconds to generate. Although, it’s a lot faster now.

  100. HELLO EMPTY REAL ESTATE! (aka more ad clicks)

  104. It’s an “experiment” so we don’t know much, but here’s what we can infer.

  106. Most autoloading AI Snapshots take 3-30 seconds to load.

  107. We Know that Featured Snippets Take 35.1% of Clicks

  108. Using All that Information We’re Modeling the Threat of SGE

  109. Get your threat report: https://ipullrank.com/sge-report

  110. What is Retrieval Augmented Generation (RAG)?

  111. This is Called “Retrieval Augmented Generation”
    Neeva (RIP), Bing, and now Google’s Search Generative Experience all pull documents based on search queries and feed them to a language model to generate a response.
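
The retrieve-then-generate flow described above can be sketched in a few functions. Everything here is a simplified assumption: retrieval is plain word overlap rather than a search index, `fake_llm` is a hypothetical placeholder for a real language model call, and the corpus and `example.com` URLs are invented.

```python
import re


def tokenize(text):
    """Lowercase word set used for the overlap-based retrieval stand-in."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query, corpus, top_k=2):
    """Rank documents by word overlap with the query (stand-in for a search index)."""
    query_terms = tokenize(query)
    ranked = sorted(
        corpus,
        key=lambda doc: len(query_terms & tokenize(doc["text"])),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(query, retrieved):
    """Put the retrieved passages in front of the question, numbered for citation."""
    sources = "\n".join(
        f"[{i}] {doc['text']} ({doc['url']})" for i, doc in enumerate(retrieved, start=1)
    )
    return f"Answer using only these sources and cite them by number.\n{sources}\n\nQuestion: {query}"


def fake_llm(prompt):
    """Hypothetical placeholder for a language model call."""
    return f"(a model response would be generated from a prompt of {len(prompt)} characters)"


# Invented corpus; a real system would pull these from search results.
corpus = [
    {"url": "https://example.com/rag", "text": "Retrieval augmented generation feeds retrieved documents to a language model."},
    {"url": "https://example.com/serp", "text": "Rank tracking tools monitor positions in the SERPs."},
    {"url": "https://example.com/gen", "text": "Generation quality depends on the retrieval step."},
]
query = "what is retrieval augmented generation"
retrieved = retrieve(query, corpus)
prompt = build_prompt(query, retrieved)
print(prompt)
print(fake_llm(prompt))
```

Because the model only sees what retrieval hands it, a page that is not retrieved cannot be cited in the generated answer.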

  112. Google’s Version of this is called Retrieval-Augmented Language Model Pre-Training (REALM) from 2020

  113. SGE is built from REALM + PaLM 2 and MUM
    MUM is the Multitask Unified Model that Google announced in 2021 as a way to do retrieval augmented generation. PaLM 2 is their latest state-of-the-art large language model.

  114. If You Want More Technical Detail Check Out This Paper
    https://arxiv.org/pdf/2002.08909.pdf

  115. Search Engines Are Now OK with Not Being Right
    A study evaluated Bing Chat, NeevaAI, perplexity.ai, and YouChat: only 52% of statements are supported by citations, and only 75% of citations actually support their statements.
    https://arxiv.org/abs/2304.09848
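
The two percentages above measure different things: the first is how many statements are backed up (citation recall), the second is how many citations hold up (citation precision). The sketch below computes both on a made-up set of human judgments. It is a simplification; in particular, it counts a statement as supported if any one citation supports it, which is an assumption and looser than the study's own definition.

```python
# Each generated statement lists its citations with a human judgment of
# whether that citation supports the statement. Data is made up for illustration.
statements = [
    {"text": "Claim A", "citations": [True, True]},
    {"text": "Claim B", "citations": [False]},
    {"text": "Claim C", "citations": []},            # no citation at all
    {"text": "Claim D", "citations": [True, False]},
]


def citation_recall(statements):
    """Share of statements backed by at least one supporting citation (simplified)."""
    supported = sum(1 for s in statements if any(s["citations"]))
    return supported / len(statements)


def citation_precision(statements):
    """Share of all citations that actually support their statement."""
    judgments = [j for s in statements for j in s["citations"]]
    return sum(judgments) / len(judgments)


print(citation_recall(statements))     # 0.5
print(citation_precision(statements))  # 0.6
```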

  116. Sounds cool, but how does it work?

  118. It’s so easy that I built a proof of concept

  121. AvesAPI + Llama Index + ChatGPT = Raggle
    ● AvesAPI: rankings data
    ● Llama Index: vector index & operations
    ● ChatGPT: clearly you know what this does.

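  122. It’s pretty simple
    # Import paths assume LlamaIndex 0.10+; older releases import from `llama_index` directly
    from llama_index.core import VectorStoreIndex
    from llama_index.core.query_engine import CitationQueryEngine

    # `documents` (the crawled pages) and `query` (the search query) are assumed to be defined earlier
    # Make an index from your documents
    index = VectorStoreIndex.from_documents(documents)
    # Set up your index for citations
    query_engine = CitationQueryEngine.from_args(
        index,
        # indicate how many document chunks it should return
        similarity_top_k=5,
        # here we can control how granular citation sources are, the default is 512
        citation_chunk_size=155,
    )
    response = query_engine.query("Answer the following query in 150 words: " + query)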

  123. Limitations of my POC
    ● It doesn’t do follow-up questions
    ● It’s not responsive
    ● It only does the informational snippet

  124. You can play with it at raggle.net

  125. Optimizing for SGE?

  126. Dense Retrieval
    You remember “passage ranking?” This is built on the concept of dense retrieval wherein there are more embeddings representing more of the query and the document to uncover deeper meaning.

  127. Dense Retrieval is Scoring down to the Sentence Level

  128. It’s all about the chunks. So use Llama Index to determine your chunks and improve their similarity to the query.
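
To see what "optimizing the chunks" means in practice, the sketch below splits a passage into overlapping word windows and scores each window against a query. It is a dependency-free illustration and does not use LlamaIndex: the chunk size, the overlap, the word-count `toy_embed` stand-in, and the sample text are all assumptions chosen to keep the example small.

```python
import math
import re
from collections import Counter


def chunk_words(text, chunk_size=8, overlap=2):
    """Split text into overlapping word windows (overlap must be below chunk_size)."""
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    start = 0
    while start < len(words):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break
        start += step
    return chunks


def toy_embed(text):
    """Hypothetical stand-in for an embedding model: word counts as a vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


text = (
    "Raggle is a proof of concept. It pulls rankings data for a query. "
    "Then it builds a vector index and asks a language model to answer with citations."
)
chunks = chunk_words(text)
query_vec = toy_embed("vector index")
best = max(chunks, key=lambda chunk: cosine(query_vec, toy_embed(chunk)))
print(len(chunks), "chunks; best match:", best)
```

Where the chunk boundaries fall changes which words share a chunk with the query terms, so the same paragraph can score differently depending on how it is split and how tightly it is written.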

  129. I’ve Added A Chunk Explorer so You Can See Which Text was Used

  130. The Content Opportunity of RAG

  131. There’s a Lot of Synergy Between KGs and LLMs
    There are three models gaining popularity:
    1. KG-enhanced LLMs - Language Model uses KG during pre-training and inference
    2. LLM-augmented KGs - LLMs do reasoning and completion on KG data
    3. Synergized LLMs + KGs - Multilayer system using both at the same time
    Source: Unifying Large Language Models and Knowledge Graphs: A Roadmap
    https://arxiv.org/pdf/2306.08302.pdf

  132. Organizations are doing RAG with Knowledge Graphs
    ● Anyone can feed their data into an LLM as a fine-tuning measure to improve the output.
    ● People are currently using their knowledge graphs to support this.

  133. The code is not much different
    import advertools as adv
    # LlamaIndex imports (VectorStoreIndex, CitationQueryEngine) are the same as on the earlier slide

    sitemap_url = "[SITEMAP URL]"
    # Read the XML sitemap into a DataFrame and collect the page URLs
    sitemap = adv.sitemap_to_df(sitemap_url)
    urls_to_crawl = sitemap['loc'].tolist()
    ...
    # Make an index from your documents
    index = VectorStoreIndex.from_documents(documents)
    # Set up your index for citations
    query_engine = CitationQueryEngine.from_args(
        index,
        # indicate how many document chunks it should return
        similarity_top_k=5,
        # here we can control how granular citation sources are, the default is 512
        citation_chunk_size=155,
    )
    response = query_engine.query("YOUR PROMPT HERE")

  134. Fact Verification
    ● Google has historically said they do not verify facts.
    ● LLM + KG integrations make this a possibility, and Google needs to combat the wealth of content being produced with LLMs. So, it’s likely they will use this functionality.
    Source: Fact Checking in Knowledge Graphs by Logical Consistency
    Source: FactKG: Fact Verification via Reasoning on Knowledge Graphs
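
At its simplest, checking a claim against a knowledge graph means looking the claim up as a (subject, predicate, object) triple. The sketch below is a toy illustration and not the method of either cited paper; the predicate name `developed_by` and the rule that each subject has only one developer are assumptions made for the example.

```python
# A tiny knowledge graph as (subject, predicate, object) triples.
knowledge_graph = {
    ("REALM", "developed_by", "Google"),
    ("PaLM 2", "developed_by", "Google"),
    ("ChatGPT", "developed_by", "OpenAI"),
}


def verify(claim, graph):
    """Return 'supported', 'contradicted', or 'unknown' for a claimed triple."""
    subject, predicate, _ = claim
    if claim in graph:
        return "supported"
    # Same subject and predicate with a different object counts as a conflict
    # here, which assumes each predicate has a single valid object.
    if any(s == subject and p == predicate for s, p, _ in graph):
        return "contradicted"
    return "unknown"


print(verify(("ChatGPT", "developed_by", "OpenAI"), knowledge_graph))
print(verify(("ChatGPT", "developed_by", "Google"), knowledge_graph))
print(verify(("AutoGPT", "developed_by", "OpenAI"), knowledge_graph))
```

The third case matters as much as the first two: a graph can only confirm or contradict what it already contains, so anything outside it comes back as unknown rather than false.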

  135. Brands are Using Generative AI as a Force Multiplier
    Major brands are using tools like ChatGPT and Midjourney to scale their content marketing efforts. The brands that don’t leverage these tools are quickly falling behind.
    ● 52% of business leaders are currently using AI content generation tools to assist their content marketing efforts.
    ● 64.7% of business leaders plan to use AI content generation tools to assist their content marketing efforts in 2023.
    Source: Siege Media + Clearscope

  136. Sadly, Everyone is Using it and No One Has a Strategy

  137. But… Brands Still Need Content Strategy to Capitalize On It
    Individuals are using tools like ChatGPT in isolation, but for an organization to capitalize on it there needs to be a generative AI content strategy that encourages governance and consistency of the content created.

  138. The Three Laws of Generative AI content
    1. Generative AI is not the end-all-be-all solution. It is not the replacement for a content strategy or your content team.
    2. Generative AI for content creation should be a force multiplier to be utilized to improve workflow and augment strategy.
    3. You should consider generative AI content for awareness efforts, but continue to leverage subject matter experts for lower funnel content.
    GENERATIVE AI OPPORTUNITIES & THREATS

  139. How We’re Helping Brands Capitalize on Generative AI
    OUR GENERATIVE AI PROCESS
    Leveraging our extensive enterprise Content Strategy experience, we take an 8-step approach to make generative AI tools learn to speak in your brand voice, and we build out solutions to bake the functionality into your toolkit.
    Strategic Planning: We tailor our approach to your goals and existing content strategy.
    Generative AI Delivery: We deliver vetted prompts and train your team on generative AI systems.
    1. Review Client Goals and Content Strategy: We take a deep dive into how your Content Strategy currently operates to replicate and expand on it through AI.
    2. Identify AI integration points: We look for places in your existing processes and tools to integrate AI functionality.
    3. Prepare Generative AI Content Plan: We build out the content models, workflows, governance models, and toolkit for generative AI.
    4. Build Prompt Library: We develop a library of prompts to be used across your organization for various content use cases.
    5. Output QA: We run the prompts through a series of QA tests to ensure that content is always generated as expected.
    6. Optimize Outputs: We improve prompts that do not pass our QA tests.
    7. Knowledge Transfer: We deliver the prompts and training on how to use the new content systems.
    8. Maintenance: We update and optimize prompts as generative AI tools update and emerge.

  140. Don’t forget that ChatGPT is very much an unfinished product.

  141. Custom Index Functionality Coming Very Soon

  142. Roll the Credits

  143. Mike, that was a lot. What should I be doing?
    ● Write with Information Gain in mind
    ● Keep an eye on threats in the SERPs
    ● Use structured data wherever possible
    ● Use tools to understand how relevant Google thinks your content is
    ● Build an actual content strategy around generative AI
    ● Build a prompt library
    ● Build custom indexes for stronger generative AI content creation
    ● Treat your byline as the asset that it is
    ● Be ready for search behavior to change
    ● Optimize the chunks

  145. We’ve Been Using GPT Tech Since 2020

  146. Get your threat report: https://ipullrank.com/sge-report

  147. Custom Index Functionality Coming Very Soon

  149. Mike King
    Founder / CEO
    @iPullRank
    Thank You | Q&A
    [email protected]
    Award Winning, #GirlDad
    Get Your SGE Threat Report: https://ipullrank.com/sge-report
    Play with Raggle: https://www.raggle.net