Upgrade to Pro — share decks privately, control downloads, hide ads and more …

JR Oakes - NPath: Leveraging LLMs to Extract In...

Tech SEO Connect
October 23, 2024
1

JR Oakes - NPath: Leveraging LLMs to Extract Insights from GA4 Event Data

Tech SEO Connect

October 23, 2024
Tweet

Transcript

  1. JR Oakes 2 JR Oakes is the VP of Strategy

    for LOCOMOTIVE Agency. He has been an SEO since 2011 and was formerly an architectural glass artist. His focus areas are in SEO, machine learning, language, and user experience. jroakes locomotive.agency
  2. CONFIDENTIAL My Interests Have Broadened a Bit 3 Causal ML

    Generative AI Personalization Engines Technical SEO
  3. CONFIDENTIAL Top-10 Best AI Writing Tools* * According to ChatGPT

    models 11 Shocking how consistent this is
  4. CONFIDENTIAL Reasons LLM Brand Tracking is Messy 12 Temperature Memory/Projects

    Random Seeds RAG Search Slight differences in wording Different Models
  5. CONFIDENTIAL Causal Reasoning and Large Language Models “…we find that

    LLMs can generate text corresponding to correct causal arguments with high probability, surpassing the best-performing existing methodsˮ 20 Source
  6. CONFIDENTIAL ▪ LLMs can generate causal graphs from metadata alone

    ▪ Capable of counterfactual reasoning in natural language ▪ Determine necessary/sufficient causes from text descriptions ▪ Capture relevant context and common sense for causal judgments ▪ Complement existing statistical causal methods ▪ Automate parts of causal analysis previously requiring human experts ▪ Enable flexible, natural language interaction for causal tasks 21 Source Causal Reasoning and Large Language Models
  7. CONFIDENTIAL 23 This Past Year We Have Been Building.. ▪

    Built a tool called Npath, that turns GA4 sequence data into user cohorts, ideals paths, and insights. ▪ Built a tool that takes a companyʼs ICP information and turns it into tens of thousands of categorized keywords with competitive metrics. ▪ Formalized all competitive gaps to aggregate data based on subjects rather than keywords. ▪ Finalizing a tool to detect anomalies buried in thousands of URLs and segments of URLs.
  8. CONFIDENTIAL Can we just throw URL page data into a

    big bucket each month and have LLMs compare and share insights on it? 25
  9. Gemini 29 - 2M Tokens - Schema specification - MIME

    type specification (e.g. application/json)
  10. Tech Debt 30 You kinda get into dependency hell with

    LLMs. Will it work if they upgrade the model?
  11. CONFIDENTIAL 32 Gemini API Costs Input Pricing $1.25 / 1

    million tokens Output Pricing $5.00 / 1 million tokens 1M tokens = ~130 pages 1M tokens = ~700 pages
  12. CONFIDENTIAL 34 Models Will Get Better The pace of improvement

    compared with human-level performance continues to expand across a wide-range of tasks. Source
  13. CONFIDENTIAL 36 It’s Yours, Change it This is open source,

    so tweak the prompts your your needs.
  14. We Have to Be a Bit Creative 39 Just provide

    current and prior values. We calculate the rest.
  15. CONFIDENTIAL 41 Metrics Content Analysis: • Clean Content • Metadata

    • Word Count • Heading Structure • Image Count • Schema Markup Page Speed Insights: • Largest Contentful Paint • Cumulative Layout Shift • Interaction to Next Paint • First Contentful Paint • Time to Interactive • Speed Index • Total Blocking Time • Performance Score • First Meaningful Paint Search Engine Performance: • Clicks (from Google Search Console) • Impressions (from Google Search Console) • Click-Through Rate CTR • Average Position (in search results) • Ranking Keywords • Top No-Click Queries
  16. CONFIDENTIAL 42 Metrics Traffic and User Behavior: • Organic Sessions

    • Organic Users • Organic New Users • Bounce Rate • Average Time on Page • Engagement Rate • Revenue • User Demographics • Device Categories • Pages Visited Prior • Pages Visited Next • Referring Sites
  17. Requirements 46 1. A developer :-) 2. A few API

    keys (most are free or very cheap)