JR Oakes - Leveraging LLMs To Extract Insights From Analytics Data

CONFIDENTIAL Context is the Analytics King? Tech SEO Connect 1

JR Oakes
JR Oakes is the VP of Strategy for LOCOMOTIVE Agency. He has been an SEO since 2011 and was formerly an architectural glass artist. His focus areas are SEO, machine learning, language, and user experience.
jroakes locomotive.agency

My Interests Have Broadened a Bit
▪ Causal ML
▪ Generative AI
▪ Personalization Engines
▪ Technical SEO

Interestingness
Section 2

Efficient Infinite Context Transformers with Infini-attention
Source

Mixed Bag

Better Model
Brand, Expertise, Trust, Transparency, Engagement, Real Experience

LLMs Love the Word Delve
Source

Top-10 Best SEO Agencies*
* According to ChatGPT models

Top-10 Best AI Writing Tools*
* According to ChatGPT models
Shocking how consistent this is

Reasons LLM Brand Tracking is Messy
▪ Temperature
▪ Memory/Projects
▪ Random Seeds
▪ RAG Search
▪ Slight differences in wording
▪ Different Models
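
One way to cope with that variance is to sample the same prompt several times and report how often each brand appears, rather than trusting a single answer. The sketch below assumes the responses have already been collected as plain strings; the `mention_rate` helper and the brand names are hypothetical placeholders, not something shown in the talk.

```python
from collections import Counter


def mention_rate(responses, brands):
    """Return the share of responses that mention each brand at least once."""
    counts = Counter()
    for text in responses:
        lowered = text.lower()
        for brand in brands:
            # Simple case-insensitive substring match; real tracking would
            # likely need alias handling, which is out of scope here.
            if brand.lower() in lowered:
                counts[brand] += 1
    total = len(responses)
    return {brand: (counts[brand] / total if total else 0.0) for brand in brands}


# Hypothetical responses from repeated runs of the same prompt.
runs = [
    "1. Agency A 2. Agency B 3. Agency C",
    "1. Agency B 2. Agency A 3. Agency D",
    "1. Agency A 2. Agency C 3. Agency B",
]
rates = mention_rate(runs, ["Agency A", "Agency B", "Agency C", "Agency D"])
```

Reporting a rate across many runs makes the effect of temperature, seeds, and model differences visible instead of hiding it behind one sample.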

LibreChat

Trends
Section 1

Watching Emerging Trends

People are looking at the problems with AI, but not the possibilities.

OR

MORE AI CONTENT

AI Everyone

Causal Reasoning and Large Language Models
“…we find that LLMs can generate text corresponding to correct causal arguments with high probability, surpassing the best-performing existing methods”
Source

▪ LLMs can generate causal graphs from metadata alone
▪ Capable of counterfactual reasoning in natural language
▪ Determine necessary/sufficient causes from text descriptions
▪ Capture relevant context and common sense for causal judgments
▪ Complement existing statistical causal methods
▪ Automate parts of causal analysis previously requiring human experts
▪ Enable flexible, natural language interaction for causal tasks
Source: Causal Reasoning and Large Language Models

We Code
Section 3

This Past Year We Have Been Building...
▪ Built a tool called Npath that turns GA4 sequence data into user cohorts, ideal paths, and insights.
▪ Built a tool that takes a company's ICP information and turns it into tens of thousands of categorized keywords with competitive metrics.
▪ Formalized all competitive gaps to aggregate data based on subjects rather than keywords.
▪ Finalizing a tool to detect anomalies buried in thousands of URLs and segments of URLs.

I had a thought

Can we just throw URL page data into a big bucket each month and have LLMs compare and share insights on it?

A Cloud Robot Looking for Errors in My URLs

We Need to Put Some Pieces Together

Can we fit all the data and have a dependable output?

Gemini
- 2M Tokens
- Schema specification
- MIME type specification (e.g. application/json)
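
To make the schema and MIME type points concrete, here is a minimal sketch of a request body that asks for JSON output constrained by a schema, plus a local check on what comes back. The `responseMimeType` and `responseSchema` field names are believed to match the Gemini REST API's generation config, but verify them against the current reference before relying on them. The schema itself, the prompt text, and the helper names are hypothetical, and nothing here sends a network request.

```python
import json


def build_request(prompt, schema):
    """Build a generateContent-style request body that asks for JSON output."""
    return {
        "contents": [{"parts": [{"text": prompt}]}],
        "generationConfig": {
            "responseMimeType": "application/json",
            "responseSchema": schema,
        },
    }


# Hypothetical schema: a list of insights, each tied to a URL.
INSIGHT_SCHEMA = {
    "type": "ARRAY",
    "items": {
        "type": "OBJECT",
        "properties": {
            "url": {"type": "STRING"},
            "insight": {"type": "STRING"},
        },
        "required": ["url", "insight"],
    },
}


def parse_insights(raw_text):
    """Parse the model's text output and reject items missing required keys."""
    items = json.loads(raw_text)
    for item in items:
        missing = {"url", "insight"} - set(item)
        if missing:
            raise ValueError(f"missing keys: {sorted(missing)}")
    return items


body = build_request("Compare this month's page data to last month's.", INSIGHT_SCHEMA)
```

Validating the parsed output locally, even when a schema is requested, gives a cheap safety net if the model or API behavior changes.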

Tech Debt
You kinda get into dependency hell with LLMs. Will it work if they upgrade the model?

Can we afford it?

Gemini API Costs
Input Pricing: $1.25 / 1 million tokens
Output Pricing: $5.00 / 1 million tokens
1M tokens = ~130 pages
1M tokens = ~700 pages
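
The pricing above turns into a budget estimate with simple arithmetic. This sketch uses the two per-million-token rates from the slide; the token counts in the example call are made-up placeholders, and actual prices may have changed since the talk.

```python
INPUT_USD_PER_MILLION = 1.25   # input rate quoted on the slide
OUTPUT_USD_PER_MILLION = 5.00  # output rate quoted on the slide


def estimate_cost(input_tokens, output_tokens):
    """Estimate the USD cost of one run from its input and output token counts."""
    input_cost = input_tokens / 1_000_000 * INPUT_USD_PER_MILLION
    output_cost = output_tokens / 1_000_000 * OUTPUT_USD_PER_MILLION
    return input_cost + output_cost


# Hypothetical monthly run: a full 2M-token context in, 50k tokens of insights out.
monthly = estimate_cost(input_tokens=2_000_000, output_tokens=50_000)
```

At these rates the hypothetical run above comes to roughly $2.75, which is the kind of figure that makes a monthly job plausible.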

What if it is not perfect?

Models Will Get Better
The pace of improvement compared with human-level performance continues to expand across a wide range of tasks.
Source

And More Accurate
Models continue to get better at controlling hallucinations.
Source

It’s Yours, Change It
This is open source, so tweak the prompts to your needs.

Can LLMs do math?

Ok. There Are Issues Here…
…position improve from 7.85 to 9, an increase of 92.31%...

We Have to Be a Bit Creative
Just provide current and prior values. We calculate the rest.
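
A minimal sketch of that idea: the model supplies only the current and prior values, and ordinary code computes the delta, the percent change, and the direction. The `compare` function and its `lower_is_better` flag are hypothetical names for illustration, not taken from the project's code.

```python
def compare(current, prior, lower_is_better=False):
    """Compute the change between two values without asking the model to do math."""
    delta = current - prior
    # Percent change is undefined when the prior value is zero.
    pct = None if prior == 0 else delta / prior * 100
    if delta == 0:
        direction = "unchanged"
    else:
        improved = (delta < 0) if lower_is_better else (delta > 0)
        direction = "improved" if improved else "declined"
    return {
        "current": current,
        "prior": prior,
        "delta": round(delta, 2),
        "pct_change": None if pct is None else round(pct, 2),
        "direction": direction,
    }


# Average position is a metric where a lower number is better.
position = compare(current=9, prior=7.85, lower_is_better=True)
```

Assuming the quoted output means the position moved from 7.85 to 9, this calculation gives a change of about 14.65% and marks it as a decline, not the 92.31% improvement the model reported.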

What context can we provide?

Metrics
Content Analysis:
• Clean Content
• Metadata
• Word Count
• Heading Structure
• Image Count
• Schema Markup
Page Speed Insights:
• Largest Contentful Paint
• Cumulative Layout Shift
• Interaction to Next Paint
• First Contentful Paint
• Time to Interactive
• Speed Index
• Total Blocking Time
• Performance Score
• First Meaningful Paint
Search Engine Performance:
• Clicks (from Google Search Console)
• Impressions (from Google Search Console)
• Click-Through Rate (CTR)
• Average Position (in search results)
• Ranking Keywords
• Top No-Click Queries

Metrics
Traffic and User Behavior:
• Organic Sessions
• Organic Users
• Organic New Users
• Bounce Rate
• Average Time on Page
• Engagement Rate
• Revenue
• User Demographics
• Device Categories
• Pages Visited Prior
• Pages Visited Next
• Referring Sites
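
As an illustration of how metrics like these might be packaged as context, the sketch below defines a per-URL monthly record and serializes a current/prior pair to JSON for a prompt. The `PageSnapshot` class, its field names, and the sample values are assumptions made for this example and cover only a handful of the metrics listed; they are not the project's actual data model.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class PageSnapshot:
    """One URL's metrics for one month (hypothetical subset of the listed metrics)."""
    url: str
    month: str
    clicks: Optional[int] = None
    impressions: Optional[int] = None
    average_position: Optional[float] = None
    organic_sessions: Optional[int] = None
    engagement_rate: Optional[float] = None
    largest_contentful_paint_ms: Optional[float] = None


def to_prompt_context(current, prior):
    """Serialize a current/prior pair to JSON, dropping metrics that were not collected."""
    def clean(snapshot):
        return {k: v for k, v in asdict(snapshot).items() if v is not None}
    return json.dumps({"current": clean(current), "prior": clean(prior)}, indent=2)


# Made-up values purely for illustration.
prior = PageSnapshot(url="/pricing", month="2024-09", clicks=410, average_position=7.85)
current = PageSnapshot(url="/pricing", month="2024-10", clicks=380, average_position=9.0)
context = to_prompt_context(current, prior)
```

Dropping uncollected metrics keeps the payload smaller, which matters when thousands of URLs share one context window.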

SEODP
Section 4

An eye in the cloud watching my content

SEO Data Platform on GitHub

Requirements
1. A developer :-)
2. A few API keys (most are free or very cheap)

Modular
1. Very clean and extensible extractor workflow.
2. Intuitive to find and update code.

Customizable
You tell it what to report on.

Output
Insights in your inbox.
Find it here: SEODP

Examples

Bonus
pip install repocoder

The End
