Upgrade to Pro — share decks privately, control downloads, hide ads and more …

WTSFest Philadelphia 2024: This is BigQuery

WTSFest Philadelphia 2024: This is BigQuery

In this talk, Sam worked to demystify BigQuery and help marketers learn how to navigate the tool to make their data better and their reporting more performant. These slides help you to connect BigQuery to GSC and GA4 to get more out of your data without incurring extra costs.

Sam Torres

October 03, 2024
Tweet

More Decks by Sam Torres

Other Decks in Marketing & SEO

Transcript

  1. 1

  2. Hi! I’m Sam Torres Gray Dot Co Official role Chief

    Digital Officer at Gray Dot Co Unofficial (but important) roles Data & dev nerd SEO mermaid™ Day-to-day focus Creating solutions where search, data, and development intersect. #WTSFest @SamTorresATL
  3. Gray Dot Co Have you ever felt like shaking your

    fist at… #WTSFest @SamTorresATL
  4. Gray Dot Co …or data costs that are out of

    control? #WTSFest @SamTorresATL
  5. Well, some good news… Gray Dot Co Own your data

    Work with bigger datasets Scale without limitations Connect data from multiple sources I’ll share how BigQuery solves many of the internal data pain points we’ve all experienced. So you can… #WTSFest @SamTorresATL
  6. Gray Dot Co We’ll cover… Is BigQuery worth it? Setting

    up your account & connecting to data sources Navigating BigQuery Building aggregate tables Setting up budget alerts #WTSFest @SamTorresATL
  7. First things first, what is it? Gray Dot Co “BigQuery

    is a fully managed enterprise data warehouse that helps you manage and analyze your data with built-in features like machine learning, geospatial analysis, and business intelligence. BigQuery’s serverless architecture lets you use SQL queries to answer your organization’s biggest questions with zero infrastructure management. BigQuery’s scalable, distributed analysis engine lets you query terabytes in seconds and petabytes in minutes.” #WTSFest @SamTorresATL
  8. Just kidding, I fixed it… Gray Dot Co “BigQuery is

    a fully managed enterprise data warehouse that helps you manage and analyze your data with built-in features like machine learning, geospatial analysis, and business intelligence. BigQuery’s serverless architecture lets you use SQL queries to answer your organization’s biggest questions with zero infrastructure management. BigQuery’s scalable, distributed analysis engine lets you query terabytes in seconds and petabytes in minutes.” #WTSFest @SamTorresATL
  9. The benefits of leveraging BigQuery Gray Dot Co 🤩 Unsampled

    data 🤩 Data scales as you do 🤩 1 warehouse for many sources 🤩 Lower costs & budget alerts 🤩 Big requests in seconds 🤩 Access data back to day 1 😭 Platforms report sampled data 😭 Data setup isn’t scalable 😭 Data is in many places 😭 High data-storage costs 😭 Big requests break or are slow 😭 Historical data is limited Before BQ With BQ #WTSFest @SamTorresATL
  10. Create a billing account in Google Cloud Gray Dot Co

    Go to console.cloud.google.com and select Billing SETTING UP YOUR ACCOUNT #WTSFest @SamTorresATL
  11. 1. Create a project in Google Cloud Gray Dot Co

    CONNECTING TO GSC #WTSFest @SamTorresATL
  12. 2. Enable the BigQuery API Gray Dot Co CONNECTING TO

    GSC #WTSFest @SamTorresATL Can’t find this page? Go to > APIs & Services and search for “BigQuery”
  13. 2. Add your project ID Gray Dot Co CONNECTING TO

    GSC #WTSFest @SamTorresATL Find your ID in the Project Overview page: > Cloud Overview
  14. 2. Add your project ID Gray Dot Co CONNECTING TO

    GSC #WTSFest @SamTorresATL Find your ID in the Project Overview page: > Cloud Overview
  15. 3. Set the name & location Gray Dot Co CONNECTING

    TO GSC #WTSFest @SamTorresATL Useful if you’re bringing in multiple GSC properties Set the country that your data governance is based in
  16. 1. Create a project in Google Cloud Gray Dot Co

    CONNECTING TO GA4 #WTSFest @SamTorresATL
  17. 1. Create a project in Google Cloud Gray Dot Co

    CONNECTING TO GA4 #WTSFest @SamTorresATL Or you can use the same project. Consider data storage size and separate budget alerts/ ceiling needs.
  18. 3. Configure streams & events Gray Dot Co CONNECTING TO

    GA4 #WTSFest @SamTorresATL Selecting the right location is extremely important!
  19. 3. Configure streams & events Gray Dot Co CONNECTING TO

    GA4 #WTSFest @SamTorresATL Limit is 1M events per day. Take what you need.
  20. BigQuery’s pricing structure Gray Dot Co #WTSFest @SamTorresATL Storage Computation

    💸 How much data is contained within BQ 💸 Free tier gets you 10 GB of data storage 💸 How much handling of that data is BQ doing 💸 Free tier gets you 1Tb of computation or processing
  21. The sneaky build of compute costs Gray Dot Co #WTSFest

    @SamTorresATL This single branded impression chart is filtering for specific words across queries, grouping according to month and summing impressions.
  22. For GSC, your data includes Gray Dot Co #WTSFest @SamTorresATL

    Site Data Individual URL Data Great for seeing total site performance Includes page-level performance Where SERP Features data is found
  23. In GA4, your data includes Gray Dot Co #WTSFest @SamTorresATL

    Table PER day Nested columns (the details that apply to specific types of data)
  24. How do I wade through all of these tables?! Gray

    Dot Co #WTSFest @SamTorresATL
  25. The basics of query structure Gray Dot Co Retrieve data

    from tables or expressions. SELECT <field>,<field> FROM <database table> WHERE <field>=’Value’ ORDER BY <field> Which table(s) should the data be retrieved from? Which rows should be retrieved? Which columns should dictate sorting, in which direction? #WTSFest @SamTorresATL
  26. A query can be…simple Gray Dot Co #WTSFest @SamTorresATL SELECT

    * FROM `table.searchconsole.searchdata_ site_impression` WHERE data_date BETWEEN DATE_SUB(CURRENT_DATE(), INTERVAL 3 DAY) AND DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY) AND country = 'usa' AND query IS NOT NULL
  27. A query can be…complex Gray Dot Co #WTSFest @SamTorresATL SELECT

    CASE WHEN session_source IS NULL THEN 'Direct' WHEN REGEXP_CONTAINS(session_campaign_name, 'cross-network') THEN 'Cross-network' WHEN ( REGEXP_CONTAINS( session_source, 'alibaba|amazon|google shopping|shopify|etsy|ebay|stripe|walmart' ) OR REGEXP_CONTAINS( session_campaign_name, '^(.*(([^a-df-z]|^)shop|shopping).*)$' ) ) AND REGEXP_CONTAINS(session_medium, '^(.*cp.*|ppc|paid.*)$') THEN 'Paid Shopping' WHEN REGEXP_CONTAINS( session_source, 'baidu|bing|duckduckgo|ecosia|google|yahoo|yandex' ) AND REGEXP_CONTAINS(session_medium, '^(.*cp.*|ppc|paid.*)$') THEN 'Paid Search' WHEN REGEXP_CONTAINS( session_source, 'badoo|facebook|fb|instagram|linkedin|pinterest|tiktok|twitter|whatsapp' ) AND REGEXP_CONTAINS(session_medium, '^(.*cp.*|ppc|paid.*)$') THEN 'Paid Social' WHEN REGEXP_CONTAINS( session_source, 'dailymotion|disneyplus|netflix|youtube|vimeo|twitch|vimeo|youtube' ) AND REGEXP_CONTAINS(session_medium, '^(.*cp.*|ppc|paid.*)$') THEN 'Paid Video' WHEN session_medium IN ( 'display', 'banner', 'expandable', 'interstitial', 'cpm' ) THEN 'Display' WHEN REGEXP_CONTAINS( session_source, 'alibaba|amazon|google shopping|shopify|etsy|ebay|stripe|walmart' ) OR REGEXP_CONTAINS( session_campaign_name, '^(.*(([^a-df-z]|^)shop|shopping).*)$' ) THEN 'Organic Shopping' WHEN REGEXP_CONTAINS( session_source, 'badoo|facebook|fb|instagram|linkedin|pinterest|tiktok|twitter|whatsapp' ) OR session_medium IN ( 'social', 'social-network', 'social-media', 'sm', 'social network', 'social media' ) THEN 'Organic Social' WHEN REGEXP_CONTAINS( session_source, 'dailymotion|disneyplus|netflix|youtube|vimeo|twitch|vimeo|youtube' ) OR REGEXP_CONTAINS(session_medium, '^(.*video.*)$') THEN 'Organic Video' WHEN REGEXP_CONTAINS( session_source, 'baidu|bing|duckduckgo|ecosia|google|yahoo|yandex' ) OR session_medium = 'organic' THEN 'Organic Search' WHEN REGEXP_CONTAINS(session_source, 'email|e-mail|e_mail|e mail') OR REGEXP_CONTAINS(session_medium, 'email|e-mail|e_mail|e mail') THEN 'Email' WHEN session_medium = 'affiliate' THEN 'Affiliates' WHEN session_medium = 'referral' THEN 'Referral' WHEN session_medium = 'audio' THEN 'Audio' WHEN session_medium = 'sms' THEN 'SMS' WHEN session_medium LIKE '%push' OR REGEXP_CONTAINS(session_medium, 'mobile|notification') THEN 'Mobile Push Notifications' ELSE 'Unassigned' END AS session_channel, DATE, device_category, COUNT(DISTINCT session_id) AS sessions, COUNT(DISTINCT new_users) AS new_users, COUNT(DISTINCT user_pseudo_id) AS total_users, SUM(purchases) AS purchases, SUM(revenue) AS revenue, COUNT( DISTINCT CASE WHEN new_users IS NOT NULL AND total_purchasers IS NOT NULL THEN user_pseudo_id ELSE NULL END ) AS first_time_purchasers, COUNT(DISTINCT total_purchasers) AS total_purchasers
  28. Either way, tools can write it for you Gray Dot

    Co I generally use GA4SQL but that’s just one example. #WTSFest @SamTorresATL
  29. Gray Dot Co Enter…aggregated tables! CREATE TABLES AND JOB TO

    RUN DAILY CONNECT TO LOOKER STUDIO USE YOUR MEASUREMENT PLAN #WTSFest @SamTorresATL
  30. 1. Decide what kind of data you want Gray Dot

    Co CREATING AGGREGATE TABLES GSC - Queries & metrics aggregated by: • Page • Device Type • Country GA4 - Event data aggregated by: • Entry page • Page path • Channel grouping #WTSFest @SamTorresATL
  31. 1. Decide what kind of data you want Gray Dot

    Co CREATING AGGREGATE TABLES GSC - Queries & metrics aggregated by: • Page • Device Type • Country GA4 - Event data aggregated by: • Entry page • Page path • Channel grouping Start with only a few. You can always add more later - and the query will go through your entire set of data (meaning you can aggregate retroactively!) #WTSFest @SamTorresATL
  32. 2. Run the SQL query in Explorer Gray Dot Co

    CREATING AGGREGATE TABLES #WTSFest @SamTorresATL
  33. 2. Run the SQL query in Explorer Gray Dot Co

    CREATING AGGREGATE TABLES #WTSFest @SamTorresATL Here, you can see the compute cost. If you’re testing, try adding a limit to your query until you’re happy with the data you’re getting. Then, open the query (and number of results). LIMIT 1000
  34. 2. Run the SQL query in Explorer Gray Dot Co

    CREATING AGGREGATE TABLES #WTSFest @SamTorresATL
  35. 3. Save Results to a New Table Gray Dot Co

    CREATING AGGREGATE TABLES #WTSFest @SamTorresATL
  36. 3. Save Results to a New Table Gray Dot Co

    CREATING AGGREGATE TABLES #WTSFest @SamTorresATL
  37. 4. Duplicate the query and adjust to yesterday Gray Dot

    Co CREATING AGGREGATE TABLES #WTSFest @SamTorresATL
  38. 5. Schedule the new query Gray Dot Co CREATING AGGREGATE

    TABLES #WTSFest @SamTorresATL Don’t remember your Table ID?
  39. 6. Use your shiny new tables for Looker Studio Gray

    Dot Co CREATING AGGREGATE TABLES • Don’t forget that you can blend tables in Looker Studio too! • Why does this save on processing & time? ◦ Calculations happen in BQ ◦ Larger grouping calculations only happen once a day #WTSFest @SamTorresATL
  40. 1. Create a budget Gray Dot Co It can be

    for one or more projects, or the whole cloud account (which is the default). #WTSFest @SamTorresATL
  41. 1. Create a budget Gray Dot Co It can be

    for one or more projects, or the whole cloud account (which is the default). #WTSFest @SamTorresATL
  42. 2. Specify a budget Gray Dot Co #WTSFest @SamTorresATL Note

    that you can set this for specific organization, projects or even services (BigQuery is a service).
  43. 3. Create alert thresholds Gray Dot Co #WTSFest @SamTorresATL Need

    to send alerts to different emails than the project owner? You can do that here.
  44. Sample Budget Alerts Gray Dot Co #WTSFest @SamTorresATL Set cost

    thresholds: • Last month’s spend • ANY spending (my personal favorite!) Limit to specific services: • BigQuery Storage • BigQuery Data Transfer Service: Responsible for the scheduled queries