Getting the most out of Google Search Console API with RegEx

A talk given for SMX Advanced 2022 covering the Google Search Console API and how to use regex.

Getting the most out of the Google Search Console API

Google Search Console is an amazing tool that provides invaluable search data from real users, directly from Google. While the charts and tables are friendly to work with, a large part of the data is not accessible from the UI. The only way to reach this hidden data is through the API, which lets you extract all the search data available to you, provided you know how.

After this session, you’ll be able to:
» Extract the maximum amount of data from the GSC API
» Utilize the GSC API in your SEO process with Spreadsheets, Data Studio, or Python
» Use regular expressions (regex) to filter difficult URLs

Eric Wu

May 24, 2022

Transcript

  1. @SPEAKERNAME/#SMX https://similar.ai/blog/closing-google-search-console-sampling-gap/
    • Adding 10 well-chosen sub-directories as GSC profiles can close the gap to almost 75%
    • The gap saturates towards the end because of longer tail sub-directories
  2. are, can, can't, could, couldn't, did, didn't, do, does, doesn't, how, if, is, isn't, should, shouldn't, was, wasn't, were, weren't, what, when, where, who, whom, whose, why, will, won't, would, wouldn't
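A word list like this can be folded into a single regex for a query filter. The sketch below is one way to do it, under two assumptions that are not on the slide: the question word must open the query, and the helper names (`build_question_regex`, `is_question`) are made up for illustration. It runs on Python's `re` module, while Search Console filters use RE2 syntax; the constructs used here (`^`, `(?:...)`, `\b`) exist in both.

```python
import re

# Question words exactly as listed on the slide.
QUESTION_WORDS = [
    "are", "can", "can't", "could", "couldn't", "did", "didn't", "do", "does",
    "doesn't", "how", "if", "is", "isn't", "should", "shouldn't", "was",
    "wasn't", "were", "weren't", "what", "when", "where", "who", "whom",
    "whose", "why", "will", "won't", "would", "wouldn't",
]


def build_question_regex(words=QUESTION_WORDS):
    """Return a pattern string matching queries that START with a question word.

    Anchoring at the start is an assumption made for this sketch.
    """
    # Longest words first, so "can't" is tried before "can".
    ordered = sorted(words, key=len, reverse=True)
    alternation = "|".join(re.escape(word) for word in ordered)
    # \b stops "what" from matching inside "whatsapp".
    return "^(?:" + alternation + r")\b"


QUESTION_RE = re.compile(build_question_regex())


def is_question(query):
    """True if the (lowercase) query opens with one of the question words."""
    return QUESTION_RE.search(query) is not None


if __name__ == "__main__":
    for query in ["how to reset a samsung tv", "whatsapp for samsung"]:
        print(query, is_question(query))
```

The string returned by `build_question_regex()` is what would be pasted into a custom regex filter; dropping the leading `^` would match question words anywhere in the query instead.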
  3. aamaung, damsung, mamsang, sam sung, samaung, samdung, samesung, sameung, samgsung, samgung, samsang, samsaung, samsgu, samshgg, samshng, samsing, samsnug, samssung, samsu, samsuag, samsubg, samsubng, samsug, samsumg, samsumng, samsun g, samsunb, samsund, samsund, samsunh, samsunt …
  4. samsung galaxy note, galaxy samsung, new samsung TV
    Consider: • Start of string • Surrounded by spaces • End of string
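The three positions on this slide can be covered by one pattern that allows either a string boundary or a space on each side of the brand term. The following is a minimal sketch; the names `BRAND_RE` and `mentions_brand` are illustrative, and the pattern uses only syntax that Python's `re` and RE2 share.

```python
import re

# (^| ) : the term sits at the start of the string or right after a space.
# ( |$) : the term is followed by a space or sits at the end of the string.
BRAND_RE = re.compile(r"(^| )samsung( |$)")


def mentions_brand(query):
    """True if 'samsung' appears as a whole word in the query."""
    return BRAND_RE.search(query) is not None


if __name__ == "__main__":
    for query in ["samsung galaxy note", "galaxy samsung",
                  "new samsung TV", "samsungs tv"]:
        print(query, mentions_brand(query))
```

A shorter `\bsamsung\b` behaves similarly, but it also matches next to punctuation such as a hyphen, which may or may not be wanted.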
  5. US Product URLs: /<cat>/ and /<cat>/<sub-cat>/p-<product>
    Examples: /tvs/, /tvs/4k-led/, /tvs/4k-led/p-4k-hd-model-314
    i18n Language & Country URLs: /<lang>/ and /<lang>-<country>/
    Examples: /fr/, /fr-br/, /ae-ar/, /fr/tvs/4k/, /fr-br/tvs/4k-led/p-4k-hd-model-314, /ae-ar/tvs/4k-led/p-4k-hd-model-314
  6. Get All US PLPs + PDPs and NOT i18n pages
    Include: /([^/]+/){1,2}p?
    • Any character that’s not a slash = [^/]+
    • 1 or 2 directories = /){1,2}
    • Sometimes followed by a product slug = p?
    Exclude: /[a-zA-Z]{2}|[a-zA-Z]{2}-[a-zA-Z]{2}/
    • Any 2 letter directory = [a-zA-Z]{2}
    • 2 letter + 2 letter lang-country combo = [a-zA-Z]{2}-[a-zA-Z]{2}
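To see how an include/exclude pair of this kind behaves, it helps to run it against the example paths from the URL slide. The sketch below is not the slide's exact patterns but a stricter variant, and the differences are assumptions of this example: both patterns are anchored, the product slug is spelled out as `p-...`, and the two-letter alternatives are grouped. Read literally and matched unanchored, `/[a-zA-Z]{2}|...` also hits the first two letters of `/tvs/`. Bare paths are used here, whereas a real page filter sees full URLs, so a domain prefix would be needed.

```python
import re

# One or two directories, optionally followed by a product slug "p-...".
INCLUDE_RE = re.compile(r"^/([^/]+/){1,2}(p-[^/]+)?$")

# A first directory that is exactly "xx" or "xx-yy" (language or language-country).
EXCLUDE_RE = re.compile(r"^/[a-zA-Z]{2}(-[a-zA-Z]{2})?/")


def is_us_plp_or_pdp(path):
    """True for US listing/product paths, False for i18n paths."""
    return bool(INCLUDE_RE.search(path)) and not EXCLUDE_RE.search(path)


US_PATHS = ["/tvs/", "/tvs/4k-led/", "/tvs/4k-led/p-4k-hd-model-314"]
I18N_PATHS = [
    "/fr/", "/fr-br/", "/ae-ar/", "/fr/tvs/4k/",
    "/fr-br/tvs/4k-led/p-4k-hd-model-314",
    "/ae-ar/tvs/4k-led/p-4k-hd-model-314",
]

if __name__ == "__main__":
    for path in US_PATHS + I18N_PATHS:
        print(path, is_us_plp_or_pdp(path))
```

One caveat of this approach: a genuine two-letter US category such as a hypothetical `/tv/` would be excluded along with the language directories, so the exclude pattern needs checking against the real site structure.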
  7. { "rows": [ { "keys": ["2022-06-15"],"clicks": 359756,"impressions": 7294403,"ctr": 0.049319457671861563,"position": 9.128287263536166}],"responseAggregationType": "byPage" }
    Export JSON to file: samsung-g11n.json
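A response of this shape comes back from a Search Analytics query. As a rough sketch of the request side, the function below builds a request body that groups by date and applies an include and an exclude regex to the page dimension. The field names (`dimensionFilterGroups`, `includingRegex`, `excludingRegex`, `rowLimit`, `startRow`) are the API's; the function name, the example domain, the dates, and the patterns are assumptions for illustration only.

```python
import json


def build_query_body(start_date, end_date, include_regex, exclude_regex,
                     row_limit=25000, start_row=0):
    """Build a Search Analytics request body with two page regex filters."""
    return {
        "startDate": start_date,
        "endDate": end_date,
        "dimensions": ["date"],
        "dimensionFilterGroups": [{
            "groupType": "and",
            "filters": [
                {"dimension": "page",
                 "operator": "includingRegex",
                 "expression": include_regex},
                {"dimension": "page",
                 "operator": "excludingRegex",
                 "expression": exclude_regex},
            ],
        }],
        # 25,000 is the maximum rows per request; page through with startRow.
        "rowLimit": row_limit,
        "startRow": start_row,
    }


# Hypothetical site and patterns; page filters are matched against full URLs.
EXAMPLE_BODY = build_query_body(
    "2022-06-15", "2022-06-15",
    include_regex=r"^https://www\.example\.com/([^/]+/){1,2}(p-[^/]+)?$",
    exclude_regex=r"^https://www\.example\.com/[a-zA-Z]{2}(-[a-zA-Z]{2})?/",
)

if __name__ == "__main__":
    # With google-api-python-client and valid credentials, a body like this is
    # passed to service.searchanalytics().query(siteUrl=..., body=...).execute().
    print(json.dumps(EXAMPLE_BODY, indent=2))
```

Keeping the body as a plain dictionary makes it easy to save next to the exported JSON, so each data file can be traced back to the filters that produced it.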
  8. jq . < samsung-g11n.json
    {
      "rows": [
        {
          "keys": ["2022-06-15"],
          "clicks": 359756,
          "impressions": 7294403,
          "ctr": 0.049319457671861563,
          "position": 9.128287263536166
        }
      ],
      "responseAggregationType": "byPage"
    }
  9. jq '.rows | [.[] | { _Date: .keys[0], Clicks: .clicks|tostring, Impressions: .impressions|tostring, CTR: .ctr, Position: .position }]' < samsung-g11n.json
  10. jq '.rows | [.[] | { _Date: .keys[0], Clicks: .clicks|tostring, Impressions: .impressions|tostring, CTR: .ctr, Position: .position }]' < samsung-g11n.json > samsung.json
  11. jq '.rows | [.[] | { _Date: .keys[0], Clicks: .clicks|tostring, Impressions: .impressions|tostring, CTR: .ctr, Position: .position }]' < samsung-g11n.json | dasel -r json -w csv > data.csv
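For anyone working in Python rather than on the command line, the same reshaping can be sketched with the standard library alone. This is a rough equivalent of the jq and dasel pipeline, not the speaker's method: it renames the fields, turns clicks and impressions into strings as `tostring` does, and writes CSV. The function names are illustrative, and the sample embeds the response from the earlier slide instead of reading samsung-g11n.json.

```python
import csv
import io
import json

# The API response shown on the earlier slide, embedded so the sketch runs as is.
SAMPLE = """
{ "rows": [ { "keys": ["2022-06-15"], "clicks": 359756,
              "impressions": 7294403, "ctr": 0.049319457671861563,
              "position": 9.128287263536166 } ],
  "responseAggregationType": "byPage" }
"""

FIELDS = ["_Date", "Clicks", "Impressions", "CTR", "Position"]


def flatten_rows(response):
    """Rename each row's fields, mirroring the jq object construction."""
    return [
        {
            "_Date": row["keys"][0],
            "Clicks": str(row["clicks"]),
            "Impressions": str(row["impressions"]),
            "CTR": row["ctr"],
            "Position": row["position"],
        }
        for row in response.get("rows", [])
    ]


def to_csv(records):
    """Serialize the flattened records to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


if __name__ == "__main__":
    print(to_csv(flatten_rows(json.loads(SAMPLE))))
```

To work from the exported file instead, `json.loads(SAMPLE)` can be swapped for `json.load(open("samsung-g11n.json"))`, and the CSV text written to data.csv.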
  12. Getting the most out of the Google Search Console API
    Google Search Console is an amazing tool that provides invaluable search data from real users, directly from Google. While the charts and tables are friendly to work with, a large part of the data is not accessible from the UI. The only way to reach this hidden data is through the API, which lets you extract all the search data available to you, provided you know how.
    After this session, you’ll be able to:
    • Extract the maximum amount of data from the GSC API
    • Utilize the GSC API in your SEO process with Spreadsheets, Data Studio, or Python
    • Use regular expressions (regex) to filter difficult URLs