SEO automation: practical examples from a non-developer

In this talk I covered how to use tools and APIs to automate everyday analysis and monitoring tasks.

Gianluca Campo

May 09, 2019

Transcript

  16. My First Heading
    link1
    link2
    text


    /html/body/a[1]/@href

  17. /axis::node-test[predicate]/axis::node-test[predicate]/axis::node-test[predicate]




    /locationstep/locationstep/locationstep

  18. /html/body/a[1]/@href
    /child::html/child::body/child::a[1]/attribute::href
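
The two expressions above are the abbreviated and the unabbreviated form of the same query. A minimal sketch with lxml can confirm this; the HTML below is a hypothetical page loosely modelled on the slide example (the markup and the `link1.html` / `link2.html` file names are assumptions).

```python
from lxml import html

# Hypothetical page, loosely modelled on the slide example (assumed markup).
PAGE = """
<html><body>
  <h1>My First Heading</h1>
  <a href="link1.html">link1</a>
  <a href="link2.html">link2</a>
  <p>text</p>
</body></html>
"""

tree = html.fromstring(PAGE)

# Abbreviated syntax: the child axis is implicit, "@" stands for attribute::
abbreviated = tree.xpath('/html/body/a[1]/@href')

# Unabbreviated syntax: every location step names its axis explicitly.
unabbreviated = tree.xpath(
    '/child::html/child::body/child::a[1]/attribute::href')

print(abbreviated, unabbreviated)
```

Both calls return the `href` of the first link, so the short form is usually enough for day-to-day scraping.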

  19. example content

    example content

    //tag-name/@attribute

  20. → contains(str1, str2)
    → starts-with(str1, str2)
    /html/body/a[1]/@href
    //a[1]/@href
    //a[contains(@href, "link1.html")]
    //a[starts-with(@href, "link1")]
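
As a rough illustration of the two string functions, the sketch below runs them with lxml against a hypothetical two-link page (the markup is an assumption, not the page used in the talk).

```python
from lxml import html

# Hypothetical markup with two links (assumed for illustration).
PAGE = """
<html><body>
  <a href="link1.html">link1</a>
  <a href="link2.html">link2</a>
</body></html>
"""

tree = html.fromstring(PAGE)

# Positional predicate: the first <a> among its siblings.
first_link = tree.xpath('//a[1]/@href')

# contains(): keep links whose href includes the given substring.
by_contains = tree.xpath('//a[contains(@href, "link1.html")]')

# starts-with(): keep links whose href begins with the given prefix.
by_prefix = tree.xpath('//a[starts-with(@href, "link")]')

print(first_link)
print([a.text for a in by_contains])
print([a.get("href") for a in by_prefix])
```

Here `contains()` matches only the first link, while the shorter prefix passed to `starts-with()` matches both.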

  22. //*[@id="menu-item-5015"]/a

    //ul[@id="menu-primary-items"]/li/a

  23. =IMPORTXML(url, xpath_query)


  24. =XPathOnUrl(url, xpath, attribute, xmlHTTPSettings, mode)



  28. from lxml import html
    import requests

    # Read one URL per line and append each page's menu labels to results.txt.
    with open("urls.txt", "r") as urls, open("results.txt", "a+") as results_file:
        for item in urls:
            url = item.strip()
            if not url:
                continue  # skip blank lines
            page = requests.get(url)
            tree = html.fromstring(page.content)
            # Text of every link in the primary menu.
            text = tree.xpath('//ul[@id="menu-primary-items"]/li/a/text()')
            results_file.write("%s,%s\n" % (url, text))
            print("SCRAPING " + url)
            print(text, "\n")
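
The script above writes the Python list returned by `xpath()` straight into the results file, brackets and quotes included. One possible variation, sketched here as an assumption rather than as part of the original script, is to join the values and let the standard `csv` module handle quoting. The `write_rows` helper and the sample data are hypothetical.

```python
import csv
import io


def write_rows(handle, rows):
    """Write (url, list_of_texts) pairs as CSV, one row per URL."""
    writer = csv.writer(handle)
    for url, texts in rows:
        # Join the extracted strings into one cell instead of writing
        # the Python list representation.
        writer.writerow([url, " | ".join(texts)])


# Hypothetical scraped data, written to an in-memory buffer for the demo.
buffer = io.StringIO()
write_rows(buffer, [("https://example.com/", ["Home", "Blog"])])
print(buffer.getvalue())
```

With a real file, the handle would be opened with `newline=""`, as the `csv` documentation recommends.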





  29. //title
    //meta[@name="description"]/@content
    //link[@hreflang="it-IT"]/@href
    //link[@hreflang]/@href
    //link[@rel="canonical"]/@href
    //meta[@name="robots"]/@content
    //h1
    //url/loc/text()
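
A minimal sketch of how several of these expressions could be run in one pass with lxml. The page source, the `QUERIES` mapping and the `extract` helper are all hypothetical, and `/text()` has been appended to `//title` and `//h1` so that strings come back instead of elements.

```python
from lxml import html

# Hypothetical page source (assumed for illustration).
PAGE = """<html><head>
<title>Example page</title>
<meta name="description" content="A short description.">
<meta name="robots" content="index,follow">
<link rel="canonical" href="https://example.com/">
<link rel="alternate" hreflang="it-IT" href="https://example.com/it/">
</head><body><h1>Main heading</h1></body></html>"""

# One XPath expression per on-page element to check.
QUERIES = {
    "title": "//title/text()",
    "description": '//meta[@name="description"]/@content',
    "robots": '//meta[@name="robots"]/@content',
    "canonical": '//link[@rel="canonical"]/@href',
    "hreflang_it": '//link[@hreflang="it-IT"]/@href',
    "h1": "//h1/text()",
}


def extract(page_source):
    """Return a dict mapping each query name to its list of matches."""
    tree = html.fromstring(page_source)
    return {name: tree.xpath(query) for name, query in QUERIES.items()}


result = extract(PAGE)
for name, values in result.items():
    print(name, values)
```

Every value is a list, because an XPath query can match zero, one or many nodes.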

  32. https://www.googleapis.com/analytics/v3/data/ga?ids=ga:XXXX&start-date=2019-01-01&end-date=2019-03-31&metrics=ga:sessions&filters=ga:country==Italy&access_token=XXXX

    https://www.googleapis.com/analytics/v3/data/ga
    ?ids=ga:XXXX
    &start-date=2019-01-01
    &end-date=2019-03-31
    &metrics=ga:sessions
    &filters=ga:country==Italy
    &access_token=XXXX
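
The request is just a base URL plus a list of parameters, so it can be assembled programmatically. The sketch below uses only the standard library; `build_ga_url` is a hypothetical helper, and `urlencode` percent-encodes characters such as `:` and `=` inside values, which decodes back to the same query.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

BASE_URL = "https://www.googleapis.com/analytics/v3/data/ga"


def build_ga_url(view_id, start_date, end_date, metrics,
                 filters=None, access_token=None):
    """Assemble a Core Reporting API v3 query URL from its parameters."""
    params = {
        "ids": view_id,
        "start-date": start_date,
        "end-date": end_date,
        "metrics": metrics,
    }
    # Optional parameters are added only when provided.
    if filters:
        params["filters"] = filters
    if access_token:
        params["access_token"] = access_token
    return BASE_URL + "?" + urlencode(params)


url = build_ga_url("ga:XXXX", "2019-01-01", "2019-03-31",
                   "ga:sessions", filters="ga:country==Italy")
print(url)

# Decoding the query string gives back the original values.
decoded = parse_qs(urlsplit(url).query)
print(decoded["filters"])
```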



  37. ga:name operator expression
    ga:country == Italy
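
A small sketch of the `ga:name operator expression` pattern. In the Core Reporting API v3 several filters can be combined with `;` (AND) or `,` (OR); the helper names below are hypothetical, and the `ga:medium` filter is an added example that does not appear in the slides.

```python
def ga_filter(name, operator, expression):
    """Build one filter in the form ga:name + operator + expression."""
    return f"ga:{name}{operator}{expression}"


def all_of(*filters):
    """Combine filters with AND (semicolon separator)."""
    return ";".join(filters)


def any_of(*filters):
    """Combine filters with OR (comma separator)."""
    return ",".join(filters)


italy = ga_filter("country", "==", "Italy")
organic = ga_filter("medium", "==", "organic")

print(italy)
print(all_of(italy, organic))
print(any_of(italy, ga_filter("country", "==", "France")))
```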

  45. https://www.googleapis.com/webmasters/v3/sites/XXXX/searchAnalytics/query
    {
      "startDate": "2019-01-01",
      "endDate": "2019-03-31",
      "dimensions": ["query"],
      "dimensionFilterGroups": [
        {
          "filters": [
            {
              "dimension": "country",
              "operator": "equals",
              "expression": "ITA"
            }
          ]
        }
      ],
      "aggregationType": "auto",
      "rowLimit": 25000,
      "startRow": 0
    }
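
The request body is plain JSON, so one way to avoid syntax slips such as a missing comma is to build it as a Python dictionary and serialize it. This is a sketch under that assumption; `build_query_body` is a hypothetical helper, not part of any Google library.

```python
import json


def build_query_body(start_date, end_date, country,
                     row_limit=25000, start_row=0):
    """Return the Search Analytics request body as a dictionary."""
    return {
        "startDate": start_date,
        "endDate": end_date,
        "dimensions": ["query"],
        "dimensionFilterGroups": [
            {
                "filters": [
                    {
                        "dimension": "country",
                        "operator": "equals",
                        "expression": country,
                    }
                ]
            }
        ],
        "aggregationType": "auto",
        "rowLimit": row_limit,
        "startRow": start_row,
    }


body = build_query_body("2019-01-01", "2019-03-31", "ITA")
payload = json.dumps(body, indent=2)  # always valid JSON
print(payload)
```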

  50. "dimension": string, "operator": string, "expression": string
    "dimension": country, "operator": equals, "expression": ITA

  59. ...
    rowLimit = 25000
    retrieve_search_queries = webmasters_service.searchanalytics().query(
        siteUrl='ENTER-YOURS-HERE',
        body={
            "startDate": "2019-01-01",
            "endDate": "2019-03-31",
            "dimensions": ["query"],
            "dimensionFilterGroups": [
                {
                    "filters": [
                        {
                            "dimension": "country",
                            "operator": "equals",
                            "expression": "ITA"
                        }
                    ]
                }
            ],
            "aggregationType": "auto",
            "rowLimit": rowLimit
        }
    ).execute()
    # Iterate over the rows actually returned: there may be fewer than rowLimit.
    with open("results.txt", "a+") as results_file:
        for row in retrieve_search_queries.get('rows', []):
            keys = row['keys']
            impressions = row['impressions']
            clicks = row['clicks']
            ctr = row['ctr']
            position = row['position']
            line = "%s|%s|%s|%s|%s\n" % (keys, impressions, clicks, ctr, position)
            print(line)
            results_file.write(line)
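
A single request returns at most `rowLimit` rows. The `startRow` field shown in the raw request body can be used to page through larger result sets. The sketch below shows one possible paging loop; `fetch_page` is a hypothetical callable that, in real use, would wrap the `searchanalytics().query(...).execute()` call, and `fake_fetch_page` only simulates responses so the example runs offline.

```python
def fetch_all_rows(fetch_page, row_limit=25000):
    """Request pages until one comes back with fewer rows than row_limit."""
    rows = []
    start_row = 0
    while True:
        response = fetch_page(start_row, row_limit)
        page_rows = response.get("rows", [])  # "rows" is absent when empty
        rows.extend(page_rows)
        if len(page_rows) < row_limit:
            break
        start_row += row_limit
    return rows


def fake_fetch_page(start_row, row_limit):
    """Stand-in for the API call: serves slices of seven made-up rows."""
    data = [{"keys": [f"query {i}"], "clicks": i} for i in range(7)]
    chunk = data[start_row:start_row + row_limit]
    return {"rows": chunk} if chunk else {}


all_rows = fetch_all_rows(fake_fetch_page, row_limit=3)
print(len(all_rows))
```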

  63. https://adwords.google.com/api/adwords/cm/v201809/CampaignService

    xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">

    ...


    ...


  82. ...
    def main(client, item, ad_group_id=None):
        # Initialize appropriate service.
        targeting_idea_service = client.GetService(
            'TargetingIdeaService', version='v201809')
        # Construct selector object and retrieve related keywords.
        selector = {
            'ideaType': 'KEYWORD',
            'requestType': 'STATS'
        }
        selector['requestedAttributeTypes'] = [
            'KEYWORD_TEXT', 'SEARCH_VOLUME']
        offset = 0
        selector['paging'] = {
            'startIndex': str(offset),
            'numberResults': str(PAGE_SIZE)
        }
        selector['searchParameters'] = [{
            'xsi_type': 'RelatedToQuerySearchParameter',
            'queries': item
        }]
        # Language setting (optional).
        selector['searchParameters'].append({
            # The ID can be found in the documentation:
            # https://developers.google.com/adwords/api/docs/appendix/languagecodes
            'xsi_type': 'LanguageSearchParameter',
            'languages': [{'id': '1004'}]
        })
        # Location setting (optional).
        selector['searchParameters'].append({
            # The ID can be found in the documentation:
            # https://developers.google.com/adwords/api/docs/appendix/geotargeting
            'xsi_type': 'LocationSearchParameter',
            'locations': [{'id': '2380'}]
        })
        # Network search parameter (optional).
        selector['searchParameters'].append({
            'xsi_type': 'NetworkSearchParameter',
            'networkSetting': {
                'targetGoogleSearch': True,
                'targetSearchNetwork': False,
                'targetContentNetwork': False,
                'targetPartnerSearchNetwork': False
            }
        })







  83. ...
            # Continues inside main(), in the paging loop elided above.
            # Display results.
            if 'entries' in page:
                for result in page['entries']:
                    attributes = {}
                    for attribute in result['data']:
                        attributes[attribute['key']] = getattr(attribute['value'], 'value', '0')
                    results_file.write('%s|%s|%s\n' % (item, attributes['KEYWORD_TEXT'], attributes['SEARCH_VOLUME']))
                    print('%s|%s|%s' % (item, attributes['KEYWORD_TEXT'], attributes['SEARCH_VOLUME']))
                print()
            else:
                print('No related keywords were found.')
            offset += PAGE_SIZE
            selector['paging']['startIndex'] = str(offset)
            more_pages = offset < int(page['totalNumEntries'])

    if __name__ == '__main__':
        # Initialize client object.
        adwords_client = adwords.AdWordsClient.LoadFromStorage("ABSOLUTE-PATH-TO-googleads.yaml")
        adwords_client.SetClientCustomerId('ENTER-YOURS-HERE')
        kwds = open("kwds.txt", "r")
        #reload(sys)
        #sys.setdefaultencoding('utf-8')
        for line in kwds:
            item = line.strip()
            results_file = open("results.txt", "a+")
            main(adwords_client, item, int(AD_GROUP_ID) if AD_GROUP_ID.isdigit() else None)
            print(datetime.datetime.now())
            results_file.close()
            # Pause between keywords.
            sleep(2)




  84. ...
    # Construct selector object and retrieve related keywords.
    selector = {
        'ideaType': 'KEYWORD',
        'requestType': 'IDEAS'
    }
    selector['requestedAttributeTypes'] = [
        'KEYWORD_TEXT', 'SEARCH_VOLUME']
    offset = 0
    selector['paging'] = {
        'startIndex': str(offset),
        'numberResults': 10
    }
    ...

  87. from lxml import html
    import requests

    # Read one search-results URL per line and save the result links found.
    with open("urls.txt", "r") as urls, open("results.txt", "w") as results_file:
        for item in urls:
            url = item.strip()
            if not url:
                continue  # skip blank lines
            page = requests.get(url)
            tree = html.fromstring(page.content)
            # href of every result title link.
            text = tree.xpath('//h3[@class="r"]/a/@href')
            results_file.write("%s,%s\n" % (url, text))
            print("SCRAPING " + url)
            print(text, "\n")




  88. https://www.google.[com]/search?q=site:[domain]&start=[#page]&...
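
The list of URLs to scrape could be generated from this pattern rather than typed by hand. The sketch below is one possible approach; `site_query_urls` is a hypothetical helper, and it assumes that `start` is a result offset that grows by ten per page.

```python
from urllib.parse import urlencode


def site_query_urls(domain, pages, results_per_page=10, tld="com"):
    """Yield one search URL per results page for a site: query."""
    for page in range(pages):
        params = {
            "q": f"site:{domain}",
            # Assumption: "start" is the offset of the first result shown.
            "start": page * results_per_page,
        }
        yield f"https://www.google.{tld}/search?{urlencode(params)}"


search_urls = list(site_query_urls("example.com", pages=3))
for search_url in search_urls:
    print(search_url)
```

The resulting lines can be saved to `urls.txt` and fed to the scraping script.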

  89. /url?q=http://www.simpleagency.it/&sa=U&ved=0ahUKEwizuOnv1YTiAhU9GLkGHQUZAe8QFggUMAA&usg=AOvVaw2SLUR7xqI7OaMms1_bXQ3h
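
The scraped `href` is a redirect link, and the real destination sits in its `q` parameter. A minimal sketch of how it could be recovered with the standard library follows; `target_url` is a hypothetical helper, and the sample link is a shortened version of the one above.

```python
from urllib.parse import parse_qs, urlsplit


def target_url(href):
    """Return the destination stored in the q parameter, or None."""
    query = parse_qs(urlsplit(href).query)
    return query.get("q", [None])[0]


# Shortened version of the redirect link shown above.
href = "/url?q=http://www.simpleagency.it/&sa=U"
print(target_url(href))
```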

  90. ...
    # Download and store the new HTML file (Python 2: uses urllib2).
    os.rename('/home/giancampo/diff-html/new_html.html',
              '/home/giancampo/diff-html/old_html.html')
    url = 'YOUR-HOMEPAGE-URL'
    response = urllib2.urlopen(url)
    webContent = response.read()
    f = open('/home/giancampo/diff-html/new_html.html', 'w')
    f.write(webContent)
    f.close()
    # Convert the HTML files to plain text.
    html1 = open('/home/giancampo/diff-html/old_html.html').read()
    html2 = open('/home/giancampo/diff-html/new_html.html').read()
    old_file = html2text.html2text(html1)
    new_file = html2text.html2text(html2)
    # Write the text into txt files.
    old_text = open('/home/giancampo/diff-html/old_text.txt', 'w')
    new_text = open('/home/giancampo/diff-html/new_text.txt', 'w')
    old_text.write(old_file)
    new_text.write(new_file)
    old_text.close()
    new_text.close()
    ...


  91. ...
    # Send an email only if the script has found differences.
    if filecmp.cmp('/home/giancampo/diff-html/old_text.txt',
                   '/home/giancampo/diff-html/new_text.txt'):
        print('no emails sent')
    else:
        gmail_user = 'YOUR-GMAIL-ADDRESS'
        gmail_password = 'YOUR-GMAIL-PASSWORD'
        sent_from = gmail_user
        to = ['RECIPIENT-EMAIL-ADDRESS']
        subject = 'Changes in the homepage!'
        body = _diff
        email_text = '''From: %s\nTo: %s\nSubject: %s\n\n%s''' % (
            sent_from, ', '.join(to), subject, body)
        server = smtplib.SMTP_SSL('smtp.gmail.com', 465)
        server.ehlo()
        server.login(gmail_user, gmail_password)
        server.sendmail(sent_from, to, email_text)
        server.close()
        print('Email sent!')
    # Close the diff file.
    diff_file.close()
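
The step that produces `_diff` is not shown in the slides. One way such a text could be generated is with the standard `difflib` module, sketched here as an assumption and not as the method used in the talk. The `text_diff` helper and the sample strings are hypothetical.

```python
import difflib


def text_diff(old_text, new_text):
    """Return a unified diff of two texts, or an empty string if equal."""
    diff_lines = difflib.unified_diff(
        old_text.splitlines(keepends=True),
        new_text.splitlines(keepends=True),
        fromfile="old_text.txt",
        tofile="new_text.txt",
    )
    return "".join(diff_lines)


# Hypothetical homepage text before and after a change.
old_version = "Welcome\nOld headline\nContact us\n"
new_version = "Welcome\nNew headline\nContact us\n"

diff_text = text_diff(old_version, new_version)
print(diff_text)
```

An empty result means the two versions match, so the same value can also decide whether an email needs to be sent.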



  110. ...
    if __name__ == '__main__':
        # Initialize client object.
        adwords_client = adwords.AdWordsClient.LoadFromStorage(
            "C:\\Users\\gianl\\AppData\\Local\\Programs\\Python\\Python37\\_i miei script\\adwords-api\\googleads.yaml")
        adwords_client.SetClientCustomerId('ENTER-YOURS-HERE')
        kwds = open("kwds.txt", "r")
        # Python 2 only: force UTF-8 as the default encoding.
        reload(sys)
        sys.setdefaultencoding('utf-8')
        for line in kwds:
            item = line.strip()
            results_file = open("results.txt", "a+")
            main(adwords_client, item, int(AD_GROUP_ID) if AD_GROUP_ID.isdigit() else None)
            print(datetime.datetime.now())
            results_file.close()
            sleep(2)
