15-437 Google Custom Search

ThierrySans
November 03, 2013

Transcript

  1. Our web application today

    Build a web application for searching SML definitions from the SML Basis Standard Library.
    ✓ We will use Google Custom Search to build this app ... because we know you like SML!
  2. The Google Custom Search API

    ➡ Use the Google Custom Search API to search over a website or a collection of websites and to embed the results in your web application.
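
A search request to this API boils down to one URL carrying three values: an API key, a custom search engine id and the query. The sketch below only assembles that URL, assuming the endpoint and the `key`, `cx` and `q` parameter names used in the client-side example of this deck; `build_search_url` and the placeholder values are hypothetical names chosen for illustration.

```python
from urllib.parse import urlencode

# Endpoint as it appears in the client-side example of this deck.
ENDPOINT = "https://www.googleapis.com/customsearch/v1"


def build_search_url(api_key, cx, query):
    """Return the request URL for one query (hypothetical helper)."""
    # urlencode percent-encodes each value, so spaces or '&' in the
    # query cannot break the URL.
    params = {"key": api_key, "cx": cx, "q": query}
    return ENDPOINT + "?" + urlencode(params)


# Placeholder credentials, not real ones.
url = build_search_url("YOUR_API_KEY", "YOUR_CX", "List.map fold")
```

Encoding the query in one place keeps the three approaches that follow (snippet, JavaScript, Python) pointed at the same request shape.
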
  3. Three ways to use Google Custom Search

    • Google snippet (client side)
    • Google JavaScript API (client side)
    • Google Python API (server side)
  4. Step 4 - Paste the code snippet in your page

    search/templates/search/index.html

        ...
        <body>
        <script>
          (function() {
            // Your Custom Search Identifier (aka CX or CSEid)
            var cx = '009372546338520416759:-rdtwkgor4i';
            var gcse = document.createElement('script');
            gcse.type = 'text/javascript';
            gcse.async = true;
            gcse.src = (document.location.protocol == 'https:' ? 'https:' : 'http:') +
                '//www.google.com/cse/cse.js?cx=' + cx;
            var s = document.getElementsByTagName('script')[0];
            s.parentNode.insertBefore(gcse, s);
          })();
        </script>
        <!-- The element where the results are shown -->
        <gcse:search></gcse:search>
        </body></html>
  5. Advantages and Limitations

    ✓ Very easy to use
    ๏ The results are shown in a dedicated DOM that is not easy to customize
  6. Solution

    ➡ On the client side
    1. Query Google search using the Google JavaScript API
    2. Process the results using JavaScript
    3. Show the results in the page
  7. Search Request

    search2/static/js/script.js

        function client() {
          $('#result').html('<script src="https://www.googleapis.com/customsearch/v1' +
            '?key=AIzaSyDBrtHVzXjrvKus_qGpPbTm9rJppIhjxvM' +   // your API key
            '&cx=004982958264606934300:2o52rkqdfaw' +          // your custom search engine id
            '&q=' + encodeURIComponent($('#inp').val()) +      // your search query (URL-encoded)
            '&callback=hndlr"></script>');                     // your callback method
        }

        function hndlr(response) { .... }
  8. The callback method

    ➡ The callback method (called hndlr in the example) defines what to do with the search results
  9. Example

    search2/static/js/script.js

        function hndlr(response) {
          var page = "";
          // For each search result ...
          for (var i = 0; i < response.items.length; i++) {
            var item = response.items[i];
            var link = item.link;
            // ... display the search result, but change the target URL so that
            // the link opens in the current page rather than in a new one
            page += "<br/><div><a onclick=\"go('" + link + "');\" href='#'>"
                  + item.htmlTitle + "</a><br/>"
                  + item.htmlSnippet
                  + "<br/><a onclick=\"go('" + link + "');\" href='#'>"
                  + link + "</a></div><br/>";
          }
          $('#result').html(page);
          $('#result').show();
        }
  10. Problem

    ๏ Nothing happens when you click on the link
    ➡ No bug shown except ... This is a Cross Domain error
  11. Cross Domain restriction (aka Same Origin Policy)

    “The policy permits scripts running on pages originating from the same site to access each other's methods and properties with no specific restrictions, but prevents access to most methods and properties across pages on different sites.” (Wikipedia)
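
To make "same site" concrete: browsers compare the origin of two URLs, which is the combination of scheme, host and port. The sketch below is only an illustration of that comparison written in Python, not what a browser actually runs; `origin`, `same_origin`, `DEFAULT_PORTS` and the example URLs are hypothetical.

```python
from urllib.parse import urlsplit

# Ports implied when a URL does not spell one out.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return the (scheme, host, port) triple that identifies an origin."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    port = parts.port or DEFAULT_PORTS.get(scheme)
    return (scheme, (parts.hostname or "").lower(), port)


def same_origin(a, b):
    """True when both URLs share scheme, host and port."""
    return origin(a) == origin(b)


# Our page and our own script share an origin; a search result page does not.
local = same_origin("http://localhost:8000/search/",
                    "http://localhost:8000/static/js/script.js")
remote = same_origin("http://localhost:8000/search/",
                     "https://www.google.com/")
```

Here `local` is true and `remote` is false, which is the situation behind the error on the previous slide: the search page and the result page live on different origins.
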
  12. Solution - use an iframe

    • An <iframe> can contain another HTML document, possibly coming from another domain
    ➡ Like a webpage inside a webpage
    ✓ An iframe acts like a fence between the two documents
    [Diagram: an outer <html> document containing an <iframe>, which holds a separate inner <html> document]
    JavaScript code from the inner document cannot access the resources of the main document, and vice versa, when the two come from different domains
  13. Solution to our problem

    search2/static/js/script.js

        function hndlr(response) {
          var page = "";
          for (var i = 0; i < response.items.length; i++) {
            var item = response.items[i];
            var link = item.link;
            page += "<br/><div><a onclick=\"go('" + link + "');\" href='#'>"
                  + item.htmlTitle + "</a><br/>"
                  + item.htmlSnippet
                  + "<br/><a onclick=\"go('" + link + "');\" href='#'>"
                  + link + "</a></div><br/>";
          }
          $('#result').html(page);
          $('#result').show();
        }
  14. Advantages and Limitations

    ✓ Using an iframe is secure
    ๏ We cannot modify the page contained in the iframe
    ➡ We want to define new CSS and JavaScript controls for the result document
  15. Solution

    ➡ On the server side
    1. Query Google search using the Google Python API
    2. Process the results using Python
    3. Return the results to the client
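
Step 2 above can be sketched as a small function that turns a search response into an HTML fragment on the server. It assumes the response is a dict whose `items` carry the `link`, `htmlTitle` and `htmlSnippet` fields used in the JavaScript example, and that the two `html*` fields already contain markup and are therefore inserted as-is; `render_results` and the sample data are hypothetical.

```python
from html import escape


def render_results(response):
    """Build an HTML fragment from a search response dict (hypothetical helper)."""
    parts = []
    # Treat a response without "items" as an empty result list.
    for item in response.get("items", []):
        # Escape the link because it is placed inside an attribute value.
        link = escape(item["link"], quote=True)
        parts.append(
            '<div><a href="{0}">{1}</a><br/>{2}<br/>'
            '<a href="{0}">{0}</a></div>'.format(
                link, item["htmlTitle"], item["htmlSnippet"]))
    return "\n".join(parts)


# Made-up sample data shaped like the fields used in this deck.
sample = {"items": [{"link": "http://example.com/?a=1&b=2",
                     "htmlTitle": "<b>List.map</b>",
                     "htmlSnippet": "val map : ..."}]}
fragment = render_results(sample)
```

Because the fragment is produced on the server, the page that receives it is free to style it with its own CSS and JavaScript.
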
  16. Useful Libraries

    The Python Google API
    http://code.google.com/p/google-api-python-client/downloads/list
    ➡ Google APIs Client Library for Python
    $ pip install --upgrade google-api-python-client

    BeautifulSoup
    http://www.crummy.com/software/BeautifulSoup/
    ➡ HTML5 parser (and normalizer)
    $ pip install BeautifulSoup
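  17. Example

    search3/views.py

        # Python 2 code (urllib2 and the BeautifulSoup 3 package)
        from apiclient.discovery import build
        from BeautifulSoup import BeautifulSoup
        from django.http import HttpResponse
        from django.views.decorators.csrf import csrf_exempt
        import urllib2
        import re

        @csrf_exempt
        def find(request):
            try:
                # Query the custom search engine with the posted search key
                service = build("customsearch", "v1",
                                developerKey="AIzaSyDBrtHVzXjrvKus_qGpPbTm9rJppIhjxvM")
                res = service.cse().list(q=request.POST['key'],
                                         cx='004982958264606934300:2o52rkqdfaw').execute()
                # Fetch the first result and keep only its <body>
                page = urllib2.urlopen(res['items'][0]['link'])
                soup = BeautifulSoup(page).body
            except Exception:
                return HttpResponse("error")
            else:
                # Send the body back as a string
                return HttpResponse(str(soup))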
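
The view above relies on the `try`/`except` to cover a search that returns nothing, since `res['items'][0]` then fails. A slightly more explicit variant is sketched below, assuming the response is a plain dict and that `items` may be missing or empty when nothing matches; `first_result_link` is a hypothetical helper, not part of the Google client library.

```python
def first_result_link(res):
    """Return the link of the first search result, or None if there is none."""
    # Assumption: "items" may be absent or empty when the search finds nothing.
    items = res.get("items") or []
    if not items:
        return None
    return items[0].get("link")


# Made-up responses for illustration.
found = first_result_link({"items": [{"link": "http://example.com/list.html"}]})
missing = first_result_link({})
```

With such a helper the view could return a dedicated "no results" message instead of the generic "error" response.
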