Gettin' Data from the Internet

A tutorial session on using Python to deal with (non-oauth-requiring) REST APIs for SOIAR at the U-M School of Information. Given November 2014.

jczetta

May 14, 2015

Transcript

  1. gettin’ data from the internet (slides) Python + API requests

    SOIAR (Un)Supervised Learning November 25, 2014
  2. Knowing a little bit about making requests to APIs to

    get data is -> easy not very hard! -> useful -> nifty -> a massively reusable skill -> omg all the data why?
  3. Knowing a little bit about making requests to APIs to

    get data is -> easy not very hard! (the hardest part is the documentation) why?
  4. Knowing a little bit about making requests to APIs to

    get data is -> easy not very hard! -> useful why?
  5. Knowing a little bit about making requests to APIs to

    get data is -> easy not very hard! -> useful -> nifty why?
  6. Knowing a little bit about making requests to APIs to

    get data is -> easy not very hard! -> useful -> nifty -> a massively reusable skill why?
  7. Knowing a little bit about making requests to APIs to

    get data is -> easy not very hard! -> useful -> nifty -> a massively reusable skill -> omg all the data why?
  8. • you’ve programmed in Python a little bit (e.g. SI

    502) and have it on your computer (if you don’t, we may be able to help, but this session assumes you can program a little in Python already) here’s what we are assuming
  9. • you’ve programmed in Python a little bit (e.g. SI

    502) • you feel okay about assigning values to variables here’s what we are assuming
  10. • you’ve programmed in Python a little bit (e.g. SI

    502) • you feel okay about assigning values to variables • you know basically what a Python library (aka package, or module) is here’s what we are assuming
  11. • you’ve programmed in Python a little bit (e.g. SI

    502) • you feel okay about assigning values to variables • you know basically what a Python library (aka package, or module) is • you can save and run a Python program here’s what we are assuming
  12. • you’ve programmed in Python a little bit (e.g. SI

    502) • you feel okay about assigning values to variables • you know basically what a Python library (aka package, or module) is • you can save and run a Python program • you’ve seen a Python dictionary before here’s what we are assuming
  13. a thing that holds data. has key:value pairs. keys are unique,

    values can be anything and can be changed, e.g.
        x = {'MSI': 326, 'MHI': 162, 'BSI': 247}
        print(x['MSI'])  # will print 326
        x['MHI'] = 225   # now equals 225
    python dictionaries: quick! refresher
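a slightly longer sketch of the same refresher, since API responses usually arrive as dictionaries nested inside dictionaries. the `counts` values come from the slide above; the `record` dictionary is made up for illustration.

```python
# Values from the refresher slide; the variable name is just for illustration.
counts = {"MSI": 326, "MHI": 162, "BSI": 247}

counts["MHI"] = 225               # change the value stored under an existing key
missing = counts.get("PhD", 0)    # .get() returns a default instead of raising KeyError
total = sum(counts.values())      # add up all the values

# A made-up nested dictionary, shaped like the data an API might return.
record = {"user": {"name": "example", "tags": ["a", "b"]}}
first_tag = record["user"]["tags"][0]  # drill down one key (or index) at a time

print(counts["MSI"], missing, total, first_tag)
```

reading nested data is just repeated lookups: a key for each dictionary, an index for each list.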
  14. • you know what an API is • you program

    in Python often/recently (as long as you’re OK with jumping back in now) here’s what we are NOT assuming
  15. Application Programming Interface OR, more colloquially, a set of pre-described

    ways of doing things with code for a specific service (e.g. Flickr) so you can get some data or make stuff happen what is an API?
  16. • endpoints (the location of the data on the web)

    • documentation (or so we hope…) • other things we aren’t going to talk about today web APIs have several things
  17. 1. If you use the right Python library/libraries, 2. and

    if you know the right URL, 3. and if you know what parts of the URL to edit, 4. and if you feel OK reading documentation... here’s what you NEED to know
  18. … then you can get some structured data! (gosh I

    love me some structured data.) here’s what you NEED to know
  19. you’ll need two things: • if you have never used

    pip, download and save the file found here to your desktop: http://bit.ly/getpip-download • go to https://8tracks.com/developers/new and get a developer key (you’ll need an account). copy and paste it into a text file. we’re going to get data from 8tracks
  20. except it will say your name. the key is longer

    than that. copy it so you can use it later! when you go to that link
  21. pip is a program that makes it easy to install

    Python modules, and there’s one we need. that file you downloaded (get-pip.py) is a program that will make pip easy to install. now, installing pip
  22. MAC: (in your Terminal window)
        cd ~/Desktop
        sudo python get-pip.py

    WINDOWS (in whatever command prompt you use): run the Python program get-pip.py from your Desktop however you usually do (sorry, no screenshot) installing pip
  23. MAC: 1. open your Terminal 2. type: sudo pip install requests

    (you will then probably need to type your computer password. you will not be able to see it as you type.) using pip to install a module
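one way to check that the install worked is to ask Python whether it can find the module. this is a minimal sketch using only the standard library; the helper name `is_installed` is made up for illustration.

```python
import importlib.util


def is_installed(module_name):
    """Return True if Python can find a top-level module with this name."""
    return importlib.util.find_spec(module_name) is not None


# "json" ships with Python, so this should be True on any install.
print(is_installed("json"))

# After a successful `pip install requests`, this should print True too.
print(is_installed("requests"))
```

if the second line prints False, pip probably installed the module for a different copy of Python than the one you are running.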
  24. http://docs.python-requests.org/ it’s a Python module (“library”, call it whatever) that

    basically makes it really easy to make API requests. great tools! for humans to use! pretty sweet. requests is great
  25. get ready to open your text editor and start a

    Python program. it’s time to get some data from 8tracks. ….after we talk about APIs a little more. NOW YOU’RE ALL SET
  26. an API request URL is a URL that describes where

    data is. making an API request means using that URL to get data of some kind. anatomy of an API request
  27. an API request is made up of a few parts.

    [the domain/base url] - this you just get, e.g. https://8tracks.com/developers/api_v3 [parameters] - api documentation explains what you need here, what they should look like, and whether they’re optional anatomy of an API request
  28. also, [separator] - between the domain/base url and the parameters,

    you need a ‘?’ character (usually). url parameters and parameter values: - depend on what data you want to get! - the documentation will tell you what your options are. - many of them are really simple. more on the anatomy of an API request
  29. so again, the pieces of a request URI: baseurl ->

    ‘where it’s ALL stored’ ? -> to separate the base from parameters parameters -> ‘specifically, I want this stuff’ anatomy of an API request
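the three pieces above (base url, `?`, parameters) can be glued together by hand. a minimal sketch using the standard library; the base URL and parameter names are hypothetical, not taken from the 8tracks documentation.

```python
from urllib.parse import urlencode

# Hypothetical base URL and parameters, for illustration only.
base_url = "https://api.example.com/mixes"
params = {"tag": "jazz", "per_page": 5}

# urlencode turns the dictionary into key=value pairs joined by '&'.
query_string = urlencode(params)

# The '?' separates the base URL from the parameters.
request_url = base_url + "?" + query_string
print(request_url)  # https://api.example.com/mixes?tag=jazz&per_page=5
```

a library like requests does this gluing for you, but it helps to know what the finished URL looks like when you are reading documentation or pasting one into a browser.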
  30. when you put an API request URL in a web

    browser, you’re basically saying: ‘this URL describes a place on the internet with data’, and ‘I want to see that data in plain text in my web browser’ anatomy of an API request
  31. when you use an API request URL in a program,

    you’re basically saying: ‘this URL describes a place on the internet with data’, and ‘I want to GET that data so I can do stuff with it using Python, on my computer’ anatomy of an API request
  32. The APIs we’re talking about are called REST APIs. That

    stands for Representational State Transfer. This is a big(ish) subject, but basically, we care about one thing related to this right now. RESTful API services
  33. API endpoints are websites for programs. No buttons or pretty

    things (the way most people talk about pretty things) because they’re not for people to consume directly. They’re for programs. RESTful API services
  34. 1. build a string that is the correct URL which

    specifies where the data you want is 2. use a Python library to say ‘hey, go get the data that’s <at this url place>’ 3. use a Python library to say ‘hey, let’s make this stuff into a Python thingy so I can easily get data in my program’ process of MAKING A REQUEST
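step 3 is the least obvious one, so here is a minimal sketch of it on its own, assuming the API sends back JSON text. the response text below is made up; a real response would come from the request in step 2.

```python
import json

# Made-up response text standing in for what an API might send back.
response_text = '{"mixes": [{"name": "rainy day", "plays": 120}, {"name": "study", "plays": 87}]}'

# json.loads turns JSON text into ordinary Python dictionaries and lists.
data = json.loads(response_text)

# Now it is just dictionary and list access.
names = [mix["name"] for mix in data["mixes"]]
print(names)
```

the requests library wraps this same conversion in a `.json()` method on the response object.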
  35. # basically

        import requests
        url = 'http:// …'  # put baseurl here
        params = {}  # add stuff to dict
        resp = requests.get(url, params=params)
        python_resp = resp.json()  # dict
    let’s write some code now!
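a slightly fuller sketch of that skeleton, assuming requests is installed. the base URL and parameters are hypothetical, and `get_data` is a made-up helper name. the first part only builds the request so you can preview the final URL; nothing goes over the network until `get_data` is called.

```python
import requests

# Hypothetical base URL and parameters, for illustration only.
base_url = "https://api.example.com/mixes"
params = {"tag": "jazz", "per_page": 5}

# Build the request without sending it, to preview the URL requests would use.
prepared = requests.Request("GET", base_url, params=params).prepare()
print(prepared.url)


def get_data(url, params):
    """Send a GET request and return the JSON response as Python data."""
    resp = requests.get(url, params=params, timeout=10)
    resp.raise_for_status()  # raise an error on 4xx/5xx responses
    return resp.json()
```

calling `get_data(base_url, params)` against a real API would return a dictionary (or list) you can dig into like any other Python data.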