Slide 1

Slide 1 text

gettin’ data from the internet (slides) Python + API requests SOIAR (Un)Supervised Learning November 25, 2014

Slide 2

Slide 2 text

No content

Slide 3

Slide 3 text

Knowing a little bit about making requests to APIs to get data is -> easy not very hard! -> useful -> nifty -> a massively reusable skill -> omg all the data why?

Slide 4

Slide 4 text

Knowing a little bit about making requests to APIs to get data is -> easy not very hard! (the hardest part is the documentation) why?

Slide 5

Slide 5 text

Knowing a little bit about making requests to APIs to get data is -> easy not very hard! -> useful why?

Slide 6

Slide 6 text

Knowing a little bit about making requests to APIs to get data is -> easy not very hard! -> useful -> nifty why?

Slide 7

Slide 7 text

Knowing a little bit about making requests to APIs to get data is -> easy not very hard! -> useful -> nifty -> a massively reusable skill why?

Slide 8

Slide 8 text

Knowing a little bit about making requests to APIs to get data is -> easy not very hard! -> useful -> nifty -> a massively reusable skill -> omg all the data why?

Slide 9

Slide 9 text

● you’ve programmed in Python a little bit (e.g. SI 502) and have it on your computer (if you don’t, we may be able to help, but this session assumes you can program a little in Python already) here’s what we are assuming

Slide 10

Slide 10 text

● you’ve programmed in Python a little bit (e.g. SI 502) ● you feel okay about assigning values to variables here’s what we are assuming

Slide 11

Slide 11 text

● you’ve programmed in Python a little bit (e.g. SI 502) ● you feel okay about assigning values to variables ● you know basically what a Python library (aka package, or module) is here’s what we are assuming

Slide 12

Slide 12 text

● you’ve programmed in Python a little bit (e.g. SI 502) ● you feel okay about assigning values to variables ● you know basically what a Python library (aka package, or module) is ● you can save and run a Python program here’s what we are assuming

Slide 13

Slide 13 text

● you’ve programmed in Python a little bit (e.g. SI 502) ● you feel okay about assigning values to variables ● you know basically what a Python library (aka package, or module) is ● you can save and run a Python program ● you’ve seen a Python dictionary before here’s what we are assuming

Slide 14

Slide 14 text

python dictionaries: quick! refresher
a thing that holds data. has key:value pairs
keys unique, values whatever, can be changed
e.g.
x = {'MSI': 326, 'MHI': 162, 'BSI': 247}
print(x['MSI'])  # will print 326
x['MHI'] = 225  # now equals 225
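A few more dictionary moves that tend to come up once you have data in hand. This is just a sketch: the "PhD" entry and its number are made up for illustration, and the variable names are not from the slides.

```python
# Same kind of dictionary as in the refresher
counts = {"MSI": 326, "MHI": 162, "BSI": 247}

counts["MHI"] = 225   # change the value stored under an existing key
counts["PhD"] = 50    # add a new key:value pair (made-up number)

has_bsi = "BSI" in counts        # check whether a key exists
missing = counts.get("XYZ", 0)   # .get() returns a default instead of raising KeyError
total = sum(counts.values())     # add up all the values

# Loop over keys in alphabetical order and print each pair
for program in sorted(counts):
    print(program, counts[program])
```

Using .get() with a default is handy when you are not sure a key will be there.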

Slide 15

Slide 15 text

● you know what an API is ● you program in Python often/recently (as long as you’re OK with jumping back in now) here’s what we are NOT assuming

Slide 16

Slide 16 text

Application Programming Interface OR, more colloquially, a set of pre-described ways of doing things with code for a specific service (e.g. Flickr) so you can get some data or make stuff happen what is an API?

Slide 17

Slide 17 text

in other words... you don’t have to be up all night to get data.

Slide 18

Slide 18 text

web APIs have several things
● endpoints (the location of the data on the web)
● documentation (or so we hope…)
● other things we aren’t going to talk about today

Slide 19

Slide 19 text

here’s what you NEED to know
1. If you use the right Python library/libraries,
2. and if you know the right URL,
3. and if you know what parts of the URL to edit,
4. and if you feel OK reading documentation...

Slide 20

Slide 20 text

… then you can get some structured data! (gosh I love me some structured data.) here’s what you NEED to know

Slide 21

Slide 21 text

we’re going to get data from 8tracks
you’ll need two things:
● if you have never used pip, download and save the file found here, to your desktop: http://bit.ly/getpip-download
● go to https://8tracks.com/developers/new and get a developer key (you’ll need an account). copy and paste it into a text file.

Slide 22

Slide 22 text

register here for a key... when you go to that link

Slide 23

Slide 23 text

except it will say your name. the key is longer than that. copy it so you can use it later! when you go to that link

Slide 24

Slide 24 text

pip is a program that makes it easy to install Python modules, and there’s one we need. that file you downloaded (get-pip.py) is a program that will make pip easy to install. now, installing pip

Slide 25

Slide 25 text

installing pip
MAC: (in your Terminal window)
cd ~/Desktop
sudo python get-pip.py
WINDOWS (in whatever command prompt you use):
run the Python program get-pip.py from your Desktop however you do (sorry no screenshot)

Slide 26

Slide 26 text

using pip to install a module
MAC:
1. open your Terminal
2. type: sudo pip install requests
(you will then probably need to type your computer password. you will not be able to see it as you type.)

Slide 27

Slide 27 text

using pip to install a module
WINDOWS:
- open your command prompt
- type: /C/Python27/Scripts/pip install requests
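One way to check that the install worked is to ask Python whether it can import the module. A minimal sketch; the helper name is_installed is made up for this example and is not part of pip or requests.

```python
import importlib


def is_installed(module_name):
    """Return True if Python can import the named module, False otherwise."""
    try:
        importlib.import_module(module_name)
    except ImportError:
        return False
    return True


print(is_installed("json"))      # json ships with Python
print(is_installed("requests"))  # depends on whether pip installed it for this Python
```

If the second line prints False, pip may have installed requests for a different copy of Python than the one you are running.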

Slide 28

Slide 28 text

http://docs.python-requests.org/ it’s a Python module (“library”, call it whatever) that basically makes it really easy to make API requests. great tools! for humans to use! pretty sweet. requests is great

Slide 29

Slide 29 text

get ready to open your text editor and start a Python program. it’s time to get some data from 8tracks. ….after we talk about APIs a little more. NOW YOU’RE ALL SET

Slide 30

Slide 30 text

DUN DUN DUN (yay)

Slide 31

Slide 31 text

an API request URL is a URL that describes where data is. making an API request means using that URL to get data of some kind. anatomy of an API request

Slide 32

Slide 32 text

anatomy of an API request
an API request is made up of a few parts.
[the domain/base url] - this you just get, e.g. https://8tracks.com/developers/api_v3
[parameters] - api documentation explains what you need here, what they should look like, and whether they’re optional

Slide 33

Slide 33 text

more on the anatomy of an API request
also, [separator] - between the domain/base url and the parameters, you need a ‘?’ character, usually
url parameters and parameter values
- depend on what data you want to get!
- the documentation will tell you what your options are.
- many of them are really simple.

Slide 34

Slide 34 text

anatomy of an API request
so again, the pieces of a request URI:
baseurl -> ‘where it’s ALL stored’
? -> to separate the base from parameters
parameters -> ‘specifically, I want this stuff’
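Here is a small sketch of those pieces using only Python's standard library. The base URL and the parameter names (format, per_page) are hypothetical placeholders, not taken from any real API's documentation.

```python
try:
    from urllib.parse import urlencode, urlsplit, parse_qs  # Python 3
except ImportError:
    from urllib import urlencode                            # Python 2
    from urlparse import urlsplit, parse_qs

# Hypothetical base URL and parameters, for illustration only
base_url = "https://api.example.com/things"
params = [("format", "json"), ("per_page", 5)]

# baseurl + '?' + parameters
request_url = base_url + "?" + urlencode(params)
print(request_url)

# Going the other way: split a request URL back into its pieces
pieces = urlsplit(request_url)
print(pieces.netloc, pieces.path)  # where it's ALL stored
print(parse_qs(pieces.query))      # specifically, I want this stuff
```

urlencode also takes care of characters that are not allowed in a URL as-is, such as spaces.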

Slide 35

Slide 35 text

when you put an API request URL in a web browser, you’re basically saying: ‘this URL describes a place on the internet with data’, and ‘I want to see that data in plain text in my web browser’ anatomy of an API request

Slide 36

Slide 36 text

when you use an API request URL in a program, you’re basically saying: ‘this URL describes a place on the internet with data’, and ‘I want to GET that data so I can do stuff with it using Python, on my computer’ anatomy of an API request

Slide 37

Slide 37 text

The APIs we’re talking about are called REST APIs. That stands for Representational State Transfer. This is a big(ish) subject, but basically, we care about 1 thing related to this right now. RESTful API services

Slide 38

Slide 38 text

API endpoints are websites for programs. No buttons or pretty things (the way most people talk about pretty things) because they’re not for people to consume directly. They’re for programs. RESTful API services

Slide 39

Slide 39 text

process of MAKING A REQUEST
1. build a string that is the correct URL, which specifies where the data you want is
2. use a Python library to say ‘hey, go get the data that’s at this URL’
3. use a Python library to say ‘hey, let’s make this stuff into a Python thingy so I can easily get data in my program’
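The three steps can be sketched roughly like this. Assumptions: the function name get_data is made up, and the sample text below is invented to show step 3 without a network connection; it is not the real response format of 8tracks or any other service.

```python
import json


def get_data(url, params):
    """Steps 1-3 against a live API (needs the requests library and a network connection)."""
    import requests  # third-party; imported here so the rest of the sketch runs without it
    resp = requests.get(url, params=params)  # steps 1 and 2: requests builds the full URL and fetches it
    return resp.json()                       # step 3: JSON text becomes a Python dictionary


# Step 3 on its own, using invented JSON text in place of a real response
sample_text = '{"mixes": [{"name": "study jams"}, {"name": "road trip"}]}'
data = json.loads(sample_text)               # now a Python dictionary
names = [mix["name"] for mix in data["mixes"]]
print(names)
```

Once the response is a dictionary, getting at the data is ordinary Python: keys, lists, and loops.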

Slide 40

Slide 40 text

let’s write some code now!
# basically
import requests
url = "http:// …"  # put baseurl here, as a quoted string
params = {}  # add stuff to dict
resp = requests.get(url, params=params)
python_resp = resp.json()  # dict