
HTTP: Digested (ConFoo 2010)

We're web developers. Almost all the work we do concerns making requests and sending responses over the Web. Yet, how often do we really stop to consider the Web's protocol as part of our daily work? Still, we manipulate that protocol every day, whether we know it or not. Knowing this protocol and how it works can make us better web programmers.

Hypertext Transfer Protocol (RFC 2616), or HTTP, is the protocol of the Web. In this in-depth tutorial, Ben Ramsey will address methods and status codes, success responses, error responses, redirection, content negotiation, caching, and authentication, all with an emphasis on following HTTP semantics in a RESTful fashion. Ben will also demonstrate tools for manipulating and testing HTTP, illustrate the use of the pecl/pecl_http extension for PHP, and discuss browser support for HTTP functionality.

Ben Ramsey

March 12, 2010

Transcript

  1. • A client-server architecture
     • Stateless
     • Cacheable
     • A uniform interface
     • Layered
     • Code on demand
  2. User is redirected to a login page where they are prompted to increase their authorization level.
  3. GET /protected/content/1234 HTTP/1.1
     Host: example.org

     HTTP/1.1 302 Found
     Date: Tue, 05 Nov 2009 17:34:24 GMT
     Server: Apache/2.2.14 (Unix) PHP/5.3.0
     X-Powered-By: PHP/5.3.0
     Location: /login
     Content-Length: 0
     Content-Type: text/html; charset=utf-8
  4. Language extensions and libraries
     • Ruby: net/http
     • Python: httplib (http.client in 3.0)
     • Java: java.net.HttpURLConnection
     • .NET: ???
     • PHP: cURL, fopen wrappers, sockets, pecl/pecl_http, header()
  5. GET
     • You know GET
     • Retrieval of information
     • Think about it as a copy operation
     • Copies a representation of a resource from the server to the client
     • Safe & idempotent
  6. GET /user/ramsey HTTP/1.1
     Host: atom.example.org

     HTTP/1.1 200 OK
     Date: Tue, 22 Sep 2009 17:28:14 GMT
     Server: Apache/2.2.11 (Unix) DAV/2 PHP/5.3.0
     X-Powered-By: PHP/5.3.0
     Content-Length: 594
     Content-Type: application/atom+xml;type=entry

     <?xml version="1.0" encoding="utf-8"?>
     <entry xmlns="http://www.w3.org/2005/Atom"
            xml:base="http://atom.example.org/">
       <title>ramsey</title>
       ...
     </entry>
  7. POST
     • You know POST
     • The body content should be accepted as a new subordinate of the resource
     • Think about it as a paste after operation
     • Transfers a representation of the resource from the client to the server, pasting it after the resource on the server
     • Not safe or idempotent
  8. POST /user HTTP/1.1
     Host: atom.example.org
     Content-Type: application/atom+xml;type=entry
     Content-Length: 474

     <?xml version="1.0" encoding="utf-8"?>
     <entry xmlns="http://www.w3.org/2005/Atom"
            xml:base="http://atom.example.org/">
       <title>ramsey</title>
       ...
     </entry>
  9. HTTP/1.1 201 Created
     Date: Tue, 22 Sep 2009 17:39:06 GMT
     Server: Apache/2.2.11 (Unix) DAV/2 PHP/5.3.0
     X-Powered-By: PHP/5.3.0
     Location: http://atom.example.org/user/ramsey
     Content-Length: 133
     Content-Type: text/html; charset=utf-8

     <div>
       The content was created at the location
       <a href="/user/ramsey">http://atom.example.org/user/ramsey</a>
     </div>
  10. HEAD
      • Identical to GET, except…
      • Returns only the headers, not the body
      • Useful for getting details about a resource representation before retrieving the full representation
      • Safe & idempotent
  11. HEAD /content/1234.mp4 HTTP/1.1
      Host: atom.example.org

      HTTP/1.1 200 OK
      Date: Tue, 22 Sep 2009 17:28:14 GMT
      Server: Apache/2.2.11 (Unix) DAV/2 PHP/5.3.0
      X-Powered-By: PHP/5.3.0
      Content-Length: 12334753
      Content-Type: application/mp4
  12. PUT
      • Opposite of GET
      • Storage of information
      • Think of it as a paste over operation
      • Transfers a representation of a resource from the client to the server and pastes over the resource that is on the server
      • Not safe
      • Idempotent
  13. PUT /user/ramsey HTTP/1.1
      Host: atom.example.org
      Content-Type: application/atom+xml;type=entry
      Content-Length: 594

      <?xml version="1.0" encoding="utf-8"?>
      <entry xmlns="http://www.w3.org/2005/Atom"
             xml:base="http://atom.example.org/">
        <title>ramsey</title>
        ...
      </entry>
  14. DELETE
      • Think of it as a cut operation
      • Requests that the resource identified be cut (removed from public access)
      • Not safe
      • Idempotent
  15. DELETE /content/1234 HTTP/1.1
      Host: example.org

      HTTP/1.1 204 No Content
      Date: Tue, 22 Sep 2009 18:06:37 GMT
      Server: Apache/2.2.11 (Unix) DAV/2 PHP/5.3.0
      X-Powered-By: PHP/5.3.0
      Content-Length: 0
  16. Safe methods
      • GET & HEAD should not take action other than retrieval
      • These are considered safe
      • Allows agents to represent POST, PUT, & DELETE in a special way
  17. Idempotence
      • The side effects of N > 0 identical requests are the same as for a single request
      • GET, HEAD, PUT and DELETE share this property
      • OPTIONS and TRACE are inherently idempotent
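
The safe/idempotent classification from slides 16 and 17 can be written down as a small lookup. This is only a sketch: the function names, and the retry policy in `can_retry_automatically`, are hypothetical illustrations and not something the slides prescribe.

```python
# Method properties as summarized on the slides (RFC 2616, section 9.1).
SAFE_METHODS = frozenset({"GET", "HEAD"})
IDEMPOTENT_METHODS = frozenset(
    {"GET", "HEAD", "PUT", "DELETE", "OPTIONS", "TRACE"}
)


def is_safe(method: str) -> bool:
    """True if the method should take no action other than retrieval."""
    return method.upper() in SAFE_METHODS


def is_idempotent(method: str) -> bool:
    """True if N > 0 identical requests have the side effects of one."""
    return method.upper() in IDEMPOTENT_METHODS


def can_retry_automatically(method: str) -> bool:
    """Hypothetical client policy: only blindly resend idempotent requests."""
    return is_idempotent(method)
```

A client built this way would resend a timed-out PUT or DELETE without asking, but would stop and ask before repeating a POST.
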
  18. HTTP/1.x 201 Created
      Date: Thu, 21 May 2009 23:05:34 GMT
      Server: Apache/2.2.11 (Unix) DAV/2 PHP/5.3.0
      X-Powered-By: PHP/5.3.0
      Content-Length: 120
      Content-Type: text/html
      Location: http://example.org/content/videos/1234

      <html><body><p>Video uploaded! Go <a href="http://example.org/content/videos/1234">here</a> to see it.</p></body></html>
  19. HTTP/1.x 202 Accepted
      Date: Thu, 21 May 2009 23:05:34 GMT
      Server: Apache/2.2.11 (Unix) DAV/2 PHP/5.3.0
      X-Powered-By: PHP/5.3.0
      Content-Length: 137
      Content-Type: text/html
      Location: http://example.org/content/videos/1234/status

      <html><body><p>Video processing! Check <a href="http://example.org/content/videos/1234/status">here</a> for the status.</p></body></html>
  20. • Used when requests are made for ranges of bytes from a resource
      • Determine whether a server supports range requests by checking for the Accept-Ranges header with HEAD
  21. HTTP/1.0 200 OK
      Date: Mon, 05 May 2008 00:33:14 GMT
      Server: Apache/2.0.52 (Red Hat)
      Accept-Ranges: bytes
      Content-Length: 3980
      Content-Type: image/jpeg
  22. HTTP/1.0 206 Partial Content
      Date: Mon, 05 May 2008 00:36:57 GMT
      Server: Apache/2.0.52 (Red Hat)
      Accept-Ranges: bytes
      Content-Length: 1000
      Content-Range: bytes 0-999/3980
      Content-Type: image/jpeg

      {binary data}
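
To show how the numbers in the 206 response fit together, here is a minimal sketch that parses a `Content-Range` value and works out the `Range` header for the next chunk. The names `ContentRange`, `parse_content_range`, and `next_range_header` are made up for this illustration, and only the simple `bytes first-last/total` form shown on the slide is handled.

```python
import re
from typing import NamedTuple, Optional


class ContentRange(NamedTuple):
    first: int
    last: int
    total: int

    @property
    def length(self) -> int:
        # Byte positions are inclusive, so 0-999 covers 1000 bytes.
        return self.last - self.first + 1


_CONTENT_RANGE = re.compile(r"^bytes (\d+)-(\d+)/(\d+)$")


def parse_content_range(value: str) -> ContentRange:
    """Parse a value such as 'bytes 0-999/3980'."""
    match = _CONTENT_RANGE.match(value.strip())
    if match is None:
        raise ValueError(f"unsupported Content-Range: {value!r}")
    first, last, total = (int(group) for group in match.groups())
    if not first <= last < total:
        raise ValueError(f"inconsistent Content-Range: {value!r}")
    return ContentRange(first, last, total)


def next_range_header(current: ContentRange, chunk_size: int) -> Optional[str]:
    """Return the Range request header value for the next chunk, or None."""
    start = current.last + 1
    if start >= current.total:
        return None  # the whole resource has been fetched
    end = min(start + chunk_size, current.total) - 1
    return f"bytes={start}-{end}"
```

For the response on slide 22, the parsed length (1000) matches the `Content-Length` header, and the next request would carry `Range: bytes=1000-1999`.
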
  23. • 303 See Other
      • The response to your request can be found at another URL identified by the Location header
      • The client should make a GET request on that URL
      • The Location is not a substitute for this URL
  24. HTTP/1.1 303 See Other
      Date: Tue, 22 Sep 2009 23:41:33 GMT
      Server: Apache/2.2.11 (Unix) DAV/2 PHP/5.3.0
      X-Powered-By: PHP/5.3.0
      Location: http://example.org/thankyou
      Content-Length: 0
  25. • 307 Temporary Redirect
      • The resource resides temporarily at the URL identified by the Location
      • The Location may change, so don’t update your links
      • If the request is not GET or HEAD, then you must allow the user to confirm the action
  26. • 301 Moved Permanently
      • The resource has moved permanently to the URL indicated by the Location header
      • You should update your links accordingly
      • Great for forcing search engines, etc. to index the new URL instead of this one
  27. • 302 Found
      • The resource has been found at another URL identified by the Location header
      • The new URL might be temporary, so the client should continue to use this URL
      • Redirections SHOULD be confirmed by the user (in practice, browsers don’t respect this)
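
The four redirect codes above differ mainly in three decisions a client has to make. The sketch below encodes them; `RedirectPlan` and `plan_redirect` are hypothetical names. One assumption goes beyond the slides: the confirmation rule for non-GET/HEAD requests is applied to 301 as well as 302 and 307, following the wording of RFC 2616.

```python
from typing import NamedTuple


class RedirectPlan(NamedTuple):
    method: str               # method to use for the follow-up request
    update_links: bool        # should stored links switch to the new URL?
    needs_confirmation: bool  # ask the user before following?


def plan_redirect(status: int, method: str) -> RedirectPlan:
    """Decide how to follow a redirect, per the slide summaries."""
    method = method.upper()
    not_get_or_head = method not in ("GET", "HEAD")
    if status == 303:
        # See Other: always fetch the Location with GET.
        return RedirectPlan("GET", False, False)
    if status == 301:
        # Moved Permanently: update links (confirmation rule is an assumption).
        return RedirectPlan(method, True, not_get_or_head)
    if status in (302, 307):
        # Temporary: keep using the original URL for future requests.
        return RedirectPlan(method, False, not_get_or_head)
    raise ValueError(f"not a redirect status covered here: {status}")
```

As slide 27 notes, real browsers often skip the confirmation step for 302, so this is the behavior the specification asks for rather than what clients commonly do.
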
  28. Remember this?

      User is redirected to a login page where they are prompted to increase their authorization level.

      HTTP/1.1 302 Found
      Date: Tue, 05 Nov 2009 17:34:24 GMT
      Server: Apache/2.2.14 (Unix) PHP/5.3.0
      X-Powered-By: PHP/5.3.0
      Location: /login
      Content-Length: 0
      Content-Type: text/html; charset=utf-8
  29. A more semantic way

      HTTP/1.1 401 Unauthorized
      Date: Tue, 05 Nov 2009 18:31:33 GMT
      Server: Apache/2.2.14 (Unix) PHP/5.3.0
      X-Powered-By: PHP/5.3.0
      WWW-Authenticate: HTML form="login"
      Content-Length: 421
      Content-Type: text/html; charset=utf-8
  30. A more semantic way

      <!doctype html>
      <html>
      <head>
        <title>You must log in</title>
      </head>
      <body>
        <form name="login" method="post" action="/login">
          <label for="username">Username</label>
          <input type="text" name="username" id="username" />
          <label for="password">Password</label>
          <input type="password" name="password" id="password" />
          <input type="submit" value="Login" />
        </form>
      </body>
      </html>
  31. • Doesn’t imply the resource exists at another location
      • Tells clients the resource requires authorization
      • Clearly tells crawlers they can’t access the resource
      • Was originally in HTML5: http://blog.whatwg.org/this-week-in-html-5-episode-14
      • No longer in HTML5, but it works
  32. • Defined in RFC 2109 and RFC 2965
      • Most clients just follow the old Netscape specification
      • To set, the server sends a Set-Cookie response header
      • The client sends the cookie back in the Cookie request header
  33. PHP example: setting a cookie

      PHP:
      setcookie('foo', 'bar', time() + 3600, '/~ramsey/http', 'localhost', false, true);

      PHP (also):
      header('Set-Cookie: foo=bar; expires=Fri, 12-Mar-2010 07:01:21 GMT; path=/~ramsey/http; domain=localhost; httponly');

      HTTP response header:
      Set-Cookie: foo=bar; expires=Fri, 12-Mar-2010 07:01:21 GMT; path=/~ramsey/http; domain=localhost; httponly
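  34. PHP example: reading a cookie

      PHP:
      // Start from a known-clean value, then copy the cookie only if it validates.
      $clean['foo'] = null;
      if (isset($_COOKIE['foo']) && passesValidation($_COOKIE['foo'])) {
          $clean['foo'] = $_COOKIE['foo'];
      }

      HTTP request header:
      Cookie: foo=bar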
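
For readers outside PHP, the same round trip can be sketched with Python's standard `http.cookies` module. The helper names `build_set_cookie` and `read_cookie` are invented for this example, and the `expires` attribute from the slide is left out to keep the output independent of the current time.

```python
from http.cookies import SimpleCookie
from typing import Optional


def build_set_cookie(name: str, value: str, path: str, domain: str,
                     http_only: bool = True) -> str:
    """Return the value of a Set-Cookie response header."""
    cookie = SimpleCookie()
    cookie[name] = value
    cookie[name]["path"] = path
    cookie[name]["domain"] = domain
    if http_only:
        cookie[name]["httponly"] = True
    # OutputString() omits the "Set-Cookie:" header name itself.
    return cookie[name].OutputString()


def read_cookie(header_value: str, name: str) -> Optional[str]:
    """Extract one cookie from the value of a Cookie request header."""
    cookie = SimpleCookie()
    cookie.load(header_value)
    morsel = cookie.get(name)
    return morsel.value if morsel is not None else None
```

As in the PHP slide, a value read this way is still untrusted input and should be validated before use.
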
  35. Cache validation
      • Allows the client to determine whether the copy it has is still fresh
      • If-Match
      • If-Modified-Since
      • If-None-Match
      • If-Range
      • If-Unmodified-Since
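
On the server side, two of these validators could be handled roughly as follows. This is a simplified sketch, not a complete implementation: `conditional_get_status` is a hypothetical helper, header names are assumed to arrive in exactly this capitalization, entity tags are compared as plain strings (no weak-validator handling), and `last_modified` is assumed to be a timezone-aware datetime.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def conditional_get_status(request_headers: dict, etag: str,
                           last_modified: datetime) -> int:
    """Return 304 if the client's cached copy is still valid, else 200."""
    if_none_match = request_headers.get("If-None-Match")
    if if_none_match is not None:
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        return 304 if "*" in candidates or etag in candidates else 200

    if_modified_since = request_headers.get("If-Modified-Since")
    if if_modified_since is not None:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            return 200  # unparseable date: behave as if the header were absent
        if since.tzinfo is None:
            since = since.replace(tzinfo=timezone.utc)
        # HTTP dates have one-second resolution.
        return 304 if last_modified.replace(microsecond=0) <= since else 200

    return 200
```

A 304 Not Modified response tells the client to keep using its cached representation, so no body needs to be sent.
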
  36. Server-driven negotiation
      • The server makes a best guess
      • The client may send headers to help the server guess: Accept, Accept-Language, Accept-Encoding, Accept-Charset, and User-Agent
      • The server can use other factors
      • Since it’s a guess, the server algorithm to determine this could send a different representation on a second identical request
  37. Agent-driven negotiation
      • Requires two requests from the client
      • First request results in a response listing available representations either in the headers or in the entity body
      • Second request is either automatic (client chooses) or manual (user chooses) for the desired representation
      • First response should be a 300 Multiple Choices response
  38. Ben’s suggested negotiation
      • Use a single base URI for all resource requests, e.g. /user/username
      • Use the Accept-* headers to perform server-driven negotiation as best you can
      • Use a 303 See Other response with a Location header to the appropriate representation:
        /user/username.html
        /user/username.xml
        /user/username.json
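
One way the suggested negotiation might look in code is sketched below. Everything here is an assumption made for illustration: the `REPRESENTATIONS` mapping of media types to extensions, the function names, and the fallback to 406 Not Acceptable when nothing matches. Partial wildcards such as `text/*` are not handled.

```python
# Hypothetical mapping from media type to URI suffix; order sets the default.
REPRESENTATIONS = {
    "text/html": ".html",
    "application/xml": ".xml",
    "application/json": ".json",
}


def parse_accept(header: str):
    """Return (media_type, q) pairs, highest q first, ties in header order."""
    entries = []
    for part in header.split(","):
        fields = [field.strip() for field in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q-value as "not acceptable"
        entries.append((media_type, q))
    # sorted() is stable, so equal q-values keep their original order.
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


def negotiate(base_uri: str, accept: str):
    """Return (status, headers) pointing at the best representation."""
    for media_type, q in parse_accept(accept):
        if q <= 0:
            continue
        if media_type == "*/*":
            extension = next(iter(REPRESENTATIONS.values()))
        else:
            extension = REPRESENTATIONS.get(media_type)
        if extension is not None:
            return 303, {"Location": base_uri + extension}
    return 406, {}
```

For example, a request for `/user/ramsey` with `Accept: application/json` would be answered with a 303 whose Location is `/user/ramsey.json`.
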
  39. Resource requires authentication
      • Respond with 401 Unauthorized
      • Include the WWW-Authenticate header
      • WWW-Authenticate must contain the authentication challenge required by the server
      • Multiple challenges or multiple WWW-Authenticate headers may be present
  40. Authorization request
      • The request includes the Authorization header
      • The Authorization header carries the credentials that answer the server’s challenge
  41. Basic authentication

      POST /content/videos HTTP/1.1
      Host: example.org
      Content-Type: video/mp4
      Content-Length: 115910000
      Authorization: Basic bWFkZTp5b3VfbG9vaw==

      {binary video data}
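
The token in the Authorization header above is just the Base64 encoding of `username:password`. A minimal sketch of building and unpacking it follows; `encode_basic` and `decode_basic` are illustrative names, and UTF-8 is assumed for the credential encoding.

```python
import base64


def encode_basic(username: str, password: str) -> str:
    """Build the value of an Authorization header for Basic authentication."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def decode_basic(header_value: str):
    """Split a Basic Authorization header value into (username, password)."""
    scheme, _, token = header_value.partition(" ")
    if scheme.lower() != "basic" or not token:
        raise ValueError("not a Basic Authorization header")
    decoded = base64.b64decode(token.strip()).decode("utf-8")
    # Only the first colon separates the two parts; passwords may contain ':'.
    username, separator, password = decoded.partition(":")
    if not separator:
        raise ValueError("missing ':' between username and password")
    return username, password
```

Decoding the token from slide 41 yields the username `made` and the password `you_look`, which shows that Basic authentication encodes credentials but does not encrypt them.
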