HTTP: Digested (ConFoo 2010)

We're web developers. Almost all the work we do concerns making requests and sending responses over the Web. Yet how often do we stop to consider the Web's protocol as part of our daily work? We manipulate that protocol every day, whether we know it or not, and knowing how it works can make us better web programmers.

Hypertext Transfer Protocol (RFC 2616), or HTTP, is the protocol of the Web. In this in-depth tutorial, Ben Ramsey will address methods and status codes, success responses, error responses, redirection, content negotiation, caching, and authentication, all with an emphasis on following HTTP semantics in a RESTful fashion. Ben will also demonstrate tools for manipulating and testing HTTP, illustrate the use of the pecl/pecl_http extension for PHP, and discuss browser support for HTTP functionality.

Ben Ramsey

March 12, 2010

Transcript

  1. HTTP: Digested Part 1 Ben Ramsey 12 March 2010

  2. Hi, I’m Ben …

  3. Why HTTP?

  4. Because you are a web developer.

  5. HTTP is the Web.

  6. • HTTP Basics • Advanced HTTP

  7. HTTP Basics

  8. Some properties of HTTP …

  9. • A client-server architecture
    • Atomic
    • Cacheable
    • A uniform interface
    • Layered
    • Code on demand
  10. Now, what does that sound like?

  11. REST!

  12. First, a word about semantics.

  13. 1 User requests a page above their authorization level.

  14. 2 User is redirected to a login page where they are prompted to increase their authorization level.
  15. GET /protected/content/1234 HTTP/1.1
    Host: example.org

    HTTP/1.1 302 Found
    Date: Thu, 05 Nov 2009 17:34:24 GMT
    Server: Apache/2.2.14 (Unix) PHP/5.3.0
    X-Powered-By: PHP/5.3.0
    Location: /login
    Content-Length: 0
    Content-Type: text/html; charset=utf-8
  16. The resource requested is found at another location?

  17. The semantics are all wrong.

  18. Tools

  19. Language extensions and libraries
    • Ruby: net/http
    • Python: httplib (http.client in 3.0)
    • Java: java.net.HttpURLConnection
    • .NET: ???
    • PHP: cURL, fopen wrappers, sockets, pecl/pecl_http, header()
  20. HTTP Inspectors: Firebug

  21. HTTP Inspectors: Chrome

  22. HTTP Inspectors: Charles

  23. Telnet

  24. HTTP Methods

  25. GET
    • You know GET
    • Retrieval of information
    • Think about it as a copy operation
    • Copies a representation of a resource from the server to the client
    • Safe & idempotent
  26. GET /user/ramsey HTTP/1.1
    Host: atom.example.org

    HTTP/1.1 200 OK
    Date: Tue, 22 Sep 2009 17:28:14 GMT
    Server: Apache/2.2.11 (Unix) DAV/2 PHP/5.3.0
    X-Powered-By: PHP/5.3.0
    Content-Length: 594
    Content-Type: application/atom+xml;type=entry

    <?xml version="1.0" encoding="utf-8"?>
    <entry xmlns="http://www.w3.org/2005/Atom"
      xml:base="http://atom.example.org/">
      <title>ramsey</title>
      ...
    </entry>
  27. POST
    • You know POST
    • The body content should be accepted as a new subordinate of the resource
    • Think about it as a paste after operation
    • Transfers a representation of the resource from the client to the server, pasting it after the resource on the server
    • Not safe or idempotent
  28. POST /user HTTP/1.1
    Host: atom.example.org
    Content-Type: application/atom+xml;type=entry
    Content-Length: 474

    <?xml version="1.0" encoding="utf-8"?>
    <entry xmlns="http://www.w3.org/2005/Atom"
      xml:base="http://atom.example.org/">
      <title>ramsey</title>
      ...
    </entry>
  29. HTTP/1.1 201 Created
    Date: Tue, 22 Sep 2009 17:39:06 GMT
    Server: Apache/2.2.11 (Unix) DAV/2 PHP/5.3.0
    X-Powered-By: PHP/5.3.0
    Location: http://atom.example.org/user/ramsey
    Content-Length: 133
    Content-Type: text/html; charset=utf-8

    <div>
      The content was created at the location
      <a href="/user/ramsey">http://atom.example.org/user/ramsey</a>
    </div>
  30. HEAD
    • Identical to GET, except…
    • Returns only the headers, not the body
    • Useful for getting details about a resource representation before retrieving the full representation
    • Safe & idempotent
  31. HEAD /content/1234.mp4 HTTP/1.1
    Host: atom.example.org

    HTTP/1.1 200 OK
    Date: Tue, 22 Sep 2009 17:28:14 GMT
    Server: Apache/2.2.11 (Unix) DAV/2 PHP/5.3.0
    X-Powered-By: PHP/5.3.0
    Content-Length: 12334753
    Content-Type: application/mp4
  32. PUT
    • Opposite of GET
    • Storage of information
    • Think of it as a paste over operation
    • Transfers a representation of a resource from the client to the server and pastes over the resource that is on the server
    • Not safe
    • Idempotent
  33. PUT /user/ramsey HTTP/1.1
    Host: atom.example.org
    Content-Type: application/atom+xml;type=entry
    Content-Length: 594

    <?xml version="1.0" encoding="utf-8"?>
    <entry xmlns="http://www.w3.org/2005/Atom"
      xml:base="http://atom.example.org/">
      <title>ramsey</title>
      ...
    </entry>
  34. DELETE
    • Think of it as a cut operation
    • Requests that the resource identified be cut (removed from public access)
    • Not safe
    • Idempotent
  35. DELETE /content/1234 HTTP/1.1
    Host: example.org

    HTTP/1.1 204 No Content
    Date: Tue, 22 Sep 2009 18:06:37 GMT
    Server: Apache/2.2.11 (Unix) DAV/2 PHP/5.3.0
    X-Powered-By: PHP/5.3.0
    Content-Length: 0
  36. What the hell are safe & idempotent methods?

  37. Safe methods
    • GET & HEAD should not take action other than retrieval
    • These are considered safe
    • Allows agents to represent POST, PUT, & DELETE in a special way
  38. Idempotence
    • The side-effects of N > 0 identical requests are the same as for a single request
    • GET, HEAD, PUT and DELETE share this property
    • OPTIONS and TRACE are inherently idempotent
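
The copy, paste-after, paste-over, and cut analogies from the method slides can be sketched with a toy in-memory store. This is only an illustration under stated assumptions: `ResourceStore` and its numbered child paths are hypothetical names, not part of the talk or of any library.

```python
import itertools


class ResourceStore:
    """Toy in-memory server state keyed by URL path (hypothetical)."""

    def __init__(self):
        self.resources = {}
        self._ids = itertools.count(1)

    def get(self, path):
        # Copy: reads a representation and never changes server state (safe).
        return self.resources.get(path)

    def put(self, path, representation):
        # Paste over: repeating the same request leaves the same state.
        self.resources[path] = representation

    def post(self, collection, representation):
        # Paste after: every request creates a new subordinate resource.
        path = f"{collection}/{next(self._ids)}"
        self.resources[path] = representation
        return path

    def delete(self, path):
        # Cut: deleting twice leaves the same state as deleting once.
        self.resources.pop(path, None)


store = ResourceStore()
store.put("/user/ramsey", "<entry>ramsey</entry>")
store.put("/user/ramsey", "<entry>ramsey</entry>")
after_put = len(store.resources)   # still one resource after two PUTs

store.post("/user", "<entry>guest</entry>")
store.post("/user", "<entry>guest</entry>")
after_post = len(store.resources)  # each POST added another resource
```

Repeating the PUT or the DELETE changes nothing further, while repeating the POST keeps adding resources, which is the difference the idempotence slide describes.
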
  39. HTTP Status Codes

  40. • Informational (1xx)
    • Successful (2xx)
    • Redirection (3xx)
    • Client error (4xx)
    • Server error (5xx)
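
As a small sketch of the five classes, the first digit of a status code is enough to classify it. The helper names `status_class` and `describe` are made up for this example; `HTTPStatus` comes from Python's standard library.

```python
from http import HTTPStatus

STATUS_CLASSES = {
    1: "Informational",
    2: "Successful",
    3: "Redirection",
    4: "Client error",
    5: "Server error",
}


def status_class(code):
    """Return the class name for a three-digit HTTP status code."""
    return STATUS_CLASSES[code // 100]


def describe(code):
    """Combine the code, its standard reason phrase, and its class."""
    return f"{code} {HTTPStatus(code).phrase} ({status_class(code)})"
```

For instance, `describe(302)` pairs the code with its reason phrase and its Redirection class.
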
  41. You’re familiar with 200, 404, and 302.

  42. Advanced HTTP

  43. The created at another location response

  44. 1 POST /content/videos HTTP/1.1
    Host: example.org
    Content-Type: video/mp4
    Content-Length: 115910000
    Authorization: Basic bWFkZTp5b3VfbG9vaw==

    {binary video data}
  45. 2 HTTP/1.x 201 Created
    Date: Thu, 21 May 2009 23:05:34 GMT
    Server: Apache/2.2.11 (Unix) DAV/2 PHP/5.3.0
    X-Powered-By: PHP/5.3.0
    Content-Length: 120
    Content-Type: text/html
    Location: http://example.org/content/videos/1234

    <html><body><p>Video uploaded! Go <a href="http://example.org/content/videos/1234">here</a> to see it.</p></body></html>
  46. The “it’s not you it’s me” response

  47. i.e. I’ve accepted it but might have to do more processing
  48. 2 HTTP/1.x 202 Accepted
    Date: Thu, 21 May 2009 23:05:34 GMT
    Server: Apache/2.2.11 (Unix) DAV/2 PHP/5.3.0
    X-Powered-By: PHP/5.3.0
    Content-Length: 137
    Content-Type: text/html
    Location: http://example.org/content/videos/1234/status

    <html><body><p>Video processing! Check <a href="http://example.org/content/videos/1234/status">here</a> for the status.</p></body></html>
  49. The “I have nothing to say to you” response …

  50. … but you were still successful

  51. 1 DELETE /content/videos/1234 HTTP/1.1
    Host: example.org
    Authorization: Basic bWFkZTp5b3VfbG9vaw==

  52. 2 HTTP/1.x 204 No Content
    Date: Thu, 21 May 2009 23:28:34 GMT
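
The three success responses above differ mainly in what the server promises. A minimal sketch of choosing between them follows; `upload_response`, `delete_response`, and the `finished` flag are hypothetical names invented for illustration.

```python
def upload_response(video_url, finished):
    """Pick 201 or 202 for an upload, returning (status, headers)."""
    if finished:
        # The resource exists now; Location names it.
        return "201 Created", {"Location": video_url}
    # Accepted, but processing continues; Location names a status resource.
    return "202 Accepted", {"Location": video_url + "/status"}


def delete_response():
    """A successful DELETE with nothing more to say: no body at all."""
    return "204 No Content", {}
```
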
  53. The ranged request

  54. • Used when requests are made for ranges of bytes from a resource
    • Determine whether a server supports range requests by checking for the Accept-Ranges header with HEAD
  55. 1 HEAD /2390/2253727548_a413c88ab3_s.jpg HTTP/1.1
    Host: farm3.static.flickr.com

  56. 2 HTTP/1.0 200 OK
    Date: Mon, 05 May 2008 00:33:14 GMT
    Server: Apache/2.0.52 (Red Hat)
    Accept-Ranges: bytes
    Content-Length: 3980
    Content-Type: image/jpeg
  57. 3 GET /2390/2253727548_a413c88ab3_s.jpg HTTP/1.1
    Host: farm3.static.flickr.com
    Range: bytes=0-999

  58. 4 HTTP/1.0 206 Partial Content
    Date: Mon, 05 May 2008 00:36:57 GMT
    Server: Apache/2.0.52 (Red Hat)
    Accept-Ranges: bytes
    Content-Length: 1000
    Content-Range: bytes 0-999/3980
    Content-Type: image/jpeg

    {binary data}
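
The server side of the exchange above can be sketched roughly as follows. This is an illustrative simplification, not how any particular server implements it: `serve_range` is a hypothetical helper that handles only a single `bytes=first-last` range and falls back to a full response for anything else.

```python
def serve_range(body, range_header):
    """Answer a single 'bytes=first-last' range; returns (status, headers, data)."""
    unit, _, spec = range_header.partition("=")
    first_text, _, last_text = spec.partition("-")
    if unit.strip() != "bytes" or "," in spec or not first_text:
        # Other units, multiple ranges, and suffix ranges are out of scope here.
        return "200 OK", {"Content-Length": str(len(body))}, body
    first = int(first_text)
    last = min(int(last_text), len(body) - 1) if last_text else len(body) - 1
    if first > last:
        return ("416 Requested Range Not Satisfiable",
                {"Content-Range": f"bytes */{len(body)}"}, b"")
    data = body[first:last + 1]
    headers = {
        "Content-Range": f"bytes {first}-{last}/{len(body)}",
        "Content-Length": str(len(data)),
    }
    return "206 Partial Content", headers, data
```

Called with a 3980-byte body and `bytes=0-999`, it yields the same Content-Range and Content-Length values as the 206 response on the slide.
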
  59. End of Part 1

  60. Please, turn the record over to listen to side B.

  61. Thank you!
    Ben Ramsey
    benramsey.com
    Twitter: @ramsey
    Rate this talk: joind.in/1395
  62. HTTP: Digested Part 2 Ben Ramsey 12 March 2010

  63. Picking up where we left off …

  64. The GET me from another location response

  65. • 303 See Other
    • The response to your request can be found at another URL identified by the Location header
    • The client should make a GET request on that URL
    • The Location is not a substitute for this URL
  66. 1 POST /contact HTTP/1.1
    Host: example.org
    Content-Type: application/x-www-form-urlencoded
    Content-Length: 1234

    {url-encoded form values from a contact form}
  67. 2 HTTP/1.1 303 See Other
    Date: Tue, 22 Sep 2009 23:41:33 GMT
    Server: Apache/2.2.11 (Unix) DAV/2 PHP/5.3.0
    X-Powered-By: PHP/5.3.0
    Location: http://example.org/thankyou
    Content-Length: 0
  68. The find me temporarily at this place response

  69. • 307 Temporary Redirect
    • The resource resides temporarily at the URL identified by the Location
    • The Location may change, so don’t update your links
    • If the request is not GET or HEAD, then you must allow the user to confirm the action
  70. The permanent forwarding address response

  71. • 301 Moved Permanently
    • The resource has moved permanently to the URL indicated by the Location header
    • You should update your links accordingly
    • Great for forcing search engines, etc. to index the new URL instead of this one
  72. But what about just finding the resource at another location?

  73. • 302 Found
    • The resource has been found at another URL identified by the Location header
    • The new URL might be temporary, so the client should continue to use this URL
    • Redirections SHOULD be confirmed by the user (in practice, browsers don’t respect this)
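
From the client's point of view, the four redirect codes above differ in which method the follow-up request uses and whether the user should confirm it. The sketch below follows the RFC 2616 wording summarized on the slides; `follow_up` is a hypothetical helper, and real browsers commonly rewrite a redirected POST to GET on 301 and 302.

```python
def follow_up(method, status):
    """Return (method, needs_confirmation) for the request that follows a redirect."""
    if status == 303:
        # See Other: always fetch the Location with GET.
        return "GET", False
    if status in (301, 302, 307):
        # Same method again; anything but GET or HEAD should be confirmed by the user.
        return method, method not in ("GET", "HEAD")
    raise ValueError(f"{status} is not a redirect covered here")
```
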
  74. The login required response

  75. Remember this?
    1 User requests a page above their authorization level.

    GET /protected/content/1234 HTTP/1.1
    Host: example.org
  76. Remember this?
    2 User is redirected to a login page where they are prompted to increase their authorization level.

    HTTP/1.1 302 Found
    Date: Thu, 05 Nov 2009 17:34:24 GMT
    Server: Apache/2.2.14 (Unix) PHP/5.3.0
    X-Powered-By: PHP/5.3.0
    Location: /login
    Content-Length: 0
    Content-Type: text/html; charset=utf-8
  77. A more semantic way
    1 GET /protected/content/1234 HTTP/1.1
    Host: example.org

  78. A more semantic way
    2 HTTP/1.1 401 Unauthorized
    Date: Thu, 05 Nov 2009 18:31:33 GMT
    Server: Apache/2.2.14 (Unix) PHP/5.3.0
    X-Powered-By: PHP/5.3.0
    WWW-Authenticate: HTML form="login"
    Content-Length: 421
    Content-Type: text/html; charset=utf-8
  79. A more semantic way
    <!doctype html>
    <html>
    <head>
      <title>You must log in</title>
    </head>
    <body>
      <form name="login" method="post" action="/login">
        <label for="username">Username</label>
        <input type="text" name="username" id="username" />
        <label for="password">Password</label>
        <input type="password" name="password" id="password" />
        <input type="submit" value="Login" />
      </form>
    </body>
    </html>
  80. • Doesn’t imply the resource exists at another location
    • Tells clients the resource requires authorization
    • Clearly tells crawlers they can’t access the resource
    • Was originally in HTML5: http://blog.whatwg.org/this-week-in-html-5-episode-14
    • No longer in HTML5, but it works
  81. Cookies

  82. • Defined in RFC 2109 and RFC 2965
    • Most clients just follow the old Netscape specification
    • To set, the server sends a Set-Cookie response header
    • The client sends the cookie back in the Cookie request header
  83. PHP example: setting a cookie
    PHP:
    setcookie('foo', 'bar', time() + 3600, '/~ramsey/http', 'localhost', false, true);

    PHP (also):
    header('Set-Cookie: foo=bar; expires=Fri, 12-Mar-2010 07:01:21 GMT; path=/~ramsey/http; domain=localhost; httponly');

    HTTP response header:
    Set-Cookie: foo=bar; expires=Fri, 12-Mar-2010 07:01:21 GMT; path=/~ramsey/http; domain=localhost; httponly
  84. PHP example: reading a cookie
    PHP:
    $clean['foo'] = null;
    if (passesValidation($_COOKIE['foo'])) {
        $clean['foo'] = $_COOKIE['foo'];
    }

    HTTP request header:
    Cookie: foo=bar
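
For comparison with the PHP slides, the same round trip can be sketched with Python's standard `http.cookies` module. The `isalnum()` check is only a stand-in for the slide's `passesValidation()`, whose rules the talk does not show.

```python
from http.cookies import SimpleCookie

# Server side: build the value of a Set-Cookie response header.
outgoing = SimpleCookie()
outgoing["foo"] = "bar"
outgoing["foo"]["path"] = "/~ramsey/http"
outgoing["foo"]["domain"] = "localhost"
outgoing["foo"]["expires"] = 3600      # an integer means seconds from now
outgoing["foo"]["httponly"] = True
set_cookie_value = outgoing["foo"].OutputString()

# Next request: parse the Cookie request header, then validate before use.
incoming = SimpleCookie()
incoming.load("foo=bar")
raw = incoming["foo"].value if "foo" in incoming else None
# Placeholder validation (assumption): accept only alphanumeric values.
clean_foo = raw if raw is not None and raw.isalnum() else None
```
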
  85. Caching

  86. • Cache expiration • Cache validation

  87. Cache expiration
    • Cache-Control response header
    • Tells the client when the content expires
  88. Cache validation
    • Allows the client to determine whether the copy it has is still fresh
    • If-Match
    • If-Modified-Since
    • If-None-Match
    • If-Range
    • If-Unmodified-Since
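
Expiration and validation can be combined in one response. The sketch below shows one possible server-side shape, assuming an ETag derived from a hash of the body; `make_etag`, `conditional_get`, and the 60-second `max_age` default are illustrative choices, not something the talk prescribes.

```python
import hashlib


def make_etag(body):
    """Derive a validator from the representation bytes."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def conditional_get(body, request_headers, max_age=60):
    """Return (status, headers, data) for a GET that may carry If-None-Match."""
    etag = make_etag(body)
    # Cache-Control covers expiration; ETag enables later validation.
    headers = {"ETag": etag, "Cache-Control": f"max-age={max_age}"}
    sent = request_headers.get("If-None-Match", "")
    candidates = [tag.strip() for tag in sent.split(",")]
    if etag in candidates or "*" in candidates:
        # The client's copy is still good: send headers only.
        return "304 Not Modified", headers, b""
    return "200 OK", headers, body
```

A client would store the ETag from the first 200 response and send it back in If-None-Match once its cached copy has expired.
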
  89. Content Negotiation

  90. • Server-driven negotiation • Agent-driven negotiation

  91. Server-driven negotiation
    • The server makes a best guess
    • The client may send headers to help the server guess: Accept, Accept-Language, Accept-Encoding, Accept-Charset, and User-Agent
    • The server can use other factors
    • Since it’s a guess, the server algorithm to determine this could send a different representation on a second identical request
  92. Agent-driven negotiation
    • Requires two requests from the client
    • First request results in a response listing available representations either in the headers or in the entity body
    • Second request is either automatic (client chooses) or manual (user chooses) for the desired representation
    • First response should be a 300 Multiple Choices response
  93. Ben’s suggested negotiation
    • Use a single base URI for all resource requests, e.g. /user/username
    • Use the Accept-* headers to perform server-driven negotiation as best you can
    • Use a 303 See Other response with a Location header to the appropriate representation:
      /user/username.html
      /user/username.xml
      /user/username.json
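
The suggested negotiation above might look roughly like this. It is a simplified sketch: the `REPRESENTATIONS` table, the helper names, and the 406 fallback are assumptions added for illustration, and only the Accept header with its q values is considered.

```python
# Hypothetical mapping of media types to representation suffixes.
REPRESENTATIONS = {
    "text/html": ".html",
    "application/xml": ".xml",
    "application/json": ".json",
}


def parse_accept(header):
    """Return (media_type, q) pairs sorted by descending q value."""
    prefs = []
    for item in header.split(","):
        parts = [part.strip() for part in item.split(";")]
        if not parts[0]:
            continue
        q = 1.0
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        prefs.append((parts[0], q))
    return sorted(prefs, key=lambda pair: pair[1], reverse=True)


def negotiate(base_uri, accept_header):
    """Return (status, headers) redirecting to the best available representation."""
    for media_type, q in parse_accept(accept_header):
        if q <= 0:
            continue
        if media_type == "*/*":
            # Anything goes: fall back to the first representation listed.
            media_type = next(iter(REPRESENTATIONS))
        if media_type in REPRESENTATIONS:
            location = base_uri + REPRESENTATIONS[media_type]
            return "303 See Other", {"Location": location}
    return "406 Not Acceptable", {}
```
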
  94. Authentication

  95. Resource requires authentication
    • Respond with 401 Unauthorized
    • Include the WWW-Authenticate header
    • WWW-Authenticate must contain the authentication challenge required by the server
    • Multiple challenges or multiple WWW-Authenticate headers may be present
  96. WWW-Authenticate challenges
    • Basic
    • Digest
    • OAuth
    • WSSE
    • HTML?
  97. Authorization request
    • The request includes the Authorization header
    • The Authorization header includes the credentials that answer the server’s challenge
  98. Basic authentication
    POST /content/videos HTTP/1.1
    Host: example.org
    Content-Type: video/mp4
    Content-Length: 115910000
    Authorization: Basic bWFkZTp5b3VfbG9vaw==

    {binary video data}
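
The Basic token in the request above is only the base64 encoding of `username:password`; it is not encrypted. A small sketch using Python's standard `base64` module shows both directions (the function names are made up for this example).

```python
import base64


def basic_credentials(username, password):
    """Build the value of an Authorization header for the Basic scheme."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def parse_basic(header_value):
    """Split a Basic Authorization header back into (username, password)."""
    scheme, _, token = header_value.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a Basic authorization header")
    decoded = base64.b64decode(token).decode("utf-8")
    username, _, password = decoded.partition(":")
    return username, password
```

Passing the slide's header value to `parse_basic` recovers the username and password in plain text, which is why Basic authentication is normally paired with an encrypted connection.
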
  99. We’ve come so far …

  100. … yet we have so far left to go.

  101. It’s your turn, now.

  102. Thank you!
    Ben Ramsey
    benramsey.com
    Twitter: @ramsey
    Rate this talk: joind.in/1395