I am doing HTTP wrong

A fresh look at HTTP for agile languages (more importantly: Python)

Armin Ronacher

May 13, 2012

Transcript

  1. I am doing HTTP wrong: a presentation by Armin Ronacher @mitsuhiko
  2. The Web developer's Evolution

  3. echo

  4. request.send_header(…) request.end_headers() request.write(…)

  5. return Response(…)

  6. Why Stop there?

  7. What do we love about HTTP?

  8. Text Based

  9. REST

  10. Cacheable

  11. Content Negotiation

  12. Well Supported

  13. Works where TCP doesn't

  14. Somewhat Simple

  15. Upgrades to custom protocols

  16. Why does my application look like HTTP?

  17. everybody does it

  18. Natural Conclusion

  19. we can do better!

  20. we're a level too low

  21. Streaming: one piece at a time, constant memory usage, no seeking.
  22. Buffering: have some data in memory, variable memory usage, seeking.
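
The difference between slides 21 and 22 can be sketched in a few lines. This is only an illustration, not code from the talk: the helper names, the SHA-256 task and the 4096-byte chunk size are assumptions chosen for the example.

```python
import hashlib
import io

CHUNK_SIZE = 4096  # arbitrary chunk size chosen for this sketch


def digest_buffered(stream):
    """Buffering: read the whole body into memory, then process it."""
    data = stream.read()  # memory use grows with the size of the body
    return hashlib.sha256(data).hexdigest()


def digest_streamed(stream, chunk_size=CHUNK_SIZE):
    """Streaming: process one piece at a time and never seek back."""
    h = hashlib.sha256()
    while True:
        chunk = stream.read(chunk_size)  # at most chunk_size bytes held at once
        if not chunk:
            break
        h.update(chunk)
    return h.hexdigest()


body = b"x" * 100_000
same = digest_buffered(io.BytesIO(body)) == digest_streamed(io.BytesIO(body))
```

Both functions compute the same digest; only the streamed one keeps memory usage constant, and only the buffered one could go back and look at the data again.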

  23. Typical Request / Response Cycle
    [Diagram labels: User Agent, Proxy, Server, Application, Stream, “Buffered”, Dispatcher, View]
  24. In Python Terms

        def application(environ, start_response):
            # Step 1: acquire data
            data = environ['wsgi.input'].read(...)
            # Step 2: process data
            response = process_data(data)
            # Step 3: respond
            start_response('200 OK', [('Content-Type', 'text/plain')])
            return [response]
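  25. One Level Up

        # `server` is assumed to be a listening socket
        conn, addr = server.accept()
        f = conn.makefile('rb')
        requestline = f.readline()
        headers = []
        while True:
            headerline = f.readline()
            # a blank line ends the headers; b'' means the peer hung up
            if headerline in (b'\r\n', b''):
                break
            headers.append(headerline)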
  26. Weird Mixture on the app

        request.headers <- buffered
        request.form    <- buffered
        request.files   <- buffered to disk
        request.body    <- streamed
  27. HTTP's Limited Signalling
    Strict Request / Response.
    Once the server has started accepting the body, the only way it can signal the client during the request is by closing the connection.
  28. Bailing out early

        def application(request):
            # At this point, headers are parsed, everything else
            # is not parsed yet.
            if request.content_length > TWO_MEGABYTES:
                return error_response()
            ...
  29. Bailing out a little bit later

        def application(request):
            # Read a little bit of data
            request.input.read(4096)
            # You just committed to accepting data, now you have to
            # read everything or the browser will be very unhappy and
            # just time out. No more responding with 413 ...
  30. Rejecting
    Form fields -> memory
    File uploads -> disk
    What's your limit? 16MB in total? All could go to memory.
    Reject file sizes individually? Needs overall check as well!
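
One way to read slide 30 is that a request needs three separate limits: one for what ends up in memory, one per file, and one overall. The sketch below is a hypothetical illustration of that bookkeeping; `UploadBudget`, `RequestTooLarge` and the concrete limits are invented for this example, and sizes are passed in up front, whereas a real parser would have to count bytes as they arrive.

```python
class RequestTooLarge(Exception):
    """Raised when a part would push the request over one of its limits."""


class UploadBudget:
    """Tracks in-memory size, per-file size and overall size for one request."""

    def __init__(self, max_memory, max_file, max_total):
        self.max_memory = max_memory
        self.max_file = max_file
        self.max_total = max_total
        self.memory_used = 0
        self.total_used = 0

    def _account_total(self, size):
        if self.total_used + size > self.max_total:
            raise RequestTooLarge("overall request limit exceeded")
        self.total_used += size

    def add_form_field(self, size):
        # Form fields end up in memory, so they count against both limits.
        if self.memory_used + size > self.max_memory:
            raise RequestTooLarge("in-memory limit exceeded")
        self._account_total(size)
        self.memory_used += size

    def add_file(self, size):
        # File uploads go to disk: check the individual file, then the total.
        if size > self.max_file:
            raise RequestTooLarge("file too large")
        self._account_total(size)


MB = 1024 * 1024
budget = UploadBudget(max_memory=1 * MB, max_file=8 * MB, max_total=16 * MB)
budget.add_file(8 * MB)
budget.add_form_field(512 * 1024)
try:
    budget.add_form_field(768 * 1024)  # would exceed the 1 MB memory limit
    rejected = False
except RequestTooLarge:
    rejected = True
```

With only the 16MB overall limit, the second form field would have been accepted even though it is the in-memory data that hurts.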
  31. The Consequences
    How much data do you accept?
    Limit the overall request size? Not helpful, because all of it could be in memory.
  32. It's not just limiting
    Consider a layered system.
    How many of you write code that streams?
    What happens if you pass streamed data through your layers?
  33. A new approach

  34. Dynamic typing made us lazy

  35. we're trying to solve both use cases in one; we're not supporting either well
  36. How we do it
    Hide HTTP from the apps.
    HTTP is an implementation detail.
  37. Pseudocode

        user_pagination = make_pagination_schema(User)

        @export(
            specs=[('page', types.Int32()),
                   ('per_page', types.Int32())],
            returns=user_pagination,
            semantics='select',
            http_path='/users/'
        )
        def list_users(page, per_page):
            users = User.query.paginate(page, per_page)
            return users.to_dict()
  38. Types are specific

        user_type = types.Object([
            ('username', types.String(30)),
            ('email', types.Optional(types.String(250))),
            ('password_hash', types.String(250)),
            ('is_active', types.Boolean()),
            ('registration_date', types.DateTime())
        ])
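
The `types` module on the slides is not shown in the talk, so the following is only a guess at how such specific types might behave. Every class here is a hypothetical stand-in written for this sketch (the names mirror the slides, the `convert` method and `ValidationError` are invented), and only a subset of the fields is used.

```python
class ValidationError(ValueError):
    """Raised when incoming data does not match the declared type."""


class Int32:
    def convert(self, value):
        try:
            number = int(value)  # generous: accepts "2" as well as 2
        except (TypeError, ValueError):
            raise ValidationError("not an integer: %r" % (value,))
        if not -2**31 <= number < 2**31:
            raise ValidationError("out of range for Int32: %r" % (value,))
        return number


class String:
    def __init__(self, max_length):
        self.max_length = max_length

    def convert(self, value):
        if not isinstance(value, str):
            raise ValidationError("not a string: %r" % (value,))
        if len(value) > self.max_length:
            raise ValidationError("longer than %d characters" % self.max_length)
        return value


class Boolean:
    def convert(self, value):
        if not isinstance(value, bool):
            raise ValidationError("not a boolean: %r" % (value,))
        return value


class Optional:
    def __init__(self, inner):
        self.inner = inner

    def convert(self, value):
        return None if value is None else self.inner.convert(value)


class Object:
    def __init__(self, fields):
        self.fields = fields  # list of (name, type) pairs, order is fixed

    def convert(self, value):
        if not isinstance(value, dict):
            raise ValidationError("expected an object")
        unknown = set(value) - {name for name, _ in self.fields}
        if unknown:
            raise ValidationError("unexpected keys: %s" % sorted(unknown))
        return {name: t.convert(value.get(name)) for name, t in self.fields}


user_type = Object([
    ('username', String(30)),
    ('email', Optional(String(250))),
    ('is_active', Boolean()),
])
user = user_type.convert({'username': 'mitsuhiko', 'is_active': True})

page_spec = Object([('page', Int32()), ('per_page', Int32())])
page_args = page_spec.convert({'page': '2', 'per_page': 20})
```

Because the schema knows every field and its maximum size, bad input can be rejected before any application code runs, and the output always has the same key order.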
  39. Why?
    Support for different input/output formats
    Keyless transport
    Support for non-HTTP
    No hash collision attacks :-)
    Predictable memory usage
  40. Comes for free
    Easier to test
    Helps documenting the public APIs
    Catches common errors early
    Handle errors without invoking code
    Predictable dictionary ordering
  41. Strict vs Lenient

  42. Rule of Thumb
    Be strict in what you send, but generous in what you receive (a variant of Postel's Law)
  43. Being Generous
    In order to be generous you need to know what to receive.
    Just accepting any input is a security disaster waiting to happen.
  44. Support unsupported types

        {
          "foo": [1, 2, 3],
          "bar": {"key": "value"},
          "now": "Thu, 10 May 2012 14:16:09 GMT"
        }

        foo.0=1&
        foo.1=2&
        foo.2=3&
        bar.key=value&
        now=Thu%2C%2010%20May%202012%2014:16:09%20GMT
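
The mapping from nested JSON to dotted form keys on slide 44 could be produced roughly like this. It is a sketch under assumptions: the `flatten` helper is invented here, keys are assumed not to contain dots themselves, and `urllib.parse.urlencode` escapes spaces as `+` rather than `%20`, so the query string differs cosmetically from the one on the slide.

```python
from urllib.parse import parse_qsl, urlencode


def flatten(value, prefix=""):
    """Yield (dotted_key, string_value) pairs for nested dicts and lists."""
    if isinstance(value, dict):
        items = value.items()
    elif isinstance(value, list):
        items = enumerate(value)  # list positions become numeric key parts
    else:
        yield prefix, str(value)
        return
    for key, item in items:
        child = "%s.%s" % (prefix, key) if prefix else str(key)
        yield from flatten(item, child)


payload = {
    "foo": [1, 2, 3],
    "bar": {"key": "value"},
    "now": "Thu, 10 May 2012 14:16:09 GMT",
}
pairs = list(flatten(payload))
query = urlencode(pairs)  # usable as a GET query string or a form body
```

Everything arrives as a string on the other side; it is the declared types that let the server turn `foo.0` back into an integer and `now` back into a date.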
  45. Solves the GET issue
    GET has no body
    Parameters have to be URL encoded
    Inconsistency with JSON POST requests
  46. Where is the streaming?

  47. There is none

  48. there are always two sides to an API

  49. If the server has streaming endpoints, the client will have to support them as well
  50. For things that need actual streaming we have separate endpoints.

  51. streaming is different

  52. but we can stream until we need buffering

  53. Discard useless stuff

        {
          "foo": [list, of, thousands, of, items, we don't, need],
          "an_important_key": "we're actually interested in"
        }
  54. What if I don't make an API?

  55. modern web apps are APIs

  56. Dumb client? Move the client to the server

  57. Q&A

  58. Oh hai. We're hiring http://fireteam.net/careers