Sumana Harihareswara - HTTP Can Do That?!

Learn how to get more performance, testability, and flexibility out of your web apps, using features already built into HTTP. I'll walk you through case studies exploring good (and bad) ideas, using Python, your browser, netcat, and other common tools.

https://us.pycon.org/2016/schedule/presentation/1577/

PyCon 2016

May 29, 2016

Transcript

  1. HTTP Can Do That?! A collection of bad ideas by

    Sumana Harihareswara @brainwane Changeset Consulting
  2. @brainwane HTTP Hypertext Transfer Protocol

  3. @brainwane Diagrams! – Internet Engineering Task Force (IETF) RFC 7230

    Hypertext Transfer Protocol (HTTP/1.1): Message Syntax and Routing
  4. @brainwane HTTP: crash course Client Server

  5. @brainwane Client Server Request HTTP: crash course

  6. @brainwane Client Server Request Response HTTP: crash course

  7. @brainwane An HTTP Message (Request or Response)

    START-LINE: HTTP version (1.1); request method (GET, POST); response status code (200, 404, 500)
  8. @brainwane An HTTP Message (Request or Response)

    START-LINE: HTTP version (1.1); request method (GET, POST); response status code (200, 404, 500)
    HEADERS: Content-Type, Content-Length, …
  9. @brainwane An HTTP Message (Request or Response)

    START-LINE: HTTP version (1.1); request method (GET, POST); response status code (200, 404, 500)
    HEADERS: Content-Type, Content-Length, …
    BODY
  10. @brainwane Example Request

    START-LINE
    GET / HTTP/1.1
    HEADERS
    Host: www.sumana.biz
    Accept: text/html
    User-Agent: ScraperBot
    BODY
  11. @brainwane Example Response

    START-LINE
    HTTP/1.1 200 OK
    HEADERS
    Content-Type: text/html
    Content-Length: 203
    Date: Tue, 16 Jun 2015 16:21:56 GMT
    Last-Modified: Tue, 16 Jun 2015 13:27:14 GMT
    BODY
    <html> <head> <title>Welcome to Sumanaville</title> </head> <body><center><h1>Rockin'</h1> <p>This is a pretty
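
The three parts labelled on these slides (start-line, headers, body) can be pulled apart in a few lines of Python. This is a minimal sketch rather than a real parser: `split_http_message` is a hypothetical helper name, and it assumes CRLF line endings and no repeated header fields.

```python
def split_http_message(raw: bytes):
    """Split a raw HTTP/1.1 message into (start_line, headers, body)."""
    # A blank line (CRLF CRLF) separates the head from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    start_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        # Field names are case-insensitive, so store them lowercased.
        headers[name.strip().lower()] = value.strip()
    return start_line, headers, body


# The example request from the slides, written out byte for byte.
raw_request = (
    b"GET / HTTP/1.1\r\n"
    b"Host: www.sumana.biz\r\n"
    b"Accept: text/html\r\n"
    b"User-Agent: ScraperBot\r\n"
    b"\r\n"
)
start_line, headers, body = split_http_message(raw_request)
print(start_line)       # GET / HTTP/1.1
print(headers["host"])  # www.sumana.biz
print(body)             # b''
```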
  12. @brainwane Methods

  13. @brainwane Popular request methods (“verbs”)

    • GET: gimme • POST: here you go
  14. @brainwane First bad idea: POST but not GET more: https://gitlab.com/http-can-do-that/secureapi

  15. @brainwane POST but not GET: use cases

    letters to Santa Claus
  16. @brainwane POST but not GET: use cases

    employee suggestion box
  17. @brainwane POST but not GET: use cases

    extremely moderated blog comments
  18. @brainwane (a logistical note)

  19. @brainwane

  20. @brainwane

  21. @brainwane Bad Idea Scale

  22. @brainwane Giving client no way to GET – bad idea

  23. @brainwane Remember “CRUD”? Create, Read, Delete, Update

  24. @brainwane Remember “CRUD”?

    Create: POST; Read: GET; Delete: POST; Update: POST
  25. @brainwane Remember “CRUD”?

    Create: POST; Read: GET; Delete: POST; Update: POST
  26. @brainwane Remember “CRUD”?

    Create: POST; Read: GET; Delete: POST; Update: POST
    INELEGANT! INELEGANT!
  27. @brainwane Underappreciated methods

    • DELETE: delete a resource!
  28. @brainwane Implementing DELETE

  29. @brainwane Implementing DELETE

  30. @brainwane Implementing DELETE

  31. @brainwane Implementing DELETE

  32. @brainwane Implementing DELETE

  33. @brainwane Implementing DELETE

  34. @brainwane Implementing DELETE

  35. @brainwane Implementing DELETE

  36. @brainwane Implementing DELETE

  37. @brainwane Implementing DELETE
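
The transcript does not preserve the code shown on these slides. As a stand-in, here is a minimal sketch of a DELETE handler using only Python's standard library. The `CARDS` store, the `CardHandler` class, the `/cards/5` path, and the choice of 204 and 404 as responses are all assumptions made for illustration, not the talk's implementation.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-memory store standing in for a real database.
CARDS = {}


class CardHandler(BaseHTTPRequestHandler):
    def do_DELETE(self):
        # pop() returns None for an unknown path, so repeating the
        # DELETE yields 404 rather than 204.
        if CARDS.pop(self.path, None) is not None:
            self.send_response(204)  # No Content: resource deleted
        else:
            self.send_response(404)  # Not Found: nothing to delete
            self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def delete_twice(path="/cards/5"):
    """Serve on a throwaway port, DELETE the same path twice, return statuses."""
    CARDS[path] = b"a picture"
    server = HTTPServer(("127.0.0.1", 0), CardHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    statuses = []
    try:
        for _ in range(2):
            conn = http.client.HTTPConnection(
                "127.0.0.1", server.server_address[1], timeout=5
            )
            conn.request("DELETE", path)
            response = conn.getresponse()
            response.read()
            statuses.append(response.status)
            conn.close()
    finally:
        server.shutdown()
        server.server_close()
    return statuses


if __name__ == "__main__":
    print(delete_twice())  # [204, 404]
```

The second request failing with 404 is one common convention; some APIs return 204 again so that repeated deletes look identical to the client.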

  38. @brainwane DELETE – good idea?

  39. @brainwane Underappreciated methods

    • PUT: “here you go”
  40. @brainwane Wait

    I thought POST meant “here you go”
  41. @brainwane So what is POST, anyway?

    The standard says it means: “Above our pay grade; take this to the boss” (a.k.a. Overloaded POST)
  42. @brainwane So what is POST, anyway?

    Often, we use it for: “Create a new item in this set” (a.k.a. POST-to-append)
  43. @brainwane

  44. @brainwane PUT vs. POST

    PUT /cards/5, with a picture as the body, means: “Put this picture at /cards/5.”
    POST /cards/5, with a picture as the body, means: “Tell the webapp that this picture applies to /cards/5 somehow – figure it out.”
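
The difference between PUT and POST-to-append can be modelled without any networking. The sketch below is a toy: `CardStore` and its methods are hypothetical names, and the status codes (201 when something is created, 200 when an existing resource is replaced) follow common practice rather than anything stated on the slides.

```python
class CardStore:
    """Toy resource store that models PUT and POST-to-append."""

    def __init__(self):
        self.cards = {}
        self._next_id = 1

    def put(self, path, body):
        """PUT: the client names the URL; repeating the call changes nothing."""
        status = 200 if path in self.cards else 201
        self.cards[path] = body
        return status

    def post(self, collection, body):
        """POST-to-append: the server picks the URL of the new item."""
        path = f"{collection}/{self._next_id}"
        self._next_id += 1
        self.cards[path] = body
        return 201, path


store = CardStore()
first_put = store.put("/cards/5", b"picture")   # 201, created at /cards/5
second_put = store.put("/cards/5", b"picture")  # 200, same resource replaced
first_post = store.post("/cards", b"picture")   # (201, "/cards/1")
second_post = store.post("/cards", b"picture")  # (201, "/cards/2")
print(first_put, second_put, first_post, second_post)
```

Sending the same PUT twice leaves one resource behind, while sending the same POST twice creates two, which is why retrying a PUT is safe and retrying a POST may not be.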
  45. @brainwane “CRUD” & HTTP verbs

    Create: PUT; Read: GET; Delete: DELETE; Update: PUT
  46. @brainwane PUT – good idea?

  47. @brainwane More underused methods

    • PATCH: update just part of this document/resource
  48. @brainwane PATCH – good idea?

  49. @brainwane More underused methods

    • PATCH: update just part of this document/resource
    • OPTIONS: ask what verbs the client’s allowed to use (for a specific path, or server-wide)
  50. @brainwane OPTIONS – good idea?

  51. @brainwane A super-cool method

    HEAD: like GET, but just for metadata
  52. @brainwane GET vs. HEAD

    Request: GET / HTTP/1.1 Response: • Start-line • Headers • Body
    Request: HEAD / HTTP/1.1 Response: • Start-line • Headers
  53. @brainwane HEAD saves time

  54. @brainwane HEAD saves time

  55. @brainwane HEAD saves time

  56. @brainwane You don’t need the body to check:

    Does it exist?
    Do I have permission to GET it?
    Content-Length
    Last-Modified
    Content-Type
    ETag
    Retry-After
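
To see HEAD return the same headers as GET with no body, here is a small self-contained sketch that serves one page on a throwaway local port and fetches it both ways. `PageHandler`, `fetch`, and the `PAGE` content are illustrative names and data, not taken from the talk.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

PAGE = b"<html><body><h1>Rockin'</h1></body></html>"  # placeholder page


class PageHandler(BaseHTTPRequestHandler):
    def _send_head(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()

    def do_GET(self):
        self._send_head()
        self.wfile.write(PAGE)

    def do_HEAD(self):
        self._send_head()  # identical headers, but no body is written

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch(method):
    """Return (status, Content-Length header, body) for one request."""
    server = HTTPServer(("127.0.0.1", 0), PageHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = http.client.HTTPConnection(
            "127.0.0.1", server.server_address[1], timeout=5
        )
        conn.request(method, "/")
        response = conn.getresponse()
        body = response.read()
        conn.close()
        return response.status, response.getheader("Content-Length"), body
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(fetch("GET"))
    print(fetch("HEAD"))
```

Both calls report the same Content-Length, but only the GET transfers the page itself.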
  57. @brainwane HEAD – good idea?

  58. @brainwane Headers

  59. @brainwane Popular headers include: Content-Type, Content-Length

  60. @brainwane Popular headers include: Content-Type (also known as MIME or Mime), Content-Length

  61. @brainwane Popular headers include: Content-Type (text/*), Content-Length

  62. @brainwane Popular headers include: Content-Type (application/*), Content-Length

  63. @brainwane Popular headers include: Content-Type (chemical/*), Content-Length

  64. @brainwane Popular headers include: Content-Encoding, Accept-Encoding, Content-Language, Accept-Language

  65. @brainwane More headers: ETag, If-Match, If-None-Match

  66. @brainwane More headers: If-Modified-Since, If-Unmodified-Since, Last-Modified, Cache-Control

  67. @brainwane A popular header: User-Agent

  68. @brainwane An unpopular header: From

    The email address of the person making the request
  69. @brainwane Uses for From: really bad auth

  70. @brainwane Uses for From: “Yes, I saw your site launch”

  71. @brainwane Uses for From: coded messages meant for network surveillor

  72. @brainwane From – bad idea

  73. @brainwane Another spy trick

    “Each header field consists of a case-insensitive field name followed by a colon (":")...” – Internet Engineering Task Force (IETF) RFC 7230, Hypertext Transfer Protocol (HTTP/1.1): Message Syntax and Routing
    So: vary the case of the headers you send!!!
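
Because field names are case-insensitive, the capitalisation of each letter is free to carry one bit. The following sketch shows the trick in the spirit of the slide; `hide_bits` and `reveal_bits` are hypothetical helper names, and the scheme (uppercase = 1, lowercase = 0, unused letters padded with 0) is just one possible encoding.

```python
def hide_bits(field_name: str, bits: str) -> str:
    """Encode a string of '0'/'1' characters in the casing of a header name."""
    remaining = iter(bits)
    out = []
    for ch in field_name:
        if ch.isalpha():
            # Letters beyond the end of the message default to 0 (lowercase).
            bit = next(remaining, "0")
            out.append(ch.upper() if bit == "1" else ch.lower())
        else:
            out.append(ch)  # hyphens carry no information
    return "".join(out)


def reveal_bits(field_name: str) -> str:
    """Read the casing of each letter back out as bits."""
    return "".join("1" if ch.isupper() else "0" for ch in field_name if ch.isalpha())


disguised = hide_bits("user-agent", "101")
print(disguised)               # UsEr-agent
print(reveal_bits(disguised))  # 101000000
print(disguised.lower())       # user-agent, which is all a compliant server sees
```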
  74. @brainwane Header casing as secret info channel – bad idea

  75. @brainwane A popular header: Host

  76. @brainwane A required header: Host (required in request messages)

  77. @brainwane Host & path work together

    $ netcat myhostname.tld 80
    GET /bicycle HTTP/1.1
    Host: myhostname.tld
  78. @brainwane Host & path work together

  79. @brainwane Host & path work together

  80. @brainwane Host & path work together

  81. @brainwane Host & path work together

  82. @brainwane A popular header: Host

    (wait – why do we need to repeat this? It's in the URL! right?)
  83. @brainwane How Host helps

    HTTP is separate from the Domain Name System
  84. @brainwane How Host helps

    Host helps route requests among different domains that sit on the same server
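
Routing on the Host header can be sketched as a plain function over the raw request bytes. Everything named here is an assumption for illustration: the `SITES` table, the `example.org` hostnames, and the decision to answer an unknown host with a 404 (real servers often fall back to a default site instead). Ports in the Host value are not handled.

```python
# Hypothetical virtual hosts that share one server and one IP address.
SITES = {
    "www.example.org": "main site",
    "bugs.example.org": "bug tracker",
}


def route(raw_request: bytes) -> str:
    """Pick a site from the Host header, then hand it the request path."""
    head = raw_request.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    request_line, *header_lines = head.split("\r\n")
    method, path, version = request_line.split(" ")
    host = None
    for line in header_lines:
        name, _, value = line.partition(":")
        if name.strip().lower() == "host":
            host = value.strip().lower()
    if host is None:
        # HTTP/1.1 requests without a Host header are invalid.
        return "400 Bad Request"
    if host not in SITES:
        return "404 Not Found"
    return f"{SITES[host]} serves {path}"


print(route(b"GET /bicycle HTTP/1.1\r\nHost: www.example.org\r\n\r\n"))
print(route(b"GET /bicycle HTTP/1.1\r\nHost: bugs.example.org\r\n\r\n"))
```

The same path, sent to the same machine, reaches a different site depending only on the Host value.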
  85. @brainwane Examples of virtual hosts www.debian.org

  86. @brainwane Examples of virtual hosts bugs.debian.org

  87. @brainwane Examples of virtual hosts lists.debian.org

  88. @brainwane Examples of virtual hosts wiki.debian.org

  89. @brainwane But watch out...

  90. @brainwane But watch out...

  91. @brainwane A spam story

  92. @brainwane A spam story

    My 404 logs (Drupal admin console):
    TYPE: page not found
    DATE: Thursday, October 9, 2014 - 10:46
    USER: Anonymous (not verified)
    LOCATION: http://myphishingsite.biz/http://myphishingsite.biz
    REFERRER:
    MESSAGE: ttp://myphishingsite.biz
    SEVERITY: warning
    HOSTNAME: [IP address]
  93. @brainwane A spam story

    My 404 logs (Drupal admin console):
    TYPE: page not found
    DATE: Thursday, October 9, 2014 - 10:46
    USER: Anonymous (not verified)
    LOCATION: http://myphishingsite.biz/http://myphishingsite.biz
    REFERRER:
    MESSAGE: ttp://myphishingsite.biz
    SEVERITY: warning
    HOSTNAME: [IP address]
  94. @brainwane A spam story

    My access logs:
    [IP address] - - [09/Oct/2014:10:46:09 -0400] "GET http://myphishingsite.biz HTTP/1.1" 404 7574 "-" [User-Agent]
  95. @brainwane A spam story

    Legit mistakes would look like:
    [IP address] - - [09/Oct/2014:10:46:09 -0400] "GET /http://berkeley.edu HTTP/1.1" 404 7574 "-" [User-Agent]
  96. @brainwane A spam story

    Intentionally malform your request!
    $ netcat myhostname.tld 80
    GET http://spam.com HTTP/1.1
    Host: spam.com
  97. @brainwane A spam story

    Intentionally malform your request!
    $ netcat myhostname.tld 80
    GET /viagra-bitcoin HTTP/1.1
    Host: spam.com
  98. @brainwane 404 spamming – bad idea

  99. @brainwane Define your own header!

    “Header fields are fully extensible: there is no limit on the introduction of new field names, each presumably defining new semantics, nor on the number of header fields used in a given message.” (RFC 7230)
  100. @brainwane Define your own header! X-blah-blah-blah

  101. @brainwane Define your own header! X-Wikimedia-Debug

  102. @brainwane Define your own header!

    X-Wikimedia-Debug, an HTTP request header
    • Backend selection (Varnish) • Caching behavior • Request profiling (record a trace) • Debug logs • Read-only mode • Browser extensions
    More: https://wikitech.wikimedia.org/wiki/X-Wikimedia-Debug
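
A server-side check for a custom debugging header might look like the sketch below. The header name `X-Example-Debug`, the value `1`, and the settings it toggles are all invented for illustration; they are not the X-Wikimedia-Debug format. `email.message.Message` is used only because its lookups are case-insensitive, matching how HTTP treats field names.

```python
from email.message import Message

DEBUG_HEADER = "X-Example-Debug"  # hypothetical header name


def make_headers(pairs):
    """Build a case-insensitive header collection from (name, value) pairs."""
    headers = Message()
    for name, value in pairs:
        headers[name] = value
    return headers


def settings_for(headers):
    """Choose per-request behaviour based on the custom header."""
    if headers.get(DEBUG_HEADER, "").strip() == "1":
        return {"use_cache": False, "log_level": "DEBUG"}
    return {"use_cache": True, "log_level": "WARNING"}


normal = settings_for(make_headers([("Host", "www.example.org")]))
debug = settings_for(make_headers([("Host", "www.example.org"),
                                   ("x-example-debug", "1")]))
print(normal)
print(debug)
```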
  103. @brainwane Define your own header!

  104. @brainwane Define your own header!

  105. @brainwane Define your own header!

  106. @brainwane Define your own header!

  107. @brainwane Define your own header! https://gitlab.com/http-can-do-that/novel-titles

  108. @brainwane Define your own header!

  109. @brainwane Defining your own headers – good idea?

  110. @brainwane Status codes

  111. @brainwane Status codes

    100 & 101: Informational
    2xx: Successful
    3xx: Redirection
    4xx: Client error
    5xx: Server error
  112. @brainwane Status (response) codes

    404 Not Found (404 is the code; “Not Found” is the reason-phrase)
  113. @brainwane Status (response) codes

    “A client SHOULD ignore the reason-phrase content.”
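
A client that follows this advice looks only at the numeric code, and often only at its first digit. Here is a minimal sketch; `classify` and `STATUS_CLASSES` are hypothetical names.

```python
STATUS_CLASSES = {
    1: "Informational",
    2: "Successful",
    3: "Redirection",
    4: "Client error",
    5: "Server error",
}


def classify(status_line: str):
    """Return (code, class) for a response start-line, ignoring the reason-phrase."""
    # The line is: HTTP-version SP status-code SP reason-phrase.
    parts = status_line.split(" ", 2)
    code = int(parts[1])
    return code, STATUS_CLASSES.get(code // 100, "Unknown")


print(classify("HTTP/1.1 404 Not Found"))  # (404, 'Client error')
print(classify("HTTP/1.1 200 Forbidden"))  # (200, 'Successful')
print(classify("HTTP/1.1 999 Request denied"))  # (999, 'Unknown')
```

The second call shows why the reason-phrase is ignored: a misleading phrase does not change what the code means.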
  114. @brainwane Heard of these?

    • 410 Gone: It was here, but now it’s not.
    • 304 Not Modified: You said, ‘GET this, if it’s been modified since [date]’. It hasn’t been.
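
The decision behind 304 Not Modified is a date comparison between the resource's Last-Modified time and the client's If-Modified-Since header. The sketch below shows only that comparison; `status_for_get` is a hypothetical name, and a real server would also handle ETags and malformed dates.

```python
from email.utils import parsedate_to_datetime
from typing import Optional


def status_for_get(last_modified: str, if_modified_since: Optional[str] = None) -> int:
    """Return 304 if the client's copy is still current, else 200."""
    if if_modified_since is None:
        return 200  # unconditional GET: always send the body
    changed_at = parsedate_to_datetime(last_modified)
    client_copy_from = parsedate_to_datetime(if_modified_since)
    return 304 if changed_at <= client_copy_from else 200


LAST_MODIFIED = "Tue, 16 Jun 2015 13:27:14 GMT"

print(status_for_get(LAST_MODIFIED))                                   # 200
print(status_for_get(LAST_MODIFIED, "Tue, 16 Jun 2015 16:21:56 GMT"))  # 304
print(status_for_get(LAST_MODIFIED, "Mon, 15 Jun 2015 00:00:00 GMT"))  # 200
```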
  115. @brainwane 451 Unavailable For Legal Reasons

    Server is legally required to reject client’s request
  116. @brainwane 451 Unavailable For Legal Reasons

    Can’t let you see that; it’s censored.
  117. @brainwane 451 Unavailable For Legal Reasons

    “This is considered a client-side error even though the request is well formed and the legal requirement exists on the server side. After all, that representation was censored for a reason. There must be something wrong with you, citizen.” – RESTful Web APIs, Leonard Richardson & Mike Amundsen
  118. @brainwane 451 Unavailable For Legal Reasons

  119. @brainwane 451 – good idea?

  120. @brainwane WTF responses

    All of these were found in the wild
  121. @brainwane WTF responses Code: 126

    Reason: Incorrect key file for table '/tmp/mysqltmp/#sql_13fb_2.MYI'; try to repair it SQL=SHOW FULL COLUMNS FROM `y4dnu_extensions`
  122. @brainwane WTF responses Code: 301 Reason: explicit_header_response_code

  123. @brainwane WTF responses Code: 403

    Reason: You've got to ask yourself one question: Do I feel lucky?
  124. @brainwane WTF responses Code: 403

    Reason: can't put wasabi in bed
  125. @brainwane WTF responses Code: 404 Reason: HTTP/1.1 404

  126. @brainwane WTF responses Code: 404

    Reason: Not Found"); ?><?php Header("cache-Control: no-store, no-cache, must-revalidate
  127. @brainwane WTF responses Code: 200 Reason: Forbidden

  128. @brainwane WTF responses Code: 404 Reason: Apple WebObjects

  129. @brainwane WTF responses Code: 404 Reason: forbidden

  130. @brainwane WTF responses Code: 434 Reason: HTTP/1.1 434

  131. @brainwane WTF responses Code: 451 Reason: Unknown Reason-Phrase

  132. @brainwane WTF responses Code: 503 Reason: Backend is unhealthy

  133. @brainwane WTF responses Code: 520 Reason: Origin Error

  134. @brainwane WTF responses Code: 525 Reason: Origin SSL Handshake Error

  135. @brainwane WTF responses Code: 533 Reason: mtd::http: Unknown: Banned

  136. @brainwane WTF responses Code: 732 Reason: http://www.[hostname].com/intro/copyright.php

  137. @brainwane WTF responses Code: 999 Reason: Request denied

  138. @brainwane Changing Reason-phrases more at https://gitlab.com/http-can-do-that

  139. @brainwane

  140. @brainwane

  141. @brainwane Bespoke status codes/reasons – good idea?

  142. @brainwane Conclusion

  143. @brainwane There’s so much more

    • “Don’t cache this”
    • Pragma – pass instructions to server/client
    • CONNECT, TRACE, LINK, & UNLINK methods
    • 409 Conflict
    • Look-before-you-leap requests
    • Resources at HTTPS vs. HTTP URLs can differ
    • “q” and preference ranking in the Accept header
    • Content-Disposition (e.g. “attachment”)
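
As one example from this list, the “q” values in an Accept header rank the client's preferences from 0 to 1, with 1 assumed when q is missing. A simplified sketch of that ranking follows; `rank_accept` is a hypothetical helper, and it ignores the rule that more specific media types beat wildcards at equal q.

```python
def rank_accept(header: str):
    """Return media types from an Accept header, most preferred first."""
    ranked = []
    for position, item in enumerate(header.split(",")):
        media_type, *params = [part.strip() for part in item.split(";")]
        q = 1.0  # default weight when no q parameter is given
        for param in params:
            key, _, value = param.partition("=")
            if key.strip() == "q":
                q = float(value)
        # Sort by descending q, then by original order for ties.
        ranked.append((-q, position, media_type))
    return [media_type for _, _, media_type in sorted(ranked)]


print(rank_accept("text/html;q=0.8, application/json, */*;q=0.1"))
# ['application/json', 'text/html', '*/*']
```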
  144. @brainwane The feeling of power The sense of wonder

  145. @brainwane What might the web have been? What might it

    still be?
  146. @brainwane Read & play

    • RFCs 7230-7235
    • requests
    • netcat, wget, netstat, telnet
    • basic HTTP servers (in your favorite language)
    • https://gitlab.com/http-can-do-that
  147. @brainwane Thanks

    Leonard Richardson; Greg Hendershott; Zack Weinberg; The Recurse Center; Clay Hallock; Paul Tagliamonte; Open Source Bridge; Julia Evans, Allison Kaptur, Amy Hanlon, and Katie Silverio
  148. Thank you Sumana Harihareswara http://changeset.nyc @brainwane https://gitlab.com/http-can-do-that sumanah@panix.com