
Sumana Harihareswara - HTTP Can Do That?!


Learn how to get more performance, testability, and flexibility out of your web apps, using features already built into HTTP. I'll walk you through case studies exploring good (and bad) ideas, using Python, your browser, netcat, and other common tools.

https://us.pycon.org/2016/schedule/presentation/1577/

PyCon 2016

May 29, 2016


Transcript

  1. HTTP Can Do That?!
    A collection of bad ideas
    by Sumana Harihareswara
    @brainwane
    Changeset Consulting


  2. @brainwane
    HTTP
    Hypertext
    Transfer
    Protocol


  3. @brainwane
    Diagrams!
    – Internet Engineering Task Force (IETF) RFC 7230
    Hypertext Transfer Protocol (HTTP/1.1): Message Syntax and Routing


  4. @brainwane
    HTTP: crash course
    Client
    Server


  5. @brainwane
    Client
    Server
    Request
    HTTP: crash course


  6. @brainwane
    Client
    Server
    Request
    Response
    HTTP: crash course


  7. @brainwane
    An HTTP Message
    (Request or Response)
    START-LINE
    HTTP version (1.1)
    Request method (GET, POST)
    Response status code (200, 404, 500)


  8. @brainwane
    An HTTP Message
    (Request or Response)
    START-LINE
    HTTP version (1.1)
    Request method (GET, POST)
    Response status code (200, 404, 500)
    HEADERS
    Content-Type
    Content-Length
    ...


  9. @brainwane
    An HTTP Message
    (Request or Response)
    START-LINE
    HTTP version (1.1)
    Request method (GET, POST)
    Response status code (200, 404, 500)
    HEADERS
    Content-Type
    Content-Length
    ...
    BODY


  10. @brainwane
    Example Request
    START-LINE
    GET / HTTP/1.1
    HEADERS
    Host: www.sumana.biz
    Accept: text/html
    User-Agent: ScraperBot
    BODY


  11. @brainwane
    Example Response
    START-LINE
    HTTP/1.1 200 OK
    HEADERS
    Content-Type: text/html
    Content-Length: 203
    Date: Tue, 16 Jun 2015 16:21:56 GMT
    Last-Modified: Tue, 16 Jun 2015 13:27:14 GMT
    BODY
    Welcome to Sumanaville
    Rockin'
    This is a pretty
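The three-part layout on these slides (start-line, headers, body) can be pulled apart with a few lines of Python. This is a minimal sketch, not code from the talk: `parse_http_message` is a hypothetical helper, it keeps only the last value of a repeated header, and it assumes the whole message is already in memory.

```python
def parse_http_message(raw: bytes):
    """Split a raw HTTP/1.1 message into (start_line, headers, body)."""
    # A blank line (CRLF CRLF) separates the head from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    start_line = lines[0]
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Field names are case-insensitive, so store them lowercased.
        headers[name.strip().lower()] = value.strip()
    return start_line, headers, body


# The example request from the slides, written out byte for byte.
request = (
    b"GET / HTTP/1.1\r\n"
    b"Host: www.sumana.biz\r\n"
    b"Accept: text/html\r\n"
    b"User-Agent: ScraperBot\r\n"
    b"\r\n"
)
start_line, headers, body = parse_http_message(request)
print(start_line)
print(headers)
print(body)
```

A GET request normally has an empty body, which is why the last value printed is an empty byte string.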


  12. @brainwane
    Methods


  13. @brainwane

    Popular request methods (“verbs”)
    GET
    gimme
    POST
    here you go


  14. @brainwane
    First bad idea:
    POST but not GET
    more:
    https://gitlab.com/http-can-do-that/secureapi


  15. @brainwane
    POST but not GET:
    use cases
    letters to Santa Claus


  16. @brainwane
    POST but not GET:
    use cases
    employee suggestion box


  17. @brainwane
    POST but not GET:
    use cases
    extremely moderated blog comments


  18. @brainwane
    (a logistical note)


  19. @brainwane


  20. @brainwane


  21. @brainwane
    Bad Idea Scale


  22. @brainwane
    Giving client no way to GET –
    bad idea
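A write-only endpoint like the suggestion box can be sketched with Python's standard `http.server`. This is an illustration under stated assumptions, not the code from the talk's `secureapi` repository: the `SuggestionBox` class, the `/suggestions` path, and the choice of 202 and 405 as status codes are all hypothetical.

```python
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer


class SuggestionBox(BaseHTTPRequestHandler):
    """Hypothetical write-only endpoint: accepts POST, refuses GET."""

    def log_message(self, *args):
        pass  # keep the demo quiet

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        self.server.inbox.append(self.rfile.read(length))
        self.send_response(202)  # accepted, but nothing to read back
        self.send_header("Content-Length", "0")
        self.end_headers()

    def do_GET(self):
        self.send_response(405)  # Method Not Allowed
        self.send_header("Allow", "POST")
        self.send_header("Content-Length", "0")
        self.end_headers()


def call(port, method, path, body=None):
    """Send one request; return (status, value of the Allow header)."""
    conn = HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request(method, path, body=body)
        resp = conn.getresponse()
        resp.read()
        return resp.status, resp.getheader("Allow")
    finally:
        conn.close()


def demo():
    server = HTTPServer(("127.0.0.1", 0), SuggestionBox)  # port 0: any free port
    server.inbox = []
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        posted = call(server.server_port, "POST", "/suggestions", b"More snacks")
        fetched = call(server.server_port, "GET", "/suggestions")
        return posted, fetched, server.inbox
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```

The `Allow` header on the 405 response tells the client which methods would have worked.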


  23. @brainwane
    Remember “CRUD”?
    Create
    Read
    Delete
    Update


  24. @brainwane
    Remember “CRUD”?
    Create: POST
    Read: GET
    Delete: POST
    Update: POST


  25. @brainwane
    Remember “CRUD”?
    Create: POST
    Read: GET
    Delete: POST
    Update: POST


  26. @brainwane
    Remember “CRUD”?
    Create: POST
    Read: GET
    Delete: POST
    Update: POST
    INELEGANT!
    INELEGANT!


  27. @brainwane
    Underappreciated methods
    DELETE
    delete a resource!


  28. @brainwane
    Implementing DELETE


  29. @brainwane
    Implementing DELETE


  30. @brainwane
    Implementing DELETE


  31. @brainwane
    Implementing DELETE


  32. @brainwane
    Implementing DELETE


  33. @brainwane
    Implementing DELETE


  34. @brainwane
    Implementing DELETE


  35. @brainwane
    Implementing DELETE


  36. @brainwane
    Implementing DELETE


  37. @brainwane
    Implementing DELETE
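The code shown on the "Implementing DELETE" slides did not survive in this transcript. As a stand-in, here is a minimal sketch of a DELETE handler using the standard library. The `CardHandler` class, the `/cards/5` path, and the in-memory `cards` dictionary are assumptions made for illustration; the talk's own implementation may look quite different.

```python
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer


class CardHandler(BaseHTTPRequestHandler):
    """Hypothetical resource store that supports GET and DELETE."""

    def log_message(self, *args):
        pass  # keep the demo quiet

    def do_GET(self):
        card = self.server.cards.get(self.path)
        if card is None:
            self.send_error(404)
            return
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(card)))
        self.end_headers()
        self.wfile.write(card)

    def do_DELETE(self):
        # Remove the resource named by the path; 404 if it is not there.
        if self.server.cards.pop(self.path, None) is None:
            self.send_error(404)
            return
        self.send_response(204)  # No Content: deleted, nothing more to say
        self.end_headers()


def status_of(port, method, path):
    conn = HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request(method, path)
        resp = conn.getresponse()
        resp.read()
        return resp.status
    finally:
        conn.close()


def demo():
    server = HTTPServer(("127.0.0.1", 0), CardHandler)
    server.cards = {"/cards/5": b"ace of spades"}
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        steps = ["GET", "DELETE", "GET", "DELETE"]
        return [status_of(server.server_port, m, "/cards/5") for m in steps]
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```

Whether a repeated DELETE should answer 404 or 204 is a design choice; this sketch reports 404 once the resource is gone.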


  38. @brainwane
    DELETE – good idea?


  39. @brainwane
    Underappreciated methods
    PUT
    “here you go”


  40. @brainwane
    Wait
    I thought POST meant “here you go”


  41. @brainwane
    So what is POST,
    anyway?
    The standard says it means:
    “Above our pay grade; take this to the boss”
    a.k.a. Overloaded POST


  42. @brainwane
    So what is POST,
    anyway?
    Often, we use it for:
    “Create a new item in this set”
    a.k.a. POST-to-append


  43. @brainwane


  44. @brainwane
    PUT vs. POST
    PUT /cards/5
    Body: [picture]
    Means: “Put this picture at /cards/5.”
    POST /cards/5
    Body: [picture]
    Means: “Tell the webapp that this picture applies to /cards/5 somehow – figure it out.”
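The difference shows up clearly when each request is sent twice. In this sketch (a hypothetical `CardHandler`, with made-up starting IDs and an in-memory store), PUT writes to the URL the client names, while POST is used in its common POST-to-append role and lets the server choose the URL. The handler treats any POST as an append to the collection, which is a simplification.

```python
import itertools
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer


class CardHandler(BaseHTTPRequestHandler):
    def log_message(self, *args):
        pass  # keep the demo quiet

    def _read_body(self):
        return self.rfile.read(int(self.headers.get("Content-Length", 0)))

    def do_PUT(self):
        # PUT: the client names the URL; repeating the request changes nothing.
        existed = self.path in self.server.cards
        self.server.cards[self.path] = self._read_body()
        self.send_response(204 if existed else 201)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def do_POST(self):
        # POST-to-append: the server picks a URL for each new item.
        new_path = f"/cards/{next(self.server.ids)}"
        self.server.cards[new_path] = self._read_body()
        self.send_response(201)
        self.send_header("Location", new_path)
        self.send_header("Content-Length", "0")
        self.end_headers()


def call(port, method, path, body):
    conn = HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request(method, path, body=body)
        resp = conn.getresponse()
        resp.read()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def demo():
    server = HTTPServer(("127.0.0.1", 0), CardHandler)
    server.cards = {}
    server.ids = itertools.count(100)  # arbitrary starting ID for the sketch
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_port
        puts = [call(port, "PUT", "/cards/5", b"picture") for _ in range(2)]
        posts = [call(port, "POST", "/cards", b"picture") for _ in range(2)]
        return puts, posts, sorted(server.cards)
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```

Two identical PUTs leave one card at `/cards/5`; two identical POSTs create two separate cards.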


  45. @brainwane
    “CRUD” & HTTP verbs
    Create: PUT
    Read: GET
    Delete: DELETE
    Update: PUT


  46. @brainwane
    PUT – good idea?


  47. @brainwane

    More underused methods
    PATCH
    update just part of this document/resource


  48. @brainwane
    PATCH – good idea?


  49. @brainwane

    More underused methods
    PATCH
    update just part of this document/resource
    OPTIONS
    ask what verbs the client’s allowed to use (for a specific path, or server-wide)
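An OPTIONS handler can answer with an `Allow` header listing the verbs it supports. The sketch below builds that list from the handler's own `do_*` methods, which is one possible approach rather than a standard recipe; the `Handler` class is hypothetical.

```python
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer


class Handler(BaseHTTPRequestHandler):
    def log_message(self, *args):
        pass  # keep the demo quiet

    def do_GET(self):
        body = b"hello"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_OPTIONS(self):
        # http.server dispatches on do_<METHOD>, so those names are the verbs.
        allowed = sorted(n[3:] for n in dir(self) if n.startswith("do_"))
        self.send_response(204)
        self.send_header("Allow", ", ".join(allowed))
        self.end_headers()


def demo():
    server = HTTPServer(("127.0.0.1", 0), Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = HTTPConnection("127.0.0.1", server.server_port, timeout=5)
    try:
        conn.request("OPTIONS", "/")
        resp = conn.getresponse()
        resp.read()
        return resp.status, resp.getheader("Allow")
    finally:
        conn.close()
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```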


  50. @brainwane
    OPTIONS – good idea?


  51. @brainwane
    A super-cool method
    HEAD
    like GET, but just for metadata


  52. @brainwane
    GET vs. HEAD
    Request: GET / HTTP/1.1
    Response: start-line, headers, body
    Request: HEAD / HTTP/1.1
    Response: start-line, headers


  53. @brainwane
    HEAD saves time


  54. @brainwane
    HEAD saves time


  55. @brainwane
    HEAD saves time


  56. @brainwane
    You don’t need the body to check:
    Does it exist?
    Do I have permission to GET it?
    Content-Length
    Last-Modified
    Content-Type
    ETag
    Retry-After
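A small experiment makes the point: send HEAD and GET to the same path and compare. The sketch below assumes a hypothetical local `Handler` that serves one fixed page; the HEAD response carries the same headers as GET, including `Content-Length`, but no body.

```python
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer

PAGE = b"Welcome to Sumanaville"


class Handler(BaseHTTPRequestHandler):
    def log_message(self, *args):
        pass  # keep the demo quiet

    def _send_headers(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()

    def do_HEAD(self):
        self._send_headers()  # metadata only

    def do_GET(self):
        self._send_headers()
        self.wfile.write(PAGE)


def call(port, method):
    conn = HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request(method, "/")
        resp = conn.getresponse()
        return resp.status, resp.getheader("Content-Length"), resp.read()
    finally:
        conn.close()


def demo():
    server = HTTPServer(("127.0.0.1", 0), Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return call(server.server_port, "HEAD"), call(server.server_port, "GET")
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```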


  57. @brainwane
    HEAD – good idea?


  58. @brainwane
    Headers


  59. @brainwane
    Popular headers include:
    Content-Type
    Content-Length


  60. @brainwane
    Popular headers include:
    Content-Type
    Content-Length
    Also known as MIME or Mime


  61. @brainwane
    Popular headers include:
    Content-Type
    Content-Length
    text/*


  62. @brainwane
    Popular headers include:
    Content-Type
    Content-Length
    application/*


  63. @brainwane
    Popular headers include:
    Content-Type
    Content-Length
    chemical/*


  64. @brainwane
    Popular headers include:
    Content-Encoding
    Accept-Encoding
    Content-Language
    Accept-Language


  65. @brainwane
    More headers
    ETag
    If-Match
    If-None-Match
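These three headers work together for cache validation: the server labels a representation with an `ETag`, and the client sends it back in `If-None-Match` to ask "only send this if it changed". A minimal sketch follows, assuming a hypothetical `Handler` and a simplified exact-string comparison (no lists of tags, no weak validators, no `*`).

```python
import hashlib
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer

PAGE = b"Welcome to Sumanaville"
# Any stable fingerprint of the content works as an entity tag.
ETAG = '"' + hashlib.sha256(PAGE).hexdigest()[:16] + '"'


class Handler(BaseHTTPRequestHandler):
    def log_message(self, *args):
        pass  # keep the demo quiet

    def do_GET(self):
        if self.headers.get("If-None-Match") == ETAG:
            self.send_response(304)  # Not Modified: the client's copy is current
            self.send_header("ETag", ETAG)
            self.end_headers()
            return
        self.send_response(200)
        self.send_header("ETag", ETAG)
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()
        self.wfile.write(PAGE)


def call(port, headers=None):
    conn = HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request("GET", "/", headers=headers or {})
        resp = conn.getresponse()
        return resp.status, resp.getheader("ETag"), resp.read()
    finally:
        conn.close()


def demo():
    server = HTTPServer(("127.0.0.1", 0), Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        first = call(server.server_port)
        second = call(server.server_port, {"If-None-Match": first[1]})
        return first, second
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```

The second response has status 304 and an empty body, so the page is transferred only once.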


  66. @brainwane
    More headers
    If-Modified-Since
    If-Unmodified-Since
    Last-Modified
    Cache-Control


  67. @brainwane
    A popular header
    User-Agent


  68. @brainwane
    An unpopular header
    From
    The email address
    of the person making the request


  69. @brainwane
    Uses for From
    Really bad auth


  70. @brainwane
    Uses for From
    “Yes, I saw your site launch”


  71. @brainwane
    Uses for From
    Coded messages
    meant for network surveillor


  72. @brainwane
    From – bad idea


  73. @brainwane
    Another spy trick
    “Each header field consists of a case-insensitive
    field name followed by a colon (":")...”
    So: vary the case of the headers you send!!!
    – Internet Engineering Task Force (IETF) RFC 7230
    Hypertext Transfer Protocol (HTTP/1.1): Message Syntax and Routing
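To see why this trick works at all, here is a toy sketch. `hide_bits` and `reveal_bits` are hypothetical helpers invented for this illustration: they encode one bit per letter in the casing of a field name, and a case-insensitive parser (here the standard `email` parser, whose `Message` class is also the base of `http.client.HTTPMessage`) still reads the header normally.

```python
from email import message_from_string


def hide_bits(name, bits):
    """Encode bits in letter case: 1 becomes upper case, 0 becomes lower case."""
    remaining = iter(bits)
    out = []
    for ch in name:
        if ch.isalpha():
            # Letters beyond the end of the bit list default to lower case.
            out.append(ch.upper() if next(remaining, 0) else ch.lower())
        else:
            out.append(ch)
    return "".join(out)


def reveal_bits(name):
    """Recover one bit per letter from the casing of a field name."""
    return [1 if ch.isupper() else 0 for ch in name if ch.isalpha()]


secret = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1]  # "Content-Type" has 11 letters
disguised = hide_bits("Content-Type", secret)

# A normal parser ignores the casing and sees an ordinary header.
message = message_from_string(f"{disguised}: text/html\n\n")
print(disguised)
print(message["content-type"])
print(reveal_bits(disguised))
```

Anything that normalizes header casing along the way, such as a proxy, would destroy the hidden bits, which is one more reason this stays on the bad-idea list.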


  74. @brainwane
    Header casing
    as secret info channel –
    bad idea


  75. @brainwane
    A popular header
    Host


  76. @brainwane
    A required header
    Host
    required in request messages


  77. @brainwane
    Host & path work together
    $ netcat myhostname.tld 80
    GET /bicycle HTTP/1.1
    Host: myhostname.tld


  78. @brainwane
    Host & path work together


  79. @brainwane
    Host & path work together


  80. @brainwane
    Host & path work together


  81. @brainwane
    Host & path work together


  82. @brainwane
    A popular header
    Host
    (wait –
    why do we need to repeat this?
    It's in the URL!
    right?)


  83. @brainwane
    How Host helps
    HTTP
    is separate from
    the Domain Name System


  84. @brainwane
    How Host helps
    Host
    helps route requests
    among different domains
    that sit on the same server
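Name-based virtual hosting can be imitated on one local socket. In this sketch the `SITES` table and the `*.example.test` hostnames are made up, and real web servers do this routing in configuration rather than in a handler; the point is only that the same IP address and port return different content depending on the `Host` header.

```python
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical sites that all live on the same server.
SITES = {
    "www.example.test": b"main site",
    "bugs.example.test": b"bug tracker",
}


class VirtualHostHandler(BaseHTTPRequestHandler):
    def log_message(self, *args):
        pass  # keep the demo quiet

    def do_GET(self):
        # Strip an optional ":port" suffix and compare case-insensitively.
        host = self.headers.get("Host", "").split(":")[0].lower()
        body = SITES.get(host)
        if body is None:
            self.send_error(404)
            return
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def fetch(port, host):
    conn = HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        # Passing Host explicitly overrides the one http.client would send.
        conn.request("GET", "/", headers={"Host": host})
        resp = conn.getresponse()
        return resp.status, resp.read()
    finally:
        conn.close()


def demo():
    server = HTTPServer(("127.0.0.1", 0), VirtualHostHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_port
        return fetch(port, "www.example.test"), fetch(port, "bugs.example.test")
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```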


  85. @brainwane
    Examples of virtual hosts
    www.debian.org


  86. @brainwane
    Examples of virtual hosts
    bugs.debian.org


  87. @brainwane
    Examples of virtual hosts
    lists.debian.org


  88. @brainwane
    Examples of virtual hosts
    wiki.debian.org


  89. @brainwane
    But watch out...


  90. @brainwane
    But watch out...


  91. @brainwane
    A spam story


  92. @brainwane
    A spam story
    My 404 logs (Drupal admin console):
    TYPE page not found
    DATE Thursday, October 9, 2014 - 10:46
    USER Anonymous (not verified)
    LOCATION http://myphishingsite.biz/http://myphishingsite.biz
    REFERRER
    MESSAGE ttp://myphishingsite.biz
    SEVERITY warning
    HOSTNAME [IP address]


  93. @brainwane
    A spam story
    My 404 logs (Drupal admin console):
    TYPE page not found
    DATE Thursday, October 9, 2014 - 10:46
    USER Anonymous (not verified)
    LOCATION http://myphishingsite.biz/http://myphishingsite.biz
    REFERRER
    MESSAGE ttp://myphishingsite.biz
    SEVERITY warning
    HOSTNAME [IP address]


  94. @brainwane
    A spam story
    My access logs:
    [IP address] - - [09/Oct/2014:10:46:09 -0400] "GET http://myphishingsite.biz HTTP/1.1" 404 7574 "-" [User-Agent]


  95. @brainwane
    A spam story
    Legit mistakes would look like:
    [IP address] - - [09/Oct/2014:10:46:09 -0400] "GET /http://berkeley.edu HTTP/1.1" 404 7574 "-" [User-Agent]


  96. @brainwane
    A spam story
    Intentionally malform your request!
    $ netcat myhostname.tld 80
    GET http://spam.com HTTP/1.1
    Host: spam.com


  97. @brainwane
    A spam story
    Intentionally malform your request!
    $ netcat myhostname.tld 80
    GET /viagra-bitcoin HTTP/1.1
    Host: spam.com


  98. @brainwane
    404 spamming –
    bad idea


  99. @brainwane
    Define your own header!
    “Header fields are fully extensible: there is no limit
    on the introduction of new field names, each
    presumably defining new semantics, nor on the
    number of header fields used in a given message.”
    -(RFC 7230)


  100. @brainwane
    Define your own header!
    X-blah-blah-blah


  101. @brainwane
    Define your own header!
    X-Wikimedia-Debug


  102. @brainwane
    Define your own header!
    X-Wikimedia-Debug
    an HTTP request header

    Backend selection (Varnish)

    Caching behavior

    Request profiling (record a trace)

    Debug logs

    Read-only mode

    Browser extensions
    More: https://wikitech.wikimedia.org/wiki/X-Wikimedia-Debug
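A custom request header of this kind can be sketched in a few lines. Everything named here is hypothetical: the `X-Example-Debug` and `X-Example-Backend` headers and the two fake backends are inventions for illustration and do not describe how Wikimedia implements `X-Wikimedia-Debug`.

```python
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer

# Two pretend backends; the request header decides which one answers.
BACKENDS = {
    "default": b"served by the default backend",
    "debug": b"served by the debug backend",
}


class Handler(BaseHTTPRequestHandler):
    def log_message(self, *args):
        pass  # keep the demo quiet

    def do_GET(self):
        # Any request carrying the made-up header is routed to "debug".
        wants_debug = self.headers.get("X-Example-Debug") is not None
        name = "debug" if wants_debug else "default"
        body = BACKENDS[name]
        self.send_response(200)
        self.send_header("X-Example-Backend", name)  # also a custom header
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def fetch(port, headers=None):
    conn = HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request("GET", "/", headers=headers or {})
        resp = conn.getresponse()
        return resp.getheader("X-Example-Backend"), resp.read()
    finally:
        conn.close()


def demo():
    server = HTTPServer(("127.0.0.1", 0), Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_port
        return fetch(port), fetch(port, {"X-Example-Debug": "1"})
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```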


  103. @brainwane
    Define your own header!


  104. @brainwane
    Define your own header!


  105. @brainwane
    Define your own header!


  106. @brainwane
    Define your own header!


  107. @brainwane
    Define your own header!
    https://gitlab.com/http-can-do-that/novel-titles


  108. @brainwane
    Define your own header!


  109. @brainwane
    Defining your own headers –
    good idea?


  110. @brainwane
    Status codes


  111. @brainwane
    Status codes
    100 & 101: Informational
    2xx: Successful
    3xx: Redirection
    4xx: Client error
    5xx: Server error


  112. @brainwane
    Status (response) codes
    404 Not Found
    Code Reason-phrase


  113. @brainwane
    Status (response) codes
    “A client SHOULD ignore
    the reason-phrase content.”


  114. @brainwane
    Heard of these?

    410 Gone
    It was here, but now it’s not.

    304 Not Modified
    You said, ‘GET this, if it’s been modified since
    [date]’. It hasn’t been.


  115. @brainwane
    451 Unavailable For Legal
    Reasons
    Server is legally required
    to reject client’s request


  116. @brainwane
    451 Unavailable For Legal
    Reasons
    Can’t let you see that; it’s censored.


  117. @brainwane
    451 Unavailable For Legal
    Reasons
    “This is considered a client-side error even
    though the request is well formed and the legal
    requirement exists on the server side. After all, that
    representation was censored for a reason. There
    must be something wrong with you, citizen.”
    -RESTful Web APIs,
    Leonard Richardson & Mike Amundsen


  118. @brainwane
    451 Unavailable For Legal
    Reasons


  119. @brainwane
    451 – good idea?


  120. @brainwane
    WTF responses
    All of these
    were found in the wild


  121. @brainwane
    WTF responses
    Code: 126
    Reason: Incorrect key file for table
    '/tmp/mysqltmp/#sql_13fb_2.MYI'; try to repair it
    SQL=SHOW FULL COLUMNS FROM
    `y4dnu_extensions`


  122. @brainwane
    WTF responses
    Code: 301
    Reason: explicit_header_response_code


  123. @brainwane
    WTF responses
    Code: 403
    Reason: You've got to ask yourself one question:
    Do I feel lucky?


  124. @brainwane
    WTF responses
    Code: 403
    Reason: can't put wasabi in bed


  125. @brainwane
    WTF responses
    Code: 404
    Reason: HTTP/1.1 404


  126. @brainwane
    WTF responses
    Code: 404
    Reason: Not Found"); ?>Control: no-store, no-cache, must-revalidate


  127. @brainwane
    WTF responses
    Code: 200
    Reason: Forbidden


  128. @brainwane
    WTF responses
    Code: 404
    Reason: Apple WebObjects


  129. @brainwane
    WTF responses
    Code: 404
    Reason: forbidden


  130. @brainwane
    WTF responses
    Code: 434
    Reason: HTTP/1.1 434


  131. @brainwane
    WTF responses
    Code: 451
    Reason: Unknown Reason-Phrase


  132. @brainwane
    WTF responses
    Code: 503
    Reason: Backend is unhealthy


  133. @brainwane
    WTF responses
    Code: 520
    Reason: Origin Error


  134. @brainwane
    WTF responses
    Code: 525
    Reason: Origin SSL Handshake Error


  135. @brainwane
    WTF responses
    Code: 533
    Reason: mtd::http: Unknown: Banned


  136. @brainwane
    WTF responses
    Code: 732
    Reason:
    http://www.[hostname].com/intro/copyright.php


  137. @brainwane
    WTF responses
    Code: 999
    Reason: Request denied


  138. @brainwane
    Changing Reason-phrases
    more at https://gitlab.com/http-can-do-that
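Changing the reason-phrase takes one extra argument in Python's `http.server`: `send_response` accepts an optional message after the status code. The sketch below borrows a phrase from the examples found in the wild; the `Handler` class is hypothetical and is not taken from the linked repository.

```python
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer


class Handler(BaseHTTPRequestHandler):
    def log_message(self, *args):
        pass  # keep the demo quiet

    def do_GET(self):
        # The second argument replaces the standard phrase ("Forbidden").
        self.send_response(403, "can't put wasabi in bed")
        self.send_header("Content-Length", "0")
        self.end_headers()


def demo():
    server = HTTPServer(("127.0.0.1", 0), Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = HTTPConnection("127.0.0.1", server.server_port, timeout=5)
    try:
        conn.request("GET", "/")
        resp = conn.getresponse()
        resp.read()
        return resp.status, resp.reason
    finally:
        conn.close()
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    status, reason = demo()
    # A well-behaved client branches on the number, never on the phrase.
    print("forbidden" if status == 403 else "something else", "/", reason)
```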


  139. @brainwane


  140. @brainwane


  141. @brainwane
    Bespoke status codes/reasons –
    good idea?


  142. @brainwane
    Conclusion


  143. @brainwane
    There’s so much more

    “Don’t cache this”

    Pragma – pass instructions to server/client

    CONNECT, TRACE, LINK, & UNLINK methods

    409 Conflict

    Look-before-you-leap requests

    Resources at HTTPS vs. HTTP URLs can differ

    “q” and preference ranking in the Accept header

    Content-Disposition (e.g. “attachment”)
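As one taste of the items above, here is a rough sketch of "q" preference ranking in an `Accept` header. `rank_accept` is a hypothetical helper: it sorts media types by their q value (default 1.0) and keeps the original order for ties, but it ignores the specificity rules and malformed input that a real implementation would need to handle.

```python
def rank_accept(header):
    """Return the media types in an Accept header, most preferred first."""
    ranked = []
    for position, item in enumerate(header.split(",")):
        media_type, *params = [part.strip() for part in item.split(";")]
        quality = 1.0  # a missing q parameter means full preference
        for param in params:
            key, _, value = param.partition("=")
            if key.strip().lower() == "q":
                quality = float(value)
        # Negative quality sorts high values first; position breaks ties.
        ranked.append((-quality, position, media_type))
    return [media_type for _, _, media_type in sorted(ranked)]


preferences = rank_accept("text/html;q=0.8, application/json, */*;q=0.1")
print(preferences)
```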


  144. @brainwane
    The feeling of power
    The sense of wonder


  145. @brainwane
    What might the web have been?
    What might it still be?


  146. @brainwane
    Read & play

    RFCs 7230-7235

    requests

    netcat, wget, netstat, telnet

    basic HTTP servers (in your favorite language)

    https://gitlab.com/http-can-do-that


  147. @brainwane
    Thanks
    Leonard Richardson
    Greg Hendershott
    Zack Weinberg
    The Recurse Center
    Clay Hallock
    Paul Tagliamonte
    Open Source Bridge
    Julia Evans, Allison Kaptur, Amy Hanlon, and
    Katie Silverio


  148. Thank you
    Sumana Harihareswara
    http://changeset.nyc
    @brainwane
    https://gitlab.com/http-can-do-that
