A look into HTTP.rb (And why you shouldn't use Net::HTTP)

Janko Marohnić

October 26, 2017

Transcript

  1. A Look Into HTTP.rb And why you shouldn’t use Net::HTTP

    janko-m @jankomarohnic
  2. Implementation

    Net::HTTP variants: Net::HTTP (pure ruby), REST Client (Net::HTTP), HTTParty (Net::HTTP), open-uri (Net::HTTP)
    libcurl variants: Typhoeus (libcurl), Curb (libcurl), Patron (libcurl)
    wrapper: Faraday (wrapper)
    pure ruby: HTTP.rb (pure ruby), HTTPClient (pure ruby)

    https://www.slideshare.net/HiroshiNakamura/rubyhttp-clients-comparison
  3. Let’s just use Net::HTTP. What could possibly go wrong?

  4. Net::HTTP.get(URI("https://example.com")) #=> "<!doctype html>…"
    Net::HTTP.get_response(URI("https://example.com")) #=> #<Net::HTTPOK 200 OK readbody=true>
    Net::HTTP.post(URI("https://example.com"), "") #=> #<Net::HTTPOK 200 OK readbody=true>
    Net::HTTP.post_form(URI("https://example.com"), {}) #=> #<Net::HTTPOK 200 OK readbody=true>
    Net::HTTP.put(URI("https://example.com")) # NoMethodError: undefined method `put'
    Net::HTTP.delete(URI("https://example.com")) # NoMethodError: undefined method `delete'
  5. None
  6. uri = URI.parse("https://example.com/path")
    use_ssl = uri.is_a?(URI::HTTPS)
    Net::HTTP.start(uri.host, uri.port, use_ssl: use_ssl) do |http|
      http.post(uri.path, URI.encode_www_form(params))
    end

      1. parse the URL string
      2. determine whether we need to use SSL
      3. open the TCP connection
      4. encode the post parameters
      5. send the request
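
A minimal sketch of the offline part of those five steps, using only the standard library. The helper name `build_form_post` and the sample parameters are made up for illustration; the helper covers steps 1, 2 and 4 and leaves the network steps as a comment.

```ruby
require "net/http"
require "uri"

# Hypothetical helper (not part of Net::HTTP): prepares a form POST without
# touching the network, so each intermediate value can be inspected.
def build_form_post(url, params)
  uri     = URI.parse(url)             # 1. parse the URL string
  use_ssl = uri.is_a?(URI::HTTPS)      # 2. determine whether we need to use SSL
  request = Net::HTTP::Post.new(uri)
  request.set_form_data(params)        # 4. encode the post parameters
  [uri, use_ssl, request]
end

uri, use_ssl, request = build_form_post("https://example.com/path", "q" => "ruby", "page" => 2)
puts request.body

# Steps 3 and 5 (open the TCP connection, send the request) would then be:
#   Net::HTTP.start(uri.host, uri.port, use_ssl: use_ssl) { |http| http.request(request) }
```

Even in this reduced form the caller has to juggle three objects before anything is sent.
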
  7. uri = URI.parse("http://example.com/path")
    begin
      Net::HTTP.start(uri.host, uri.port) do |http|
        http.post(uri.path, URI.encode_www_form(params))
      end
    rescue SocketError, EOFError, IOError, Errno::ECONNABORTED, Errno::ECONNRESET,
           Errno::EINVAL, Errno::ETIMEDOUT, Errno::EHOSTUNREACH, Errno::ENETUNREACH,
           Errno::ECONNREFUSED, Errno::EPIPE
      retry
    end
  8. uri = URI.parse("http://example.com/path")
    begin
      Net::HTTP.start(uri.host, uri.port) do |http|
        http.post(uri.path, URI.encode_www_form(params))
      end
    rescue SocketError, EOFError, IOError, SystemCallError
      retry
    end

    SocketError, EOFError, IOError, SystemCallError = ConnectionError?
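
One way the "= ConnectionError?" idea could look in plain Ruby, as a sketch. The `ConnectionError` class and the `with_connection_retries` helper are hypothetical names, not something Net::HTTP provides; unlike the bare `retry` on the slide, the number of attempts is bounded.

```ruby
require "socket" # defines SocketError

# Hypothetical wrapper error, standing in for the low-level exceptions.
class ConnectionError < StandardError; end

LOW_LEVEL_ERRORS = [SocketError, EOFError, IOError, SystemCallError].freeze

# Runs the block, retrying on low-level network errors up to `attempts` times,
# then raises a single ConnectionError that callers can rescue.
def with_connection_retries(attempts: 3)
  tries = 0
  begin
    tries += 1
    yield tries
  rescue *LOW_LEVEL_ERRORS => error
    retry if tries < attempts
    raise ConnectionError, "#{error.class}: #{error.message}"
  end
end

# Simulated flaky call: fails twice, succeeds on the third attempt.
result = with_connection_retries do |attempt|
  raise Errno::ECONNRESET if attempt < 3
  "ok after #{attempt} attempts"
end
puts result
```
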
  9. Downsides of Net::HTTP

    • Wide and verbose interface
      • e.g. 3 mutually inconsistent ways of making requests
    • Poor OO design
    • Exposes low-level exceptions
    • Ugly codebase (it’s in stdlib)
  10. Implementation

    Net::HTTP variants: Net::HTTP (pure ruby), REST Client (Net::HTTP), HTTParty (Net::HTTP), open-uri (Net::HTTP)
    libcurl variants: Typhoeus (libcurl), Curb (libcurl), Patron (libcurl)
    wrapper: Faraday (wrapper)
    pure ruby: HTTP.rb (pure ruby), HTTPClient (pure ruby)

    https://www.slideshare.net/HiroshiNakamura/rubyhttp-clients-comparison
  11. Implementation

    Net::HTTP variants: Net::HTTP (pure ruby), REST Client (Net::HTTP), HTTParty (Net::HTTP), open-uri (Net::HTTP)
    libcurl variants: Typhoeus (libcurl), Curb (libcurl), Patron (libcurl)
    wrapper: Faraday (wrapper)
    pure ruby: HTTP.rb (pure ruby), HTTPClient (pure ruby)

    https://www.slideshare.net/HiroshiNakamura/rubyhttp-clients-comparison
  12. HTTP.rb gem "http"

  13. Net::HTTP:

    uri = URI.parse("https://example.com/path")
    use_ssl = uri.is_a?(URI::HTTPS)
    Net::HTTP.start(uri.host, uri.port, use_ssl: use_ssl) do |http|
      http.post(uri.path, URI.encode_www_form(params))
    end

    HTTP.rb:

    HTTP.post("https://example.com/path", form: params)
  14. • Pure ruby implementation • Clean and Chainable API • Correct URL parsing • Native timeouts • Persistent connections • Streaming bodies • Compressing and decompressing bodies
  15. • Pure ruby implementation • Clean and Chainable API • Correct URL parsing • Native timeouts • Persistent connections • Streaming bodies • Compressing and decompressing bodies
  16. • Pure ruby implementation • Clean and Chainable API • Correct URL parsing • Native timeouts • Persistent connections • Streaming bodies • Compressing and decompressing bodies
  17. response = HTTP.get("https://example.com")
    response # => #<HTTP::Response/1.1 200 OK …>
    response.status # => #<HTTP::Response::Status 200 OK>
    response.status.code # => 200
    response.status.ok? # => true
    response.status.success? # => true
    response.headers # => #<HTTP::Headers {…}>
    response.headers.to_h # => {"Content-Type"=>"text/html", …}
    response.body # => #<HTTP::Response::Body>
    response.body.to_s # => "<!doctype html>…"
  18. HTTP.headers("Accept" => "application/json")
      .basic_auth("janko", "password")
      .follow(max_hops: 2)
      .get("http://example.com")

    http = HTTP
      .headers("Accept" => "application/json")
      .basic_auth("janko", "password")
      .follow(max_hops: 2)
    http #=> #<HTTP::Client …>

    http.get("http://example.com/posts")
    http.get("http://example.com/posts/1/comments")
    http.post("http://example.com/posts/1/comments", json: {…})
  19. begin
      response = HTTP.get("https://example.com")
    rescue HTTP::ConnectionError
      retry
    end

    HTTP::Error
    ├── HTTP::ConnectionError
    ├── HTTP::RequestError
    ├── HTTP::ResponseError
    │   └── HTTP::StateError
    ├── HTTP::TimeoutError
    └── HTTP::HeaderError
  20. • Pure ruby implementation • Clean and Chainable API • Correct URL parsing • Native timeouts • Persistent connections • Streaming bodies • Compressing and decompressing bodies
  21. url = "https://movies.com/matrix[1999].mp4"
    Net::HTTP.get_response(URI(url)) #=> URI::InvalidURIError

    url = "https://movies.com/matrix[1999].mp4"
    url = URI.encode(url)
    Net::HTTP.get_response(URI(url)) #=> URI::InvalidURIError

    url = "https://movies.com/matrix[1999].mp4"
    url = URI.decode(url)
    url = URI.encode(url)
    Net::HTTP.get_response(URI(url)) #=> URI::InvalidURIError

    HTTP.get("https://movies.com/matrix[1999].mp4")
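
A sketch of what staying on the standard library would require for such a URL: percent-encoding the path by hand before passing it to `URI()`. The helper `escape_path` is hypothetical and deliberately narrow: it assumes an absolute URL whose path is not already encoded, and it ignores query strings and fragments. (`URI.encode` and `URI.decode` were removed in Ruby 3.0, so they are not an option on current Rubies.)

```ruby
require "uri"
require "erb"

# Hypothetical helper: percent-encodes every path segment, leaving the
# scheme and host untouched.
def escape_path(url)
  prefix, path = url.match(%r{\A([a-z][a-z0-9+.-]*://[^/]+)(/.*)?\z}i).captures
  segments = path.to_s.split("/", -1).map { |segment| ERB::Util.url_encode(segment) }
  prefix + segments.join("/")
end

escaped = escape_path("https://movies.com/matrix[1999].mp4")
puts escaped            # https://movies.com/matrix%5B1999%5D.mp4
puts URI(escaped).path  # /matrix%5B1999%5D.mp4
```
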
  22. • Pure ruby implementation • Clean and Chainable API • Correct URL parsing • Native timeouts • Persistent connections • Streaming bodies • Compressing and decompressing bodies
  23. require "timeout"
    Timeout.timeout(5) do
      HTTP.get("https://example.com")
    end
    # "Ensure" blocks might not get executed

    HTTP.timeout(connect: 3).get("http://example.com")
    HTTP.timeout(connect: 3, write: 3).get("http://example.com")
    HTTP.timeout(connect: 3, write: 3, read: 3).get("http://example.com")
    # 5 seconds allowed for entire call
    HTTP.timeout(:global, connect: 1, write: 2, read: 2).get("http://example.com")
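
To illustrate what a per-operation timeout can look like without `Timeout.timeout`, here is a sketch built on `IO.select`. It is not HTTP.rb's actual code; `ReadTimeout` and `read_with_timeout` are names invented for this example, and a pipe stands in for a socket.

```ruby
# Hypothetical error class for this sketch.
class ReadTimeout < StandardError; end

# Waits at most `seconds` for `io` to become readable, then reads what is
# available. No other thread interrupts the caller, so ensure blocks are safe.
def read_with_timeout(io, max_bytes, seconds)
  ready = IO.select([io], nil, nil, seconds) # nil when the wait timed out
  raise ReadTimeout, "no data within #{seconds}s" unless ready
  io.read_nonblock(max_bytes)
end

reader, writer = IO.pipe
writer.write("hello")
puts read_with_timeout(reader, 1024, 0.1)   # data is waiting, returns at once

begin
  read_with_timeout(reader, 1024, 0.1)      # pipe is now empty
rescue ReadTimeout => error
  puts "timed out: #{error.message}"
end
```
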
  24. • Pure ruby implementation • Clean and Chainable API • Correct URL parsing • Native timeouts • Persistent connections • Streaming bodies • Compressing and decompressing bodies
  25. HTTP.get("https://example.com") # connect + write + read + close
    HTTP.get("https://example.com") # connect + write + read + close
    HTTP.get("https://example.com") # connect + write + read + close

    HTTP.persistent("https://example.com") do |http|
      http.get("/") # connect + write + read
      http.get("/") # write + read
      http.get("/") # write + read
    end # close

    Typhoeus.get("https://example.com") # connect + write + read
    Typhoeus.get("https://example.com") # write + read
    Typhoeus.get("https://example.com") # write + read
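
The connect/close difference can be observed locally. The sketch below uses only the standard library (Net::HTTP rather than HTTP.rb, so it runs without extra gems) against a throwaway server on 127.0.0.1 that counts accepted TCP connections. The server and the `count_connections` helper are illustration-only assumptions.

```ruby
require "socket"
require "net/http"

# Returns [connections used by 3 separate requests,
#          connections used by 3 requests on one persistent connection].
def count_connections
  server      = TCPServer.new("127.0.0.1", 0) # port 0: let the OS pick one
  port        = server.addr[1]
  connections = 0

  acceptor = Thread.new do
    loop do
      socket = server.accept
      connections += 1
      Thread.new(socket) do |client|
        # Serve requests on this connection until the client closes it.
        while client.gets # request line
          loop do          # skip request headers
            line = client.gets
            break if line.nil? || line == "\r\n"
          end
          client.write("HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok")
        end
        client.close
      end
    end
  end

  3.times { Net::HTTP.get(URI("http://127.0.0.1:#{port}/")) }
  separate = connections

  Net::HTTP.start("127.0.0.1", port) { |http| 3.times { http.get("/") } }
  reused = connections - separate

  [separate, reused]
ensure
  acceptor&.kill
  server&.close
end

separate, reused = count_connections
puts "separate requests: #{separate} connections, persistent: #{reused} connection"
```
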
  26. • Pure ruby implementation • Clean and Chainable API • Correct URL parsing • Native timeouts • Persistent connections • Streaming bodies • Compressing and decompressing bodies
  27. source = Transcoder.call("matrix.mp4")
    source.size #=> 500 MB
    destination = File.open("matrix-transcoded.mp4", "w")

    # copies in 16 KB chunks, reusing one buffer
    while (chunk = source.read(16*1024, buffer ||= ""))
      destination.write(chunk)
    end

    # loads the whole source into memory first
    destination.write(source.read)

    # streams the copy in chunks internally
    IO.copy_stream(source, destination)
  28. require "socket"
    socket = TCPSocket.open("example.com", 80)
    socket.write "GET / HTTP/1.1" + "\r\n" +
                 "Host: example.com" + "\r\n" +
                 "Content-Length: 0" + "\r\n" +
                 "Connection: close" + "\r\n" +
                 "\r\n"
    socket.read #=> "HTTP/1.1 200 OK" + "\r\n" +
                #   "Content-Type: text/html" + "\r\n" +
                #   "Content-Length: 1270" + "\r\n" +
                #   "Connection: close" + "\r\n" +
                #   "\r\n" +
                #   "<!doctype html> …"
    socket.close
  29. require "socket"
    socket = TCPSocket.open("example.com", 80)
    socket.write "GET / HTTP/1.1" + "\r\n" +
                 "Host: example.com" + "\r\n" +
                 "Content-Length: #{body.size}" + "\r\n" +
                 "Connection: close" + "\r\n" +
                 "\r\n"
    socket.write body.read
    socket.read #=> "HTTP/1.1 200 OK" + "\r\n" +
                #   "Content-Type: text/html" + "\r\n" +
                #   "Content-Length: 1270" + "\r\n" +
                #   "Connection: close" + "\r\n" +
                #   "\r\n" +
                #   "<!doctype html> …"
    socket.close
  30. require "socket"
    socket = TCPSocket.open("example.com", 80)
    socket.write "GET / HTTP/1.1" + "\r\n" +
                 "Host: example.com" + "\r\n" +
                 "Content-Length: #{body.size}" + "\r\n" +
                 "Connection: close" + "\r\n" +
                 "\r\n"
    IO.copy_stream(body, socket) # streaming!
    socket.read #=> "HTTP/1.1 200 OK" + "\r\n" +
                #   "Content-Type: text/html" + "\r\n" +
                #   "Content-Length: 1270" + "\r\n" +
                #   "Connection: close" + "\r\n" +
                #   "\r\n" +
                #   "<!doctype html> …"
    socket.close
  31. require "socket"
    socket = TCPSocket.open("example.com", 80)
    socket.write "GET / HTTP/1.1" + "\r\n" +
                 "Host: example.com" + "\r\n" +
                 "Content-Length: #{body.size}" + "\r\n" +
                 "Connection: close" + "\r\n" +
                 "\r\n"
    IO.copy_stream(body, socket) # streaming!
    while (chunk = socket.readpartial(16*1024)) # streaming!
      # parse response
    end
    socket.close
  32. HTTP.post(url, body: "this is my body") # string
    HTTP.post(url, body: enumerable) # #each
    HTTP.post(url, body: io) # #read & #size

    # File streaming
    HTTP.post(url, body: File.open("path/to/file.txt"))
    # StringIO streaming
    HTTP.post(url, body: StringIO.new("content"))
    # Pipe streaming
    HTTP.post(url, body: IO.popen("shell command"))
    # Multipart form data streaming
    HTTP.post(url, form: { file: HTTP::FormData::File.new("path/to/file.txt") })
  33. response = HTTP.get("http://example.com/export.csv")
    response.body # nothing has been read yet
    response.body.to_s # reads whole response body
    # or
    response.body.readpartial # reads first chunk
    response.body.readpartial # reads next chunk
    # or
    response.body.each { |chunk| … } # yields chunks

    response = HTTP.get("http://example.com/export.csv")
    # reading headers before download
    fail "too large" if response.content_length > max_size
    # streaming download to disk
    File.open("export.csv", "w") do |file|
      response.body.each do |chunk|
        file.write(chunk)
      end
    end
  34. Streaming bodies ↓ Constant memory usage Less disk I/O
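
A small sketch of why chunked copying keeps memory constant. `stream_copy` is a hypothetical helper, and a `StringIO` stands in for a large file or response body; the helper reports the largest chunk it ever held, which stays at the chunk size no matter how big the source is.

```ruby
require "stringio"

# Copies `source` to `destination` in fixed-size chunks and returns the size
# of the largest chunk held in memory at any one time.
def stream_copy(source, destination, chunk_size: 16 * 1024)
  largest = 0
  while (chunk = source.read(chunk_size)) # nil at end of stream
    largest = [largest, chunk.bytesize].max
    destination.write(chunk)
  end
  largest
end

source      = StringIO.new("x" * 1_000_000) # stand-in for a large body
destination = StringIO.new

puts stream_copy(source, destination)  # 16384
puts destination.string.bytesize       # 1000000

# IO.copy_stream does the same chunked copy without the manual loop.
source.rewind
copy = StringIO.new
IO.copy_stream(source, copy)
```
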

  35. • Pure ruby implementation • Clean and Chainable API • Correct URL parsing • Native timeouts • Persistent connections • Streaming bodies • Compressing and decompressing bodies
  36. Request body

    HTTP.post(url, body: File.open("file.txt")) # as is
    # POST /path HTTP/1.1
    # Content-Length: 423847673
    # Content-Encoding: identity
    #
    # [raw content]

    # HTTP/1.1 200 OK
  37. Request body

    HTTP.use(:auto_deflate).post(url, body: File.open("file.txt")) # compression
    # POST /path HTTP/1.1
    # Content-Length: 214328782
    # Content-Encoding: gzip
    #
    # [compressed content]

    # HTTP/1.1 200 OK
  38. Response body

    response = HTTP.get("http://example.com/file.txt")
    # GET /file.txt HTTP/1.1

    # HTTP/1.1 200 OK
    # Content-Length: 423847673
    # Content-Encoding: identity
    #
    # [raw content]
    response.body.each { |chunk| … } # as is
  39. Response body

    response = HTTP.use(:auto_inflate).get("http://example.com/file.txt")
    # GET /file.txt HTTP/1.1

    # HTTP/1.1 200 OK
    # Content-Length: 214328782
    # Content-Encoding: gzip
    #
    # [compressed content]
    response.body.each { |chunk| … } # decompression
  40. Compressed bodies ↓ Faster upload/download Less network resources
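
To make the size difference concrete, here is a sketch of what `Content-Encoding: gzip` does to the bytes, using Ruby's bundled `zlib`. It does not reproduce HTTP.rb's `:auto_deflate` or `:auto_inflate` features; the `gzip` and `gunzip` helpers and the sample CSV data are assumptions for illustration.

```ruby
require "zlib"
require "stringio"

# Compresses an IO chunk by chunk and returns the gzip-encoded bytes.
def gzip(io, chunk_size: 16 * 1024)
  output = StringIO.new("".b)
  writer = Zlib::GzipWriter.new(output)
  while (chunk = io.read(chunk_size))
    writer.write(chunk)
  end
  writer.finish # writes the gzip trailer, leaves `output` open
  output.string
end

# Decompresses gzip-encoded bytes back into the original content.
def gunzip(data)
  Zlib::GzipReader.new(StringIO.new(data)).read
end

original   = "row,of,csv,data\n" * 10_000
compressed = gzip(StringIO.new(original))

puts "raw: #{original.bytesize} bytes, gzip: #{compressed.bytesize} bytes"
puts gunzip(compressed) == original # true
```

Repetitive text such as CSV compresses well; already compressed formats such as video usually gain little.
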

  41. Celluloid, Reel, Socketry (Nio4r), …

  42. https://github.com/httprb/http