Call for a better error handling in APIs

"API calls are either successful or they fail" – at least that’s the concept current web protocols are based upon. But in the real world, it isn’t always that simple. There are use cases where a third-party system returns something that can’t be classified as a clear error, yet returning a plain HTTP status code of 200 OK without any additional information is also wrong. For those cases, every API provider has their own way of telling the client what went wrong in the backend. This is why we need to find a common way to communicate errors, and especially those edge cases.

André Cedik

October 16, 2019

Transcript

  1. 3.

    API v2 • Incorporate learnings from more than 5 years • Easier integration in existing shop / ERP software • I18n • Returning translated strings • White label • Design-first approach using OpenAPI
  2. 8.

    Mariner 1 • Veered off course because of an unscheduled maneuver • Steering impossible • Missing hyphen in the code allowed transmission of incorrect guidance signals • Engineers hit self-destruct button • $18 million error
  3. 9.

    ESA Ariane 5 Flight 501 • Reused software from Ariane 4 • Ariane 5 had faster engines • Software tried to push a 64-bit float into a signed 16-bit integer • Engineers hit self-destruct button at 37 sec. into its maiden launch • $8 billion error
  4. 10.

    NASA Mars Climate Orbiter • Failed conversion from imperial units to metric • Sent the orbiter too close to Mars’ surface • $125 million error
  5. 11.

    Y2K Bug • Year numbers were saved with 2 digits (98, 99, 00, 01) • No one knew what would happen when the year 2000 set in • Since "00" also meant 1900 • $500 billion error
  6. 12.

    Pentium FDIV bug • In 1994, math professor Thomas R. Nicely reported the bug • Processor might return incorrect binary floating-point results when dividing a number • Intel attributed the error to missing entries in the lookup table • Tried to downplay the bug • Had to replace processors • $475 million error
  7. 14.

    Mokusatsu – The World’s Most Tragic Translation • Allied leaders called for Japan’s unconditional surrender • Japanese government said nothing while considering their options • PM Kantaro Suzuki was pressured for comment • Said only one word: "mokusatsu" • Mistranslation led to the dropping of the atomic bomb
  8. 15.

    Hawaii Missile Strike • In Jan. 2018 citizens of Hawaii were warned of an inbound ballistic missile strike • Turned out to be a false alert • Recording over phone: "EXERCISE" • Message with "THIS IS NOT A DRILL" • Same UI used for drill and real alerts • No safeguards were in place • It took 38 minutes to retract the alert, because there was no response protocol for a false alert
  9. 17.

    "Building fault-tolerant software boils down to detecting errors and doing something when errors are detected" – Joe Armstrong, inventor of Erlang
  10. 18.

    HTTP response status codes • Informational 1xx • Successful 2xx • 200 OK • Redirection 3xx • 301 Moved Permanently • Client Error 4xx • Server Error 5xx
  11. 19.

    HTTP response status codes 4xx & 5xx • 400 Bad Request • 402 Payment Required • 403 Forbidden • 404 Not Found • 500 Internal Server Error • 502 Bad Gateway • 504 Gateway Timeout
  12. 20.

    Error handling in the body • Good: • Return complex structures • Get more specific about an error • Convey multiple errors • Bad: • Everyone has their own way of doing it • Therefore developers have to learn each provider’s "way"
  13. 23.

    Sabre Dev Studio – error attribute • "error" is always a string • Sometimes all uppercase -> looks like an error code
  14. 24.

    Sabre Dev Studio – error attribute • "error" is always a string • Sometimes all uppercase -> looks like an error code • Sometimes it looks like an error trace
  15. 25.

    Sabre Dev Studio – code attribute • Sometimes it looks like an HTTP response status code • Sometimes like an internal code • Same code used more than once • Different error text • 102 • 111 • 404 • 500 • 700101 • 050002 • 060016 • 700202
  16. 28.

    Pitney Bowes • Each validated field has its own error code • "XXX is invalid, unsupported or missing" • So what is it now?!?
  17. 29.

    Google Maps Geocoding API • Has a "status" attribute • OK, ZERO_RESULTS, OVER_DAILY_LIMIT, OVER_QUERY_LIMIT, REQUEST_DENIED, INVALID_REQUEST, UNKNOWN_ERROR • INVALID_REQUEST = 400 Bad Request • OVER_QUERY_LIMIT = 200 OK
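The mismatch on this slide (a quota error delivered with 200 OK) means a client cannot rely on the HTTP status code alone. A minimal sketch of how a client might combine both signals follows; the function name `classify` and the three outcome labels are assumptions made for illustration, not part of the Google API.

```python
# Status values taken from the slide; the grouping below is an assumption.
NON_ERROR_STATUSES = {"OK", "ZERO_RESULTS"}


def classify(http_status: int, body: dict) -> str:
    """Return "ok", "empty" or "error" for a geocoding-style response.

    Both the HTTP status code and the "status" attribute in the body are
    checked, because either one alone can hide a failure.
    """
    status = body.get("status", "UNKNOWN_ERROR")
    if http_status >= 400 or status not in NON_ERROR_STATUSES:
        return "error"
    return "empty" if status == "ZERO_RESULTS" else "ok"


print(classify(200, {"status": "OK", "results": [{}]}))
print(classify(200, {"status": "OVER_QUERY_LIMIT"}))
print(classify(400, {"status": "INVALID_REQUEST"}))
```

The second call is the interesting one: the transport layer reports success, and only the body reveals that the request was not served.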
  18. 31.
  19. 38.
  20. 39.
  21. 41.

    What’s the problem? • "API calls either fail or are successful" – Phil Sturgeon • "Soft errors" • Not an exception-type "crash" • More like a warning
  22. 42.
  23. 43.

    application/problem+json • RFC 7807 • Pros • Own content type • Predefined set of attributes • Extensible • Cons • Not encapsulated • Mixing with other content • Multi-error handling just for one error type

        HTTP/1.1 403 Forbidden
        Content-Type: application/problem+json
        Content-Language: en

        {
          "type": "https://example.com/probs/out-of-credit",
          "title": "You do not have enough credit.",
          "detail": "Your current balance is 30, but that costs 50.",
          "instance": "/account/12345/msgs/abc",
          "balance": 30,
          "accounts": ["/account/12345", "/account/67890"]
        }
  24. 44.

    Warning header • RFC 7234 • Pros • You could handle "soft errors" • Multiple Warning headers for multiple different errors • Cons • Complex data can’t be returned • It’s just a string

        HTTP/1.1 200 OK
        Date: Sat, 25 Aug 2012 23:34:45 GMT
        Warning: 112 - "network down" "Sat, 25 Aug 2012 23:34:45 GMT"
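To make the "it's just a string" drawback concrete, here is a small sketch of what a client has to do to read such a header. It assumes the RFC 7234 layout of a warning value (three-digit code, agent, quoted text, optional quoted date); the names `HttpWarning` and `parse_warning` are made up for this example. Note that RFC 9111, published after this talk, obsoletes the Warning header field.

```python
import re
from typing import List, NamedTuple, Optional

# warn-code SP warn-agent SP quoted warn-text [ SP quoted warn-date ]
WARNING_RE = re.compile(r'(\d{3}) (\S+) "((?:[^"\\]|\\.)*)"(?: "([^"]*)")?')


class HttpWarning(NamedTuple):
    code: int
    agent: str
    text: str
    date: Optional[str]


def parse_warning(header_value: str) -> List[HttpWarning]:
    """Extract every warning value from one Warning header line."""
    return [
        HttpWarning(int(m.group(1)), m.group(2), m.group(3), m.group(4))
        for m in WARNING_RE.finditer(header_value)
    ]


warnings = parse_warning('112 - "network down" "Sat, 25 Aug 2012 23:34:45 GMT"')
print(warnings)
```

All a client gets back is a numeric code and free text; anything structured, such as which field caused the warning, would have to be encoded into that text by convention.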
  25. 45.

    application/health+json • Internet Draft inadarei-api-health-check • Pros • "status" attribute ("pass", "warn", "error") • Cons • Specific to health of an API • Overhead content

        HTTP/1.1 200 OK
        Content-Type: application/health+json

        {
          "status": "pass",
          "version": "1",
          "releaseId": "1.2.2",
          "notes": [""],
          "output": "",
          "serviceId": "f03e522f-1f44-4062-9b55-9587f91c9c41",
          "description": "health of authz service",
          "checks": {
            "cassandra:responseTime": [
              {
                "componentId": "dfd6cf2b-1b6e-4412-a0b8-f6f7797a60d2",
                "componentType": "datastore",
                "observedValue": 250,
                "observedUnit": "ms",
                "status": "pass",
                "affectedEndpoints" : [
                  "/users/{userId}",
                  "/customers/{customerId}/status",
                  "/shopping/{anything}"
                ],
                "time": "2018-01-17T03:36:48Z",
                "output": ""
              }
  26. 46.

    application/vnd.api+json • JSON:API standard • Pros • Errors array to handle multiple errors • JSON pointers to show devs where an error has occurred • Cons • Everything is in errors object • "Soft errors" not possible
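The two advantages named here, an errors array and JSON pointers, can be sketched as follows. The helper `jsonapi_error_document` and the validation messages are invented for illustration; the member names (`errors`, `status`, `title`, `detail`, `source.pointer`) follow the JSON:API error object, where `status` is the HTTP status code as a string.

```python
import json


def jsonapi_error_document(failures, status="422"):
    """Turn (pointer, title, detail) tuples into a JSON:API error document."""
    return {
        "errors": [
            {
                "status": status,
                "title": title,
                "detail": detail,
                # JSON pointer into the request document that caused the error
                "source": {"pointer": pointer},
            }
            for pointer, title, detail in failures
        ]
    }


doc = jsonapi_error_document([
    ("/data/attributes/zip", "Invalid attribute", "ZIP code must have 5 digits."),
    ("/data/attributes/country", "Invalid attribute", "Country is required."),
])
print(json.dumps(doc, indent=2))
```

The document has no slot for a result that succeeded with reservations, which is the "soft errors not possible" point on the slide.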
  27. 47.

    Best current practice • HTTP status code • Error object • Easy referencing of errors • "Code" • "Subcode" • Request IDs in the body • For easier request identification in support cases • Human-readable message • In multiple languages
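One way to read this list is as a recipe for a single error object. The sketch below is only an interpretation: the field names `code`, `subcode`, `message` and `request_id`, the example codes, and the tiny message catalog are all assumptions, since the slide does not fix a concrete format.

```python
import uuid

# Hypothetical message catalog keyed by (code, subcode), one text per language.
MESSAGES = {
    ("ADDRESS", "ZIP_INVALID"): {
        "en": "The postal code is not valid.",
        "de": "Die Postleitzahl ist ungültig.",
    },
}


def error_object(code, subcode, lang="en", request_id=None):
    """Build an error object with stable codes and a translated message."""
    texts = MESSAGES[(code, subcode)]
    return {
        "code": code,          # coarse, stable identifier for referencing
        "subcode": subcode,    # finer-grained reason below the code
        "message": texts.get(lang, texts["en"]),  # fall back to English
        "request_id": request_id or str(uuid.uuid4()),  # quoted in support cases
    }


print(error_object("ADDRESS", "ZIP_INVALID", lang="de", request_id="req-1"))
```

Clients branch on `code` and `subcode`, which stay stable, while the message can be shown to a person in their language.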
  28. 48.

    "An excellent error message is precise and lets the user know about the nature of the error so that they can figure their way out of it." – Guy Levin, RestCase
  29. 51.

    A proposal – to be discussed • Responses in a new format • "data" holds everything we’d normally have in the root • "errors" and "warnings" give information about what happened • "errors" and "warnings" follow the RFC 7807 pattern
  30. 52.

    A proposal – to be discussed • "data" is empty since no resource was created • Warnings possible if the API supports this use case
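The transcript does not include the example payloads shown on the proposal slides, so the following is a sketch of how such a response could look, based only on the bullet points: a `data` member, plus `errors` and `warnings` whose entries follow the RFC 7807 pattern. The helper names, the problem type URIs and the shipment example are hypothetical.

```python
def problem(type_uri, title, detail=None):
    """One RFC 7807-style entry, usable as an error or as a warning."""
    entry = {"type": type_uri, "title": title}
    if detail is not None:
        entry["detail"] = detail
    return entry


def envelope(data=None, errors=(), warnings=()):
    """Wrap a payload in the proposed data / errors / warnings structure."""
    return {
        "data": data if data is not None else {},  # empty if nothing was created
        "errors": list(errors),
        "warnings": list(warnings),
    }


def outcome(response):
    """Client-side reading of the envelope, including the soft-error case."""
    if response["errors"]:
        return "failed"
    if response["warnings"]:
        return "ok_with_warnings"
    return "ok"


# Soft error: the resource was created, but the client should know something.
created = envelope(
    data={"id": "shipment-1"},
    warnings=[problem("https://example.com/probs/address-corrected",
                      "Address was corrected.")],
)
# Hard error: nothing was created, so "data" stays empty.
rejected = envelope(
    errors=[problem("https://example.com/probs/carrier-unavailable",
                    "Carrier did not respond.")],
)

print(outcome(created))
print(outcome(rejected))
```

Keeping warnings next to the data is what lets a 2xx response carry a soft error without pretending the call failed.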
  31. 54.

    Sources • Atlas Agena with Mariner 1: NASA, https://commons.wikimedia.org/wiki/File:Atlas_Agena_with_Mariner_1.jpg • Ariane 5: DLR German Aerospace Center, https://www.flickr.com/photos/48213136@N06/8958839420 • Mars Climate Orbiter: NASA/JPL/Corby Waste, https://commons.wikimedia.org/wiki/File:Mars_Climate_Orbiter_2.jpg • Bug de l'an 2000: https://commons.wikimedia.org/wiki/File:Bug_de_l%27an_2000.jpg • Pentium: Konstantin Lanzet, https://commons.wikimedia.org/wiki/File:KL_Intel_Pentium_A80501.jpg • Hawaii Missile Alert SMS: https://twitter.com/tulsigabbard/status/952243723525677056