Call for a better error handling in APIs

André Cedik · Developer Advocate, shipcloud.io · shipcloud GmbH · Mittelweg 162, 20148 Hamburg

API v2 • Incorporate learnings from more than 5 years • Easier integration into existing shop / ERP software • I18n • Returning translated strings • White label • Design-first approach using OpenAPI

But most of all ...

... better error communication!

shipcloud error communication

History of errors

Mariner 1 • Veered off course because of an unscheduled maneuver • Steering impossible • Missing hyphen in the code allowed transmission of incorrect guidance signals • Engineers hit self-destruct button • $18 million error

ESA Ariane 5 Flight 501 • Reused software from Ariane 4 • Ariane 5 had faster engines • Software tried to push a 64-bit float into a signed 16-bit integer • Engineers hit self-destruct button at 37 sec. into its maiden launch • $8 billion error

NASA Mars Climate Orbiter • Failed conversion from imperial to metric units • Sent the orbiter too close to Mars' surface • $125 million error

Y2K Bug • Year numbers were saved with 2 digits (98, 99, 00, 01) • No one knew what would happen when the year 2000 arrived, since '00' could also mean 1900 • $500 billion error

Pentium-FDIV-Bug • In 1994 Math Prof. Thomas R. Nicely reported
the bug • Processor might return incorrect binary floating point results when dividing a number • Intel attributed error to missing entries in the lookup table • Tried to downplay the bug • Had to replace processors • $475 million error

Miscommunication

Mokusatsu - The World's Most Tragic Translation • Allied leaders called for Japan’s unconditional surrender • Japanese government said nothing while considering its options • PM Kantaro Suzuki was pressured for comment • Said only one word: „mokusatsu“ • Mistranslation led to the dropping of the atomic bomb

Hawaii Missile Strike • In Jan. 2018 citizens of Hawaii were warned of an inbound ballistic missile strike • Turned out to be a false alert • Recording over the phone said „EXERCISE“ • Message said „THIS IS NOT A DRILL“ • Same UI used for drill and real alerts • No safeguards were in place • It took 38 minutes to retract the alert, because there was no response protocol for a false alert

Error handling in APIs – Tools used at the moment

"Building fault-tolerant software boils down to detecting errors and doing something when errors are detected" – Joe Armstrong, inventor of Erlang

HTTP response status codes • Informational 1xx • Successful 2xx • 200 OK • Redirection 3xx • 301 Moved Permanently • Client Error 4xx • Server Error 5xx

HTTP response status codes 4xx & 5xx • 400 Bad Request • 402 Payment Required • 403 Forbidden • 404 Not Found • 500 Internal Server Error • 502 Bad Gateway • 504 Gateway Timeout
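The status classes and codes from the two slides above can be looked up with Python's standard library. This is a minimal sketch; the helper names `status_class` and `describe` are made up for illustration and are not part of the talk.

```python
from http import HTTPStatus

# Class names as listed on the slide, keyed by the first digit of the code.
STATUS_CLASSES = {
    1: "Informational",
    2: "Successful",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def status_class(code: int) -> str:
    """Return the status class (1xx to 5xx) a status code belongs to."""
    try:
        return STATUS_CLASSES[code // 100]
    except KeyError:
        raise ValueError(f"not an HTTP status code: {code}") from None


def describe(code: int) -> str:
    """Combine the code, its standard reason phrase and its class."""
    return f"{code} {HTTPStatus(code).phrase} ({status_class(code)})"
```

A status code alone only tells a client which class of problem occurred; the slides that follow look at how APIs add detail in the response body.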

Error handling in the body • Good: • Return complex
structures • Get more specific about an error • Convey multiple errors • Bad: • Everyone has their own way of doing it • Therefore developers have to understand „the way“

Error handling in APIs – The bad

shipcloud error communication

Sabre Dev Studio – error attribute • „error“ is always a string • Sometimes all uppercase -> looks like an error code • Sometimes it looks like an error trace

Sabre Dev Studio – code attribute • Sometimes it looks like an HTTP response status code • Sometimes like an internal code • Same code used more than once with different error text • 102 • 111 • 404 • 500 • 700101 • 050002 • 060016 • 700202

Sabre Dev Studio – code attribute

Pitney Bowes • Each validated field has its own error code • „XXX is invalid, unsupported or missing“ • So what is it now?!?

Google Maps Geocoding API • Has a „status“ attribute • OK, ZERO_RESULTS, OVER_DAILY_LIMIT, OVER_QUERY_LIMIT, REQUEST_DENIED, INVALID_REQUEST, UNKNOWN_ERROR • INVALID_REQUEST comes with 400 Bad Request • OVER_QUERY_LIMIT comes with 200 OK
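The consequence for a client is that the HTTP status code alone cannot be trusted: a 200 OK may still carry a failure in the body. The sketch below shows that double check. The function name and the decision to treat ZERO_RESULTS as a non-failure are assumptions made for illustration; only the status values themselves come from the slide.

```python
# Body-level status values from the slide that signal a failed request.
# Treating ZERO_RESULTS as "not a failure" is an assumption of this sketch.
FAILURE_STATUSES = {
    "OVER_DAILY_LIMIT",
    "OVER_QUERY_LIMIT",
    "REQUEST_DENIED",
    "INVALID_REQUEST",
    "UNKNOWN_ERROR",
}


def geocoding_failed(http_status: int, body: dict) -> bool:
    """Return True if either the HTTP status or the body reports a failure."""
    if http_status >= 400:
        return True
    # A 2xx response can still report an error in its "status" attribute.
    return body.get("status") in FAILURE_STATUSES
```

Every consumer of such an API has to write this kind of glue code, which is the cost of signalling errors in two places.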

Google Drive API v3

Klarna

API Football

Error handling in APIs – The good parts

squarespace

Facebook GraphAPI

Facebook Marketing API

Banks API

Figo.io

What we can do better

What's the problem? • „API calls either fail or are successful“ – Phil Sturgeon • „Soft errors“ • Not an exception-type „crash“ • More like a warning

application/problem+json • RFC 7807 • Pros • Own content type • Predefined set of attributes • Extensible • Cons • Not encapsulated • Mixing with other content • Multi error handling just for one error type

    HTTP/1.1 403 Forbidden
    Content-Type: application/problem+json
    Content-Language: en

    {
      "type": "https://example.com/probs/out-of-credit",
      "title": "You do not have enough credit.",
      "detail": "Your current balance is 30, but that costs 50.",
      "instance": "/account/12345/msgs/abc",
      "balance": 30,
      "accounts": ["/account/12345", "/account/67890"]
    }
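The „predefined set of attributes“ and the „extensible“ part can be told apart mechanically. The following sketch parses the body from the slide and splits it into the members RFC 7807 defines and the extension members. The function name `split_problem` is hypothetical.

```python
import json

# The five members RFC 7807 defines; anything else is an extension member.
STANDARD_MEMBERS = ("type", "title", "status", "detail", "instance")

# Body of the 403 response shown on the slide.
EXAMPLE_BODY = """
{
  "type": "https://example.com/probs/out-of-credit",
  "title": "You do not have enough credit.",
  "detail": "Your current balance is 30, but that costs 50.",
  "instance": "/account/12345/msgs/abc",
  "balance": 30,
  "accounts": ["/account/12345", "/account/67890"]
}
"""


def split_problem(raw_body: str):
    """Split a problem document into standard and extension members."""
    document = json.loads(raw_body)
    standard = {k: document[k] for k in STANDARD_MEMBERS if k in document}
    # RFC 7807: a missing "type" is assumed to be "about:blank".
    standard.setdefault("type", "about:blank")
    extensions = {k: v for k, v in document.items() if k not in STANDARD_MEMBERS}
    return standard, extensions
```

For the slide's example, `balance` and `accounts` end up in the extensions. They sit next to the standard members in the same object, which is the „not encapsulated“ and „mixing with other content“ concern listed under cons.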

Warning header • RFC 7234 • Pros • You could handle „soft errors“ • Multiple Warning headers for multiple different errors • Cons • Complex data can't be returned • It's just a string

    HTTP/1.1 200 OK
    Date: Sat, 25 Aug 2012 23:34:45 GMT
    Warning: 112 - "network down" "Sat, 25 Aug 2012 23:34:45 GMT"
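To illustrate „it's just a string“: a client has to pick the header value apart itself. This is a rough sketch of parsing a single warning value of the form code, agent, quoted text and optional quoted date. The names `WarningValue` and `parse_warning` are made up, and comma-separated lists of warnings are deliberately not handled.

```python
import re
from typing import NamedTuple, Optional


class WarningValue(NamedTuple):
    code: int
    agent: str
    text: str
    date: Optional[str]  # kept as the raw string, not parsed further


# warn-code SP warn-agent SP quoted warn-text [SP quoted warn-date]
_WARNING_RE = re.compile(r'(\d{3}) (\S+) "((?:[^"\\]|\\.)*)"(?: "([^"]*)")?')


def parse_warning(value: str) -> WarningValue:
    """Parse one Warning header value; raise ValueError if it does not match."""
    match = _WARNING_RE.fullmatch(value.strip())
    if match is None:
        raise ValueError(f"unparseable Warning header: {value!r}")
    code, agent, text, date = match.groups()
    return WarningValue(int(code), agent, text, date)
```

Everything beyond the three-digit code and a short text would have to be squeezed into that quoted string, so structured data such as field pointers has no natural place here.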

application/health+json • Internet Draft inadarei-api-health-check • Pros • „status“ attribute („pass“, „warn“, „error“) • Cons • Specific to health of an API • Overhead content

    HTTP/1.1 200 OK
    Content-Type: application/health+json

    {
      "status": "pass",
      "version": "1",
      "releaseId": "1.2.2",
      "notes": [""],
      "output": "",
      "serviceId": "f03e522f-1f44-4062-9b55-9587f91c9c41",
      "description": "health of authz service",
      "checks": {
        "cassandra:responseTime": [
          {
            "componentId": "dfd6cf2b-1b6e-4412-a0b8-f6f7797a60d2",
            "componentType": "datastore",
            "observedValue": 250,
            "observedUnit": "ms",
            "status": "pass",
            "affectedEndpoints" : [
              "/users/{userId}",
              "/customers/{customerId}/status",
              "/shopping/{anything}"
            ],
            "time": "2018-01-17T03:36:48Z",
            "output": ""
          }

application/vnd.api+json • JSON:API standard • Pros • Errors array to handle multiple errors • JSON pointers to show devs where an error has occurred • Cons • Everything is in the errors object • „soft errors“ not possible
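As a sketch of the two pros, the helper below turns a list of validation failures into a JSON:API style document with an `errors` array and a `source.pointer` per entry. The helper name, the 422 status and the field paths in the usage example are assumptions for illustration and are not taken from the talk.

```python
def jsonapi_errors(failures, status="422"):
    """Build a JSON:API style error document.

    `failures` is a list of (json_pointer, detail) pairs. JSON:API expresses
    the HTTP status of each error as a string, hence the string default.
    """
    return {
        "errors": [
            {
                "status": status,
                "title": "Invalid attribute",
                "detail": detail,
                # JSON pointer to the offending part of the request document.
                "source": {"pointer": pointer},
            }
            for pointer, detail in failures
        ]
    }


# Hypothetical request fields, used only to show the shape of the result.
document = jsonapi_errors([
    ("/data/attributes/zip_code", "Zip code is missing."),
    ("/data/attributes/weight", "Weight must be greater than 0."),
])
```

Both failures are reported in one response, and each pointer tells the developer which field to fix. There is no sibling array for warnings, which matches the „soft errors not possible“ point.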

Best current practice • HTTP status code • Error object
• Easy referencing of errors • „Code“ • „Subcode“ • Request IDs in the body • for easier request identification in support cases • Human readable message • In multiple languages
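The points on this slide could be combined into an error object along the following lines. This is only a sketch under assumptions: the class name, attribute names, code values and the fallback to English are invented for illustration, since the slide names the ingredients but not a concrete schema.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class ApiError:
    code: str      # coarse error category, easy to reference
    subcode: str   # the specific problem within that category
    messages: dict  # language tag -> human readable message
    # Request ID echoed in the body so support can find the request.
    request_id: str = field(default_factory=lambda: str(uuid.uuid4()))

    def to_body(self, language: str = "en") -> dict:
        """Render the error body, falling back to English (an assumption)."""
        message = self.messages.get(language, self.messages["en"])
        return {
            "error": {
                "code": self.code,
                "subcode": self.subcode,
                "message": message,
                "request_id": self.request_id,
            }
        }


# Hypothetical codes and texts.
error = ApiError(
    code="invalid_address",
    subcode="zip_code_missing",
    messages={"en": "The zip code is missing.", "de": "Die Postleitzahl fehlt."},
    request_id="req-123",
)
```

The HTTP status code still travels in the status line; the body adds the machine readable code and subcode plus a message a human can act on.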

"An excellent error message is precise and lets the user know about the nature of the error so that they can figure their way out of it." – Guy Levin, RestCase

Future of error handling in APIs – A proposal

A proposal – to be discussed

A proposal – to be discussed • Responses in a new format • „data“ holds everything we'd normally have in the root • „errors“ and „warnings“ give information about what happened • „errors“ and „warnings“ follow the RFC 7807 pattern

A proposal – to be discussed • „data“ is empty since no resource was created • Warnings possible if the API supports this use case
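The slides' own example payloads are not part of this transcript, so the following is only a guess at the shape the bullets describe: a top-level „data“ member plus „errors“ and „warnings“ lists whose entries use RFC 7807 member names. All function names, URIs and texts are hypothetical.

```python
def problem(type_uri: str, title: str, detail: str = None) -> dict:
    """One entry for "errors" or "warnings", using RFC 7807 member names."""
    entry = {"type": type_uri, "title": title}
    if detail is not None:
        entry["detail"] = detail
    return entry


def envelope(data=None, errors=(), warnings=()) -> dict:
    """Wrap a response: "data" holds what would normally be in the root."""
    return {
        "data": data if data is not None else {},
        "errors": list(errors),
        "warnings": list(warnings),
    }


# Success with a "soft error": the resource was created, but with a warning.
created = envelope(
    data={"id": "abc123"},
    warnings=[problem("https://example.com/probs/address-corrected",
                      "The address was corrected.")],
)

# Failure: "data" stays empty since no resource was created.
failed = envelope(
    errors=[problem("https://example.com/probs/invalid-zip-code",
                    "The zip code is invalid.",
                    "A zip code is required for this destination.")],
)
```

Keeping warnings next to the data is what allows a successful call to carry „soft errors“, which none of the formats reviewed earlier handles in the body.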

Questions? Open Discussion @andrecedik

