Slide 1

Slide 1 text

A call for better error handling in APIs

Slide 2

Slide 2 text

André Cedik · Developer Advocate shipcloud.io · andre@shipcloud.io shipcloud GmbH · Mittelweg 162 20148 Hamburg

Slide 3

Slide 3 text

API v2 • Incorporate learnings from more than 5 years • Easier integration in existing shop and ERP software • I18n • Returning translated strings • White label • Design first approach using OpenAPI

Slide 4

Slide 4 text

But most of all ...

Slide 5

Slide 5 text

... better error communication!

Slide 6

Slide 6 text

shipcloud error communication

Slide 7

Slide 7 text

History of errors

Slide 8

Slide 8 text

Mariner 1 • Veered off course because of an unscheduled maneuver • Steering impossible • Missing hyphen in the code allowed transmission of incorrect guidance signals • Engineers hit self destruct button • $18 million error

Slide 9

Slide 9 text

ESA Ariane 5 Flight 501 • Reused software from Ariane 4 • Ariane 5 had faster engines • Software tried to push a 64-bit float into a signed 16-bit integer • Engineers hit self destruct button at 37 sec. into its maiden launch • $8 billion error
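As a rough illustration of the conversion failure described on this slide, the Python sketch below converts a floating-point value to a signed 16-bit integer with an explicit range check. The function name `to_int16` and the sample values are assumptions made for illustration; the actual flight software is not reproduced here.

```python
INT16_MIN = -32768
INT16_MAX = 32767


def to_int16(value: float) -> int:
    """Convert a float to a signed 16-bit integer, refusing out-of-range values."""
    truncated = int(value)  # drops the fractional part, like a float-to-int cast
    if not INT16_MIN <= truncated <= INT16_MAX:
        raise OverflowError(f"{value!r} does not fit into a signed 16-bit integer")
    return truncated


print(to_int16(1234.5))  # within range
try:
    to_int16(40000.0)  # hypothetical reading larger than 32767
except OverflowError as exc:
    print("conversion rejected:", exc)
```

The point of the sketch is that the out-of-range case is detected and reported instead of silently producing a wrong value.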

Slide 10

Slide 10 text

NASA Mars Climate Orbiter • Failed conversion from imperial units to metric • Sent the orbiter too close to Mars’ surface • $125 million error
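One way to make a unit mismatch like this visible is to never pass a bare number between components. The Python sketch below is a minimal example under that assumption; the `Impulse` class and its unit labels are hypothetical and only serve to show the idea.

```python
from dataclasses import dataclass

# 1 pound-force is defined as exactly 4.4482216152605 newtons.
LBF_S_TO_N_S = 4.4482216152605


@dataclass(frozen=True)
class Impulse:
    """An impulse value that always carries its unit."""
    value: float
    unit: str  # "N*s" or "lbf*s"

    def to_newton_seconds(self) -> float:
        if self.unit == "N*s":
            return self.value
        if self.unit == "lbf*s":
            return self.value * LBF_S_TO_N_S
        raise ValueError(f"unknown unit: {self.unit}")


print(Impulse(10.0, "lbf*s").to_newton_seconds())
```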

Slide 11

Slide 11 text

Y2K Bug • Year numbers were saved with 2 digits (98, 99, 00, 01) • No one knew what would happen when the year 2000 arrived • Since ‘00’ could also mean 1900 • $500 billion error

Slide 12

Slide 12 text

Pentium-FDIV-Bug • In 1994 Math Prof. Thomas R. Nicely reported the bug • Processor might return incorrect binary floating point results when dividing a number • Intel attributed error to missing entries in the lookup table • Tried to downplay the bug • Had to replace processors • $475 million error

Slide 13

Slide 13 text

Miscommunication

Slide 14

Slide 14 text

Mokusatsu - The World’s Most Tragic Translation • Allied leaders called for Japan’s unconditional surrender • Japanese government said nothing while considering their options • PM Kantaro Suzuki was pressured for comment • Said only one word: "mokusatsu" • Mistranslation led to the dropping of the atomic bomb

Slide 15

Slide 15 text

Hawaii Missile Strike • In Jan. 2018 citizens of Hawaii were warned of an inbound ballistic missile strike • Turned out to be a false alert • Recording over phone „EXERCISE“ • Message with „THIS IS NOT A DRILL“ • Same UI used for drill and real alerts • No safeguards were in place • It took 38 minutes to retract the alert, because there was no response protocol for a false alert

Slide 16

Slide 16 text

Error handling in APIs: Tools used at the moment

Slide 17

Slide 17 text

"Building fault-tolerant software boils down to detecting errors and doing something when errors are detected" Joe Armstrong, inventor of Erlang

Slide 18

Slide 18 text

HTTP response status codes • Informational 1xx • Successful 2xx • 200 OK • Redirection 3xx • 301 Moved Permanently • Client Error 4xx • Server Error 5xx

Slide 19

Slide 19 text

HTTP response status codes 4xx & 5xx • 400 Bad Request • 402 Payment Required • 403 Forbidden • 404 Not Found • 500 Internal Server Error • 502 Bad Gateway • 504 Gateway Timeout
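The status code classes can be derived from the first digit of the code. The following Python sketch shows this using the standard library's `http.HTTPStatus` for the reason phrases; the helper name `status_class` is an assumption made for this example.

```python
from http import HTTPStatus


def status_class(code: int) -> str:
    """Return the class name for an HTTP status code based on its first digit."""
    classes = {
        1: "Informational",
        2: "Successful",
        3: "Redirection",
        4: "Client Error",
        5: "Server Error",
    }
    if not 100 <= code <= 599:
        raise ValueError(f"not a valid HTTP status code: {code}")
    return classes[code // 100]


# The codes named on the slides, with their standard reason phrases.
for code in (200, 301, 400, 402, 403, 404, 500, 502, 504):
    print(code, HTTPStatus(code).phrase, "->", status_class(code))
```

A client that only looks at the class can already decide whether a retry makes sense (5xx) or whether the request itself has to change (4xx).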

Slide 20

Slide 20 text

Error handling in the body • Good: • Return complex structures • Get more specific about an error • Convey multiple errors • Bad: • Everyone has their own way of doing it • Therefore developers have to understand „the way“
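To make the "good" side concrete, here is a sketch of a response body that conveys several specific errors at once. The attribute names (`errors`, `code`, `field`, `message`) are hypothetical and not taken from any of the APIs discussed in this talk, which is exactly the "everyone has their own way" problem.

```python
import json

# Hypothetical error body: one entry per problem found in the request.
body = {
    "errors": [
        {
            "code": "invalid_zip_code",
            "field": "to.zip_code",
            "message": "The zip code must have 5 digits.",
        },
        {
            "code": "missing_field",
            "field": "package.weight",
            "message": "The package weight is required.",
        },
    ]
}

print(json.dumps(body, indent=2))
```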

Slide 21

Slide 21 text

Error handling in APIs: The bad

Slide 22

Slide 22 text

shipcloud error communication

Slide 23

Slide 23 text

Sabre Dev Studio – error attribute • „error“ is always a string • Sometimes all uppercase -> seems to be like an error code

Slide 24

Slide 24 text

Sabre Dev Studio – error attribute • „error“ is always a string • Sometimes all uppercase -> seems to be like an error code • Sometimes it looks like an error trace

Slide 25

Slide 25 text

Sabre Dev Studio – code attribute • Sometimes it looks like an HTTP response status code • Sometimes like an internal code • Same code used more than once • Different error text • 102 • 111 • 404 • 500 • 700101 • 050002 • 060016 • 700202

Slide 26

Slide 26 text

Sabre Dev Studio – code attribute

Slide 27

Slide 27 text

Pitney Bowes • Each validated field has its own error code

Slide 28

Slide 28 text

Pitney Bowes • Each validated field has its own error code • "XXX is invalid, unsupported or missing" • So what is it now?!?
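A validation routine usually knows which of the three cases applies, so it can say so. The sketch below assumes a hypothetical `country` field and a made-up list of supported countries; it only illustrates returning a distinct reason per case.

```python
from typing import Optional

SUPPORTED_COUNTRIES = {"DE", "US", "FR"}  # hypothetical list for illustration


def check_country(payload: dict) -> Optional[str]:
    """Return None if the country is acceptable, otherwise a specific reason."""
    if "country" not in payload:
        return "country is missing"
    value = payload["country"]
    if not (isinstance(value, str) and len(value) == 2 and value.isalpha()):
        return "country is invalid: expected a two-letter code"
    if value.upper() not in SUPPORTED_COUNTRIES:
        return f"country {value.upper()} is not supported"
    return None


print(check_country({}))
print(check_country({"country": "Germany"}))
print(check_country({"country": "jp"}))
```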

Slide 29

Slide 29 text

Google Maps Geocoding API • Has a „status“ attribute • OK, ZERO_RESULTS, OVER_DAILY_LIMIT, OVER_QUERY_LIMIT, REQUEST_DENIED, INVALID_REQUEST, UNKNOWN_ERROR • INVALID_REQUEST = 400 Bad Request • OVER_QUERY_LIMIT = 200 OK
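Because some failures arrive with 200 OK, a client cannot rely on the HTTP status code alone. The following sketch shows the extra check a client needs; the function name is an assumption, and treating ZERO_RESULTS as a successful call is a choice made for this example.

```python
def geocoding_failed(http_status: int, body: dict) -> bool:
    """Treat the call as failed if either the HTTP status or the body says so."""
    if http_status >= 400:
        return True
    # Assumption: ZERO_RESULTS means the call worked but found nothing.
    return body.get("status") not in ("OK", "ZERO_RESULTS")


print(geocoding_failed(200, {"status": "OK"}))
print(geocoding_failed(200, {"status": "OVER_QUERY_LIMIT"}))
print(geocoding_failed(400, {"status": "INVALID_REQUEST"}))
```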

Slide 30

Slide 30 text

Google Drive API v3

Slide 31

Slide 31 text

Klarna

Slide 32

Slide 32 text

API Football

Slide 33

Slide 33 text

Error handling in APIs: The good parts

Slide 34

Slide 34 text

squarespace

Slide 35

Slide 35 text

Facebook GraphAPI

Slide 36

Slide 36 text

Facebook Marketing API

Slide 37

Slide 37 text

Facebook Marketing API

Slide 38

Slide 38 text

Banks API

Slide 39

Slide 39 text

Figo.io

Slide 40

Slide 40 text

What we can do better

Slide 41

Slide 41 text

What’s the problem? • „API calls either fail or are successful“ – Phil Sturgeon • „Soft errors“ • Not an exception type „crash“ • More like a warning

Slide 42

Slide 42 text

No content

Slide 43

Slide 43 text

application/problem+json • RFC 7807 • Pros • Own content type • Predefined set of attributes • Extensible • Cons • Not encapsulated • Mixing with other content • Multi error handling just for one error type

HTTP/1.1 403 Forbidden
Content-Type: application/problem+json
Content-Language: en

{
  "type": "https://example.com/probs/out-of-credit",
  "title": "You do not have enough credit.",
  "detail": "Your current balance is 30, but that costs 50.",
  "instance": "/account/12345/msgs/abc",
  "balance": 30,
  "accounts": ["/account/12345", "/account/67890"]
}
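On the server side, a problem details response could be assembled roughly as follows. This is a sketch, not a reference implementation: the helper name `problem_response` is hypothetical, and extension members such as `balance` are simply merged next to the standard ones, which is the "not encapsulated" point from the cons list.

```python
import json


def problem_response(status: int, type_uri: str, title: str, detail: str,
                     instance: str, **extensions) -> tuple:
    """Build (status, headers, body) for an RFC 7807 problem details response."""
    problem = {
        "type": type_uri,
        "title": title,
        "status": status,
        "detail": detail,
        "instance": instance,
    }
    problem.update(extensions)  # extension members sit next to the standard ones
    headers = {"Content-Type": "application/problem+json"}
    return status, headers, json.dumps(problem)


status, headers, body = problem_response(
    403,
    "https://example.com/probs/out-of-credit",
    "You do not have enough credit.",
    "Your current balance is 30, but that costs 50.",
    "/account/12345/msgs/abc",
    balance=30,
    accounts=["/account/12345", "/account/67890"],
)
print(status, headers["Content-Type"])
print(body)
```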

Slide 44

Slide 44 text

Warning header • RFC 7234 • Pros • You could handle „soft errors“ • Multiple Warning headers for multiple different errors • Cons • Complex data can’t be returned • It’s just a string

HTTP/1.1 200 OK
Date: Sat, 25 Aug 2012 23:34:45 GMT
Warning: 112 - "network down" "Sat, 25 Aug 2012 23:34:45 GMT"
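Since the header is "just a string", a client has to parse it itself. The sketch below handles a single warning-value of the form code, agent, quoted text and optional quoted date, as laid out in RFC 7234. The names `WarningValue` and `parse_warning` are assumptions, and splitting a header that carries several comma-separated values is left out.

```python
import re
from typing import NamedTuple, Optional

# warn-code SP warn-agent SP quoted warn-text [SP quoted warn-date]
WARNING_VALUE = re.compile(
    r'(?P<code>\d{3}) (?P<agent>\S+) "(?P<text>(?:[^"\\]|\\.)*)"'
    r'(?: "(?P<date>[^"]*)")?'
)


class WarningValue(NamedTuple):
    code: int
    agent: str
    text: str  # kept as sent; backslash escapes are not decoded
    date: Optional[str]


def parse_warning(value: str) -> WarningValue:
    """Parse a single warning-value; raise ValueError if it does not match."""
    match = WARNING_VALUE.fullmatch(value.strip())
    if match is None:
        raise ValueError(f"not a valid warning-value: {value!r}")
    return WarningValue(int(match["code"]), match["agent"],
                        match["text"], match["date"])


parsed = parse_warning('112 - "network down" "Sat, 25 Aug 2012 23:34:45 GMT"')
print(parsed.code, parsed.agent, parsed.text, parsed.date)
```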

Slide 45

Slide 45 text

application/health+json • Internet Draft inadarei-api-health-check • Pros • „status“ attribute („pass“, „warn“, „error“) • Cons • Specific to health of an API • Overhead content

HTTP/1.1 200 OK
Content-Type: application/health+json

{
  "status": "pass",
  "version": "1",
  "releaseId": "1.2.2",
  "notes": [""],
  "output": "",
  "serviceId": "f03e522f-1f44-4062-9b55-9587f91c9c41",
  "description": "health of authz service",
  "checks": {
    "cassandra:responseTime": [
      {
        "componentId": "dfd6cf2b-1b6e-4412-a0b8-f6f7797a60d2",
        "componentType": "datastore",
        "observedValue": 250,
        "observedUnit": "ms",
        "status": "pass",
        "affectedEndpoints": [
          "/users/{userId}",
          "/customers/{customerId}/status",
          "/shopping/{anything}"
        ],
        "time": "2018-01-17T03:36:48Z",
        "output": ""
      }
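The three-level „status“ attribute is the interesting part for error handling. As a sketch of how it could be used, the function below derives an overall status from the individual checks by taking the most severe one. This "worst status wins" rule is an assumption made for this example, and the status values are the ones listed on the slide.

```python
# Status values as listed on the slide, ordered by severity.
SEVERITY = {"pass": 0, "warn": 1, "error": 2}


def overall_status(checks: dict) -> str:
    """Return the most severe status found in a 'checks' object."""
    worst = "pass"
    for entries in checks.values():  # each key maps to a list of check results
        for entry in entries:
            status = entry.get("status", "pass")
            if SEVERITY[status] > SEVERITY[worst]:
                worst = status
    return worst


# Hypothetical checks object with one healthy and one degraded component.
checks = {
    "cassandra:responseTime": [{"status": "pass", "observedValue": 250}],
    "cassandra:connections": [{"status": "warn", "observedValue": 75}],
}
print(overall_status(checks))
```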

Slide 46

Slide 46 text

application/vnd.api+json • JSON:API standard • Pros • Errors array to handle multiple errors • JSON pointers to show devs where an error has occurred • Cons • Everything is in the errors object • „soft errors“ not possible
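A JSON:API error document with two errors and JSON pointers could look like the output of the sketch below. The helper name `jsonapi_error` and the attribute paths are hypothetical; the member names (`status`, `title`, `detail`, `source.pointer`) follow the JSON:API error object.

```python
import json


def jsonapi_error(status: int, title: str, detail: str, pointer: str) -> dict:
    """Build one JSON:API error object pointing at the offending attribute."""
    return {
        "status": str(status),  # JSON:API expresses the status as a string
        "title": title,
        "detail": detail,
        "source": {"pointer": pointer},
    }


document = {
    "errors": [
        jsonapi_error(422, "Invalid attribute",
                      "Zip code must have 5 digits.",
                      "/data/attributes/zip_code"),
        jsonapi_error(422, "Missing attribute",
                      "Weight is required.",
                      "/data/attributes/weight"),
    ]
}
print(json.dumps(document, indent=2))
```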

Slide 47

Slide 47 text

Best current practice • HTTP status code • Error object • Easy referencing of errors • „Code“ • „Subcode“ • Request IDs in the body • for easier request identification in support cases • Human readable message • In multiple languages
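Put together, the points above could result in an error body like the one built in the following sketch. All attribute names, the message catalogue and the fallback to English are assumptions made for illustration, not a description of an existing API.

```python
import uuid
from typing import Optional

# Hypothetical message catalogue keyed by (code, subcode), then by language.
MESSAGES = {
    ("address", "zip_code_invalid"): {
        "en": "The zip code is not valid for the given country.",
        "de": "Die Postleitzahl ist für das angegebene Land ungültig.",
    },
}


def error_body(code: str, subcode: str, language: str = "en",
               request_id: Optional[str] = None) -> dict:
    """Build an error body with code, subcode, request ID and a translated message."""
    translations = MESSAGES[(code, subcode)]
    return {
        # The request ID lets support find the matching request quickly.
        "request_id": request_id or str(uuid.uuid4()),
        "error": {
            "code": code,
            "subcode": subcode,
            # Fall back to English if the requested language is not available.
            "message": translations.get(language, translations["en"]),
        },
    }


print(error_body("address", "zip_code_invalid", language="de"))
```

The HTTP status code is still sent alongside this body; the body only adds the detail that a status code cannot carry.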

Slide 48

Slide 48 text

"An excellent error message is precise and lets the user know about the nature of the error so that they can figure their way out of it." Guy Levin, RestCase

Slide 49

Slide 49 text

Future of error handling in APIs: A proposal

Slide 50

Slide 50 text

A proposal – to be discussed

Slide 51

Slide 51 text

A proposal – to be discussed • Responses in a new format • „data“ holds everything we’d normally have in the root • „errors“ and „warnings“ give information about what happened • „errors“ and „warnings“ follow the RFC 7807 pattern
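The proposed format can be sketched as a small wrapper around the usual payload. The exact JSON shown on the slide is not part of this transcript, so the structure below is an assumption based on the bullet points: three top-level members, with warnings shaped like RFC 7807 problem objects. The sample values are made up.

```python
import json


def envelope(data=None, errors=None, warnings=None) -> dict:
    """Wrap a payload in the proposed top-level structure."""
    return {
        "data": data if data is not None else {},
        "errors": list(errors or []),
        "warnings": list(warnings or []),
    }


# Hypothetical success case with a "soft error" reported as a warning.
response = envelope(
    data={"id": "abc123", "carrier": "dhl"},
    warnings=[{
        "type": "https://example.com/probs/address-corrected",
        "title": "The address was corrected.",
        "detail": "The street name was changed to match the postal database.",
    }],
)
print(json.dumps(response, indent=2))
```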

Slide 52

Slide 52 text

A proposal – to be discussed • „data“ is empty since no resource was created • Warnings possible if API supports this use case
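On the client side, a response in this format could be handled as in the sketch below: warnings are reported but do not stop processing, errors do. The names `ApiError` and `unwrap` are hypothetical, and the top-level members „data“, „errors“ and „warnings“ are assumed to be present as described on the slides.

```python
class ApiError(Exception):
    """Raised when the response carries one or more errors."""

    def __init__(self, errors):
        super().__init__("; ".join(e.get("title", "unknown error") for e in errors))
        self.errors = errors


def unwrap(response: dict, on_warning=print):
    """Report warnings, raise on errors, otherwise return the data."""
    for warning in response.get("warnings", []):
        on_warning(warning.get("title", "unknown warning"))
    if response.get("errors"):
        raise ApiError(response["errors"])
    return response.get("data")


# Hypothetical failed call: no resource was created, so "data" is empty.
failed = {
    "data": {},
    "errors": [{"title": "The zip code is invalid."}],
    "warnings": [],
}
try:
    unwrap(failed)
except ApiError as exc:
    print("request failed:", exc)
```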

Slide 53

Slide 53 text

Questions? Open Discussion @andrecedik

Slide 54

Slide 54 text

Sources • Atlas Agena with Mariner 1: NASA, https://commons.wikimedia.org/wiki/File:Atlas_Agena_with_Mariner_1.jpg • Ariane 5: DLR German Aerospace Center, https://www.flickr.com/photos/48213136@N06/8958839420 • Mars Climate Orbiter: NASA/JPL/Corby Waste, https://commons.wikimedia.org/wiki/File:Mars_Climate_Orbiter_2.jpg • Bug de l'an 2000: https://commons.wikimedia.org/wiki/File:Bug_de_l%27an_2000.jpg • Pentium: Konstantin Lanzet, https://commons.wikimedia.org/wiki/File:KL_Intel_Pentium_A80501.jpg • Hawaii Missile Alert SMS: https://twitter.com/tulsigabbard/status/952243723525677056