Slide 4
Error handling is hard to get right
"Without correct error propagation,
any comprehensive failure policy is
useless … We find that error
handling is occasionally correct.
Specifically, we see that low-level
errors are sometimes lost as they
travel through [...] many layers [...]"
EIO: Error handling is occasionally correct.
H. S. Gunawi, et al. In Proc. of the 6th USENIX Conference on
File and Storage Technologies, FAST’08, 2008.
"Almost all catastrophic failures
(92%) are the result of incorrect
handling of non-fatal errors explicitly
signaled in software"
Simple Testing Can Prevent Most Critical
Failures: An Analysis of Production Failures in
Distributed Data-Intensive Systems.
Ding Yuan, et al., University of Toronto. In Proceedings of the
11th USENIX Symposium on Operating Systems Design and
Implementation, OSDI’14, 2014.