There are multiple good arguments there, but the most compelling one to me is about latent bugs. Fallback code runs less often, and is harder to test. We know from experience that it's more likely to have bugs. What about empirically? That too. 2/
Yuan et al. (https://www.usenix.org/system/files/conference/osdi14/osdi14-paper-yuan.pdf) find: "almost all ... system failures are the result of incorrect handling of non-fatal errors". Li et al. (https://www.usenix.org/system/files/conference/atc17/atc17-li_yiwen.pdf) find that the majority of kernel bugs are in unpopular code. 3/
Ok, so those aren't *quite* the same thing, but they are some evidence that code that does unusual stuff is more likely to have bugs. 4/
So why do we still feel tempted to build fallback into our systems? Not failover, or replication, but fallback to a completely separate mode of operation? That seems to be because it's tempting! After all, who doesn't want their system to be able to handle failures? 5/
Who doesn't want to be the person who wrote the code that pulls a rabbit out of a hat when all else is lost? Who wants to be accused of not thinking about, and planning for, a failure case? 6/
That's the first-order effect. The second-order one is what happens as we fix stuff. We do COEs or post-mortems, which is good. The obvious conclusion to a COE is "fix the system so that failure X doesn't break failure Y next time". Add complexity. Add fallback. 7/
It's much harder to write a COE that advocates for deleting code, or simplifying. So systems build up scar tissue over time. Often that really helps, but the added complexity and extra modes are a real liability. 8/
To counter this second-order effect you need to think about the large-scale impact of your COE action items. Thinking holistically about the whole system, will they reduce or increase future problems? If they add complexity, is it worth it? 9/
Fallback modes are seldom worth it, whether they were designed in from the beginning, or added later.
You can follow @MarcJBrooker.