Thread by @copyconstruct, OH: Health checks are like bloom filters. A failing health check means [...]

Cindy Sridharan

copyconstruct

OH: Health checks are like bloom filters. A failing health check means a service isn't up, but a health check passing means the service is *probably* "healthy", especially given how health checks are typically done (host level checks or hitting an HTTP endpoint/RPC method).
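To make the analogy concrete, here is a minimal Bloom filter sketch in Python. The class name `BloomFilter`, its parameters, and the SHA-256-based hashing are illustrative choices, not anything from the thread. The point is the asymmetry: a "no" is certain, a "yes" is only probable.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: 'no' answers are certain, 'yes' answers are only probable."""

    def __init__(self, size_bits: int = 64, num_hashes: int = 3) -> None:
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, item: str):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item: str) -> bool:
        # False means "definitely never added"; True means "possibly added".
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


# A deliberately tiny filter (one bit) forces a false positive.
tiny = BloomFilter(size_bits=1, num_hashes=1)
tiny.add("service-a")
false_positive = tiny.might_contain("service-b")  # never added, yet reported present
```

Read "might_contain returned True" as "the health check passed": it rules out the definite negative, but says little more than that.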

Which raises the question: how do we even define "health"? For all the talk about embracing failure, partial failures being the bane of distributed systems, etc., there doesn't seem to be a way to encode that in health checks, which treat the status as a strictly binary outcome.
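One way to picture what a binary check throws away is a rough Python sketch with a third, "degraded" state. Everything here (`Status`, `CheckResult`, `aggregate`, `to_binary`, and the idea of marking dependencies as critical) is a hypothetical illustration, not an existing health check API.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    HEALTHY = "healthy"
    DEGRADED = "degraded"    # partial failure: still serving, but not fully
    UNHEALTHY = "unhealthy"


@dataclass
class CheckResult:
    name: str
    ok: bool
    critical: bool  # assumption: a failed critical dependency means we cannot serve at all


def aggregate(results: list[CheckResult]) -> Status:
    """Combine per-dependency results into a three-valued status."""
    if any(r.critical and not r.ok for r in results):
        return Status.UNHEALTHY
    if any(not r.ok for r in results):
        return Status.DEGRADED
    return Status.HEALTHY


def to_binary(status: Status) -> bool:
    """What a pass/fail health check reports: the degraded state is lost."""
    return status is not Status.UNHEALTHY


results = [
    CheckResult(name="database", ok=True, critical=True),
    CheckResult(name="cache", ok=False, critical=False),
]
status = aggregate(results)   # Status.DEGRADED
passing = to_binary(status)   # True: the check passes despite the partial failure
```

Collapsing `DEGRADED` into a pass is exactly the "probably healthy" answer from the first tweet; collapsing it into a fail would instead take a mostly working service out of rotation.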

Though I guess the folks doing Kubernetes can leverage a "liveness probe" to configure more interesting checks?

Or perhaps not, since a failed liveness probe just gets the container restarted.
