Thread by Matt Moore (@mattomata)

One of my favorite, but very subtle, things that @KnativeProject Serving simplifies about Kubernetes development is draining Pods, which is a startlingly hard thing to get right, as evidenced by the multitude of articles on the subject.

When K8s deletes Pods, it first sends the containers a SIGTERM, and if they are still running after the terminationGracePeriodSeconds (default 30s) it sends them a SIGKILL.

Probably the most common pitfall here is to not handle SIGTERM at all, so the typical failure mode is that things sit around for 30s and get a SIGKILL.

Fortunately K8s starts removing a Pod from Endpoints when it begins terminating it, but what if it was your last Pod?

With just K8s Services this means 503s while a new Pod is spun up.

With Knative, we protect services with just a handful of Pods the same way we handle scaling to zero. When that single Pod disappears with Knative, our "activator" is already there to avoid dropping requests while the new Pod comes online.

Eventually folks handle SIGTERM, but they typically do this by immediately terminating the process (hereafter the "YOLO exit").

K8s hasn't finished draining the endpoints at that point, so it will keep routing traffic to the now-missing Pods. You guessed it: 503s.

With Knative, we set a "preStop" hook on the user container that tells our sidecar that K8s is trying to shut things down. This lets us fail readiness probes in the queue-proxy BEFORE the SIGTERM is delivered to the user container.

So even if you "YOLO exit"? No 503s.

One other odd thing we do is to set a *higher* default terminationGracePeriodSeconds, which is often a source of questions (you can configure this down).

We set this value to match the `timeoutSeconds` field we surface on our Revisions.

This is another safeguard so that folks who improperly implement SIGTERM with "YOLO exit" don't cancel outstanding requests.

Once our sidecar receives the preStop hook, it waits for all outstanding requests to close before letting the SIGTERM through.

In a nutshell, we've made it nearly impossible for an improperly implemented K8s lifecycle to result in dropped traffic, and made implementing things properly (in the context of @KnativeProject) super simple: "YOLO exit".

So ask yourself: "Is this what I want my developers spending their time on?"

If not, then let @KnativeProject handle this for you, and "YOLO exit" away.

LOL I missed one! Since our abstraction is built around a well-known network path, we also help avoid drops as things come up!

By default we perform a TCP readiness probe to ensure your container is listening before it starts showing up in Endpoints.

Without a readiness probe, your container will start showing up in Endpoints as soon as the kubelet has started your containers. So your container starting its HTTP server is racing with the K8s networking layer distributing your Pod's IP.
