Introduction
Let’s talk about Kubernetes probes and why they matter in your deployments. When managing production-facing containerized applications, even small optimizations can have enormous benefits.
Reducing deployment times, helping your applications react better to scaling events, and keeping your running pods healthy all require fine-tuning your container lifecycle management. This is exactly why proper configuration (and implementation) of Kubernetes probes is vital for any critical deployment. They help your cluster make intelligent decisions about traffic routing, restarts, and resource allocation.
Properly configured probes dramatically improve your application reliability, reduce deployment downtime, and handle unexpected errors gracefully. In this article, we’ll explore the three types of probes available in Kubernetes and how utilizing them alongside each other helps configure more resilient systems.
Quick refresher
Understanding exactly what each probe does and some common configuration patterns is essential. Each of them serves a specific purpose in the container lifecycle and when used together, they create a rock-solid framework for maintaining your application availability and performance.
Startup: Optimizing start-up times
Start-up probes run only while a container is starting, for example when a new pod is spun up because of a scale-up event or a new deployment. The probe fires repeatedly until it succeeds for the first time and is not run again after that. It serves as a gatekeeper for the rest of the container checks, and fine-tuning it will help your applications better handle increased load or service degradation.
Sample Config:
startupProbe:
  httpGet:
    path: /health
    port: 80
  failureThreshold: 30
  periodSeconds: 10
Key takeaways:
- Keep periodSeconds low, so that the probe fires often and quickly detects a successful start.
- Set failureThreshold high enough to accommodate the worst-case start-up time.
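The startup probe checks whether your container has started by querying the configured path. With the sample values, the container gets up to 30 × 10 = 300 seconds to start; if the probe still has not succeeded by then, the container is killed and handled according to the pod’s restart policy. The liveness and readiness probes are held back until the startup probe succeeds.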
Liveness: Detecting dead containers
Your liveness probes answer a very simple question: “Is this container still running properly?” If not, K8s will restart it.
Sample Config:
livenessProbe:
  httpGet:
    path: /health
    port: 80
  periodSeconds: 10
  failureThreshold: 3
Key takeaways:
- Since a failed liveness check makes K8s kill and restart your container, set failureThreshold high enough to tolerate intermittent failures.
- Avoid using initialDelaySeconds, as a fixed delay is too restrictive. Use a start-up probe instead.
Be mindful that a failing liveness probe will kill your currently running container and start a fresh one in the same pod, so avoid making it too aggressive. Aggressive checks are the job of the next probe.
Readiness: Handling unexpected errors
The readiness probe determines whether a container should start, or continue, to receive traffic. It is extremely useful in situations where your container has lost its connection to the database or is otherwise overloaded and should not receive new requests.
Sample Config:
readinessProbe:
  httpGet:
    path: /health
    port: 80
  periodSeconds: 3
  failureThreshold: 1
  timeoutSeconds: 1
Key takeaways:
- Since this is your first guard for stopping traffic to unhealthy targets, make the probe aggressive and reduce periodSeconds.
- Keep failureThreshold at a minimum; you want to fail fast.
- Keep timeoutSeconds low as well, so that a container that is slow to respond is taken out of rotation quickly.
- Give the container ample time to recover while it is marked unready by making the livenessProbe slower to fail than the readinessProbe.
Readiness probes ensure that traffic does not reach a container that is not ready for it, which makes them one of the most important probes in the stack.
Putting it all together
As you can see, although each probe has its own distinct use, the best way to improve your application’s resilience is to use them alongside each other.
Your startup probe will assist you in scale-up scenarios and new deployments, allowing your containers to be brought up quickly. It runs only during container start-up and holds back the rest of the probes until it succeeds.
The liveness probe helps in dealing with dead containers suffering from non-recoverable errors and tells K8s to restart the container, giving you a fresh one inside the same pod.
The readiness probe is the one telling K8s whether a pod should receive traffic. It can be extremely useful when dealing with intermittent errors or high resource consumption that results in slower response times.
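To make the interplay concrete, here is a minimal sketch of a Deployment that puts the three sample configurations from this article on a single container. The Deployment name, labels, replica count, and image are placeholders made up for illustration; only the probe values come from the samples earlier in the article, and you should tune them to your own application.

```yaml
# Sketch only: "web", the labels, and the image are hypothetical placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 2
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: example/web:1.0        # hypothetical image
          ports:
            - containerPort: 80
          # Gatekeeper: liveness and readiness wait until this succeeds.
          startupProbe:
            httpGet:
              path: /health
              port: 80
            periodSeconds: 10
            failureThreshold: 30        # up to 30 x 10 = 300 s to start
          # Slow to fail: restarts the container only after 3 failures in a row.
          livenessProbe:
            httpGet:
              path: /health
              port: 80
            periodSeconds: 10
            failureThreshold: 3
          # Quick to fail: a single failed check stops new traffic.
          readinessProbe:
            httpGet:
              path: /health
              port: 80
            periodSeconds: 3
            failureThreshold: 1
            timeoutSeconds: 1
```

With these values the readiness probe reacts within a few seconds, while the liveness probe needs three consecutive failures before a restart, which leaves the container a window to recover on its own.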
Additional configurations
Probes can also be configured to run a command inside the container instead of sending an HTTP request, and pods can be given ample time to terminate safely. While these options are useful in more specific scenarios, understanding how you can extend your deployment configuration can be beneficial, so I’d recommend doing some additional reading if your containers handle unique use cases.
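As a rough illustration of those two options, the sketch below uses an exec check in place of httpGet and raises the pod’s termination grace period. The pod name, image, file path, and the 60-second value are assumptions chosen for the example, not recommendations; the check assumes the application creates /tmp/healthy while it is healthy.

```yaml
# Sketch only: name, image, file path, and timings are hypothetical.
apiVersion: v1
kind: Pod
metadata:
  name: worker
spec:
  # Time the pod is given to shut down cleanly before being force-killed.
  terminationGracePeriodSeconds: 60
  containers:
    - name: worker
      image: example/worker:1.0         # hypothetical image
      livenessProbe:
        # The check passes when the command exits with status 0.
        exec:
          command:
            - cat
            - /tmp/healthy
        periodSeconds: 10
        failureThreshold: 3
```

An exec check fits containers that do not expose an HTTP endpoint, such as background workers.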
Further reading:
Liveness, Readiness, and Startup Probes
Configure Liveness, Readiness and Startup Probes