In this hands-on exercise, we troubleshoot a Kubernetes deployment issue in the web namespace. Starting with an overview of all resources, we identify a pod stuck in the ImagePullBackOff state, a common issue when the specified container image is unavailable or incorrectly tagged. The exercise covers essential kubectl commands and reinforces key concepts such as diagnosing pod issues, editing deployments, and validating changes in a live cluster.
- First, we list every resource across all namespaces to get a glimpse of our infrastructure:
kubectl get all --all-namespaces
- Next, we narrow the view to the web namespace:
kubectl get svc,po,deploy -n web
- This returns the services, pods, and deployments in the web namespace, along with the status of each pod.
- Any pod listed here can be investigated; let's pick the first one and describe it. The Events section at the end of the output shows why the image pull fails:
kubectl describe pod nginx-856876659f-f9cqq -n web
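When a namespace holds many pods, scanning for the failing ones by eye gets tedious. As a minimal sketch, the status check could be scripted with a small filter over the default tabular output of kubectl get po. The function name image_pull_failures is hypothetical, and the filter assumes STATUS is the third column, as it is in the default output.

```shell
# Print the names of pods whose STATUS column reports an image pull failure.
# Reads the default tabular output of `kubectl get po` from stdin.
# image_pull_failures is a hypothetical helper name, not part of kubectl.
image_pull_failures() {
  # NR > 1 skips the header row; $3 is the STATUS column, $1 the pod name.
  awk 'NR > 1 && ($3 == "ImagePullBackOff" || $3 == "ErrImagePull") { print $1 }'
}

# Usage against a live cluster (not executed here):
#   kubectl get po -n web | image_pull_failures
```

Each name printed this way can be passed to kubectl describe pod for a closer look.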
- The error messages point to a problem with the image tag. The image is set in the deployment's pod template, so we edit the deployment rather than the individual pod:
kubectl edit deploy nginx -n web
- In the editor, delete the :191 tag from the image, press Escape, and type :wq! to save and exit. Saving the change triggers a new rollout.
- Now, let's verify that the change has taken effect:
kubectl get rs -n web
- The output lists the replica sets in the namespace. Once the rollout completes, the new replica set should report its pods as ready, while the old one is scaled down to zero.
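The same image fix can be applied without opening an editor. The sketch below is one possible approach, not the method used in this exercise: a hypothetical helper, strip_image_tag, drops the tag from an image reference, and the commented usage passes the result to kubectl set image. It assumes that the container inside the deployment is named nginx and that the broken image reference is nginx:191; check both against the describe output first.

```shell
# Remove a trailing ":tag" from an image reference (nginx:191 -> nginx).
# References pinned by digest (name@sha256:...) are returned unchanged.
# strip_image_tag is a hypothetical helper name, not part of kubectl.
strip_image_tag() {
  image="$1"
  # Look only at the final path component so a registry port
  # (for example localhost:5000/nginx) is not mistaken for a tag.
  last="${image##*/}"
  case "$last" in
    *@*) printf '%s\n' "$image" ;;
    *:*) printf '%s\n' "${image%:*}" ;;
    *)   printf '%s\n' "$image" ;;
  esac
}

# Usage against a live cluster (not executed here; container name nginx is assumed):
#   kubectl set image deployment/nginx nginx="$(strip_image_tag nginx:191)" -n web
#   kubectl rollout status deployment/nginx -n web
```

An image reference without a tag resolves to the latest tag, which is also what deleting :191 in the editor achieves.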
- Let's list our pods along with their IP addresses:
kubectl get po -n web -o wide
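To test more than one pod, the names and IP addresses can be pulled out of the -o wide output. This is a rough sketch with a hypothetical function name, pod_ips; it scans each row for the first IPv4-looking field instead of relying on a fixed column position, since the RESTARTS column can span more than one word.

```shell
# Print "<pod-name> <pod-ip>" for every pod row that carries an IPv4 address.
# Reads the output of `kubectl get po -o wide` from stdin.
# pod_ips is a hypothetical helper name, not part of kubectl.
pod_ips() {
  awk 'NR > 1 {
    # Start at field 2 so the pod name itself is never matched.
    for (i = 2; i <= NF; i++)
      if ($i ~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/) { print $1, $i; break }
  }'
}

# Usage against a live cluster (not executed here):
#   kubectl get po -n web -o wide | pod_ips
```

Pods that have no IP assigned yet produce no line, so the output lists only pods that can be contacted.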
- Now let's spin up a BusyBox pod to test the health of one of these pods:
kubectl run busybox --image=busybox --rm -it --restart=Never -- sh
- From the BusyBox shell, we request the first pod by its IP address. A returned HTML page confirms that the pod is serving traffic:
wget -qO- 10.244.2.12:80
- With this, we can conclude that the broken pods are fixed and that nginx is reachable from inside the cluster.