In this post, let's focus on troubleshooting a K8s cluster. Troubleshooting in K8s is a broad topic; since we have mainly been dealing with clusters, we will learn about monitoring, troubleshooting, and debugging them.
Sections of troubleshooting
Debugging your application
Debugging your cluster
Debugging your application
Here we will cover issues related to applications deployed in a K8s cluster, following the workload and container messages and debugging them.
Diagnosing the problem
The first step is to identify where the issue lies in the deployed application.
Debugging Pods, Services, and Replication Controllers (RC)
Check the current state of pods -
kubectl describe pods ${POD_NAME}
Validate the YAML file (see the sketch below for what such a file might look like) - kubectl apply --validate -f mypod.yaml
Check the current state of replication controller -
kubectl describe rc ${CONTROLLER_NAME}
Check the status of the service -
kubectl get endpoints ${SERVICE_NAME}
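For the validation step above, a minimal sketch of what mypod.yaml might contain (the pod name, labels, and image here are purely hypothetical):
apiVersion: v1
kind: Pod
metadata:
  name: mypod              # hypothetical pod name
  labels:
    app: myapp             # hypothetical label
spec:
  containers:
  - name: myapp            # hypothetical container name
    image: nginx:1.25      # hypothetical image
    ports:
    - containerPort: 80
Running kubectl apply --validate -f mypod.yaml against a file like this flags misspelled or unknown fields before the object reaches the cluster.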
Debugging Running Pods
To list the running pods and get more detail on each pod:
kubectl get pods
kubectl describe pod ${POD_NAME}
To get events related to the namespace
kubectl get events
kubectl get events --namespace=my-namespace
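If you only want events for one object, they can be filtered with a field selector; a sketch, assuming a hypothetical pod named mypod in my-namespace:
kubectl get events --namespace=my-namespace --field-selector involvedObject.name=mypod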
Examining pod logs
First, look at the logs of the affected container:
kubectl logs ${POD_NAME} ${CONTAINER_NAME}
Access the previous container's crash log with:
kubectl logs --previous ${POD_NAME} ${CONTAINER_NAME}
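While reproducing a problem, it often helps to stream the logs and limit output to the most recent lines; a sketch using the same placeholders:
kubectl logs -f --tail=50 ${POD_NAME} ${CONTAINER_NAME}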
Debugging service
Once the service is exposed, we can view it using the commands below:
kubectl get svc hostnames
Check if the service's endpoints exist
kubectl get endpoints hostnames
Check if kube-proxy is working
ps auxw | grep kube-proxy
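It can also help to test the service from inside the cluster; a sketch that starts a throwaway busybox pod (dns-test is just a hypothetical pod name) and looks up the hostnames service via DNS:
kubectl run dns-test --image=busybox:1.28 -it --rm --restart=Never -- nslookup hostnames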
Refer to the official Kubernetes documentation to learn more about debugging services.
Debugging StatefulSet
First, make sure the kubectl command-line tool is installed.
List all the pods that belong to a StatefulSet:
kubectl get pods -l app.kubernetes.io/name=MyApp
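To look at the StatefulSet object itself rather than its Pods, a sketch assuming a hypothetical StatefulSet named web:
kubectl get statefulset web
kubectl describe statefulset web
kubectl rollout status statefulset/web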
If you find that any Pods listed are in an Unknown or Terminating state for an extended period of time, they can be deleted. To delete the StatefulSet (which, by default, also deletes its Pods):
kubectl delete statefulsets <statefulset-name>
The associated headless service must also be deleted using
kubectl delete service <service-name>
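If only a single Pod is stuck in Terminating (rather than the whole StatefulSet), it can be force-deleted; a sketch with a hypothetical Pod name web-0 (use with caution, since this skips the normal graceful shutdown):
kubectl delete pod web-0 --grace-period=0 --force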
Debugging Init Containers
Checking the status of Init Containers
Check the status of your pod: kubectl get pod <pod-name>
Get details of the init containers: kubectl describe pod <pod-name>
Debugging logs from an Init Container
- kubectl logs <pod-name> -c <init-container-2>
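The state of each init container is also recorded in the Pod's status and can be read directly; a sketch using jsonpath with the same placeholder pod name:
- kubectl get pod <pod-name> -o jsonpath='{.status.initContainerStatuses[*].state}'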
Get a Shell to a Running Container
- Get a shell to the running container: kubectl exec --stdin --tty shell-demo -- /bin/bash
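If the image does not include bash, /bin/sh is a common fallback, and exec can also run a single command without an interactive shell (shell-demo is the same example pod name as above):
- kubectl exec --stdin --tty shell-demo -- /bin/sh
- kubectl exec shell-demo -- ls /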
Debugging Cluster
Listing your Cluster:
To get nodes: kubectl get nodes
To get detailed information about the overall health of your cluster, you can run: kubectl cluster-info dump
To get detailed information about a specific node:
kubectl describe node kube-worker-1
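A couple of additional views that are often useful here; kube-worker-1 is the same example node name as above:
kubectl get nodes -o wide
kubectl get node kube-worker-1 -o jsonpath='{.status.conditions}'
The conditions field shows whether the node is Ready and whether it is under memory, disk, or PID pressure.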
Looking at Logs
On systemd-based systems, you may need to use journalctl
instead of examining log files.
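For example, on a node where the kubelet runs as a systemd unit (the unit name kubelet is the common default, but it can differ by installation):
journalctl -u kubelet
journalctl -u kubelet --since "1 hour ago"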
Control Plane nodes
/var/log/kube-apiserver.log - API Server, responsible for serving the API
/var/log/kube-scheduler.log - Scheduler, responsible for making scheduling decisions
/var/log/kube-controller-manager.log - a component that runs most Kubernetes built-in controllers, with the notable exception of scheduling (the kube-scheduler handles scheduling)
Worker Nodes
/var/log/kubelet.log - logs from the kubelet, responsible for running containers on the node
/var/log/kube-proxy.log - logs from kube-proxy, which is responsible for directing traffic to Service endpoints
Thanks for reading my blog. I hope it helps you understand a few topics in K8s cluster troubleshooting.
Suggestions are always welcome.
~~Saraa