Kubernetes has several components that produce logs and events containing information on everything that has happened in a Kubernetes cluster. Keeping track of all this data becomes extremely challenging when you run Kubernetes at a very large scale.
With so many components generating logs, organizations need a centralized place to see it all. But this is only half your problem. You also need to correlate logs coming from different components to draw the right conclusions and take effective actions.
Auditing thus takes on even greater importance as a way of keeping an eye on your system.
This post will explore auditing, what Kubernetes events or logs you should be watching, and a few steps to take for establishing an effective auditing practice.
Since traditional VMs have a limited number of components, you can monitor them easily. Simply running a less or tail command lets you watch a handful of files such as syslog, lastlog, and the dpkg log.
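On a Debian-based VM, for instance, following the main system log is a single command (the exact path varies by distribution):
tail -f /var/log/syslog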
In Kubernetes, however, there are multiple components, with each component possibly running on different machines and producing its own logs and events.
So how do you achieve proper auditing in Kubernetes with so many things to keep an eye on? First, let’s discuss what effective auditing entails.
Getting auditing right allows you to answer the what, when, who, and where surrounding an event: what changed, when it changed, who made the change, and where in the cluster it happened.
Your auditing will only be efficient when you know what to monitor for complete visibility. Let’s review everything your auditing should cover in a Kubernetes ecosystem.
The API server is the most important part of Kubernetes since all the other components talk to it to get the information they need to perform any action.
Monitoring the API server’s logs will help you discover any unwanted activity. Unfortunately, this can be tricky in managed offerings like AWS EKS, Azure AKS, and GKE, where the control plane is operated for you. If you run your own control plane, things are a bit easier, as you can review API server logs at defined locations.
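As a rough sketch for a self-managed cluster (a kubeadm-style setup is assumed here), you can read the API server’s own logs directly and, for a proper audit trail, enable the built-in audit log with a small policy file. The node name and file paths below are placeholders:
kubectl logs -n kube-system kube-apiserver-<control-plane-node> (to read the API server’s own logs)
A minimal audit policy, saved for example as /etc/kubernetes/audit-policy.yaml, records who touched Secrets without logging their contents and captures everything else at the request level:
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
  - level: Metadata
    resources:
      - group: ""
        resources: ["secrets"]
  - level: Request
The API server is then pointed at the policy via its --audit-policy-file=/etc/kubernetes/audit-policy.yaml and --audit-log-path=/var/log/kubernetes/audit.log flags.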
When dealing with Kubernetes objects, each object typically has one or more controllers working on it. When these controllers perform an action, they emit events that are visible through the Kubernetes API. You can retrieve these events using the following kubectl commands:
kubectl get events (to retrieve all events)
kubectl get events -n namespace (to retrieve all events for a namespace)
kubectl get events --watch (to stream events in real time)
kubectl get events --field-selector involvedObject.kind=Pod,involvedObject.name=my-pod (to retrieve events for a specific pod; multiple field selectors are comma-separated in a single flag)
You can also see recent events at the end of the output when you run a describe command on any object.
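If you are mainly interested in problems, filtering on the event type is another standard field selector worth knowing:
kubectl get events --field-selector type=Warning (to list only warning events)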
The most basic unit you deploy in Kubernetes is the pod, which wraps one or more containers. You can access container logs for pods using the following commands:
kubectl logs podname -n namespace -c containername
kubectl logs deployment/deploymentname -c containername
These are two basic examples; see the Kubernetes documentation for the full list of options.
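Two standard kubectl logs flags are also worth knowing when troubleshooting:
kubectl logs podname -n namespace -c containername --previous (to read logs from the previous, crashed instance of a container)
kubectl logs podname -n namespace -c containername -f --since=1h (to stream the last hour of logs and follow new output)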
Each node in Kubernetes runs a process called the kubelet. This process is responsible for receiving instructions from the API server and executing them on the node. Simply look up the logs for this process on your worker node and tail them.
In systemd-based Linux systems, you can find them by using the command:
journalctl -u kubelet
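During an incident, journalctl’s standard filtering flags help narrow that down, for example:
journalctl -u kubelet -f --since "1 hour ago" (to follow only the last hour of kubelet logs)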
Keeping an eye on metrics also plays an important role in identifying event triggers. For example, high CPU or memory usage can lead to pods being evicted or OOM-killed, or to health check failures.
Health checks on your pods are also critical. Making sure you have alerts for CPU, memory, and health checks will help you catch more than 80% of issues in production.
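As a starting point, and assuming the metrics-server add-on is installed in the cluster, the built-in kubectl top commands give a quick view of resource usage:
kubectl top nodes (to see CPU and memory usage per node)
kubectl top pods --all-namespaces (to see usage per pod)
On the health-check side, a minimal liveness probe on a container spec might look like the following; the /healthz path and port are placeholders for whatever your application actually exposes:
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 10
  periodSeconds: 15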
Monitoring all of the above is key to achieving efficient auditing, but it is only one part. Organizations must also establish a proper process and make sure it is adhered to across all departments. Below, we discuss a few steps to help you do this.
Compiling audit reports for your clusters and their overall health and performance, as well as regularly publishing them across your organization, establishes transparency. This enables everyone to identify actions (e.g., upgrades, optimizations, or cleanups) and then put them in their sprints.
A clearly defined process is critical for quickly identifying and fixing issues. One option is to create a Jira ticket whenever an issue is found and push it into the owner’s sprint. The owner then addresses the problem, after which proper guardrails should be put in place to avoid recurrence.
For example, let’s say an issue is found where an application port was exposed to the public. A ticket will be created and addressed to DevOps to restrict the port. DevOps will then take action on it and create a step in the pipeline so that no such exposed ports can go to production.
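What that pipeline guardrail looks like depends entirely on your CI system. As one rough sketch, a shell step that fails the build whenever a Service manifest in the repository requests a publicly reachable type could be as simple as this (the manifests/ directory is a placeholder):
# Fail the build if any Service manifest asks for a NodePort or LoadBalancer
if grep -rEl 'type: *(NodePort|LoadBalancer)' manifests/; then
  echo "Publicly exposed Service types found; blocking the deploy." >&2
  exit 1
fi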
Alerts make sure the right person is notified when an event occurs. With proper review, the security team should be able to catch issues before they escalate; this again entails placing guardrails to prevent recurrence, emphasizing a proactive approach to security.
Your auditing tool should connect to an alerting tool such as PagerDuty and then send alerts out as a message via Slack or email based on their level of severity. For example, low- and medium-severity alerts can be sent via email, and high-severity ones can be sent via Slack, while the phone should be used for critical alerts.
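If Prometheus Alertmanager is what routes your alerts to those channels (an assumption here; the same idea applies to other alerting tools), severity-based routing takes only a few lines of configuration. The sketch below is trimmed to the routing logic, with receiver names and the PagerDuty key as placeholders; global settings such as SMTP and Slack endpoints are omitted:
route:
  receiver: email-team
  routes:
    - matchers: ['severity="high"']
      receiver: slack-oncall
    - matchers: ['severity="critical"']
      receiver: pagerduty-oncall
receivers:
  - name: email-team
    email_configs:
      - to: platform-team@example.com
  - name: slack-oncall
    slack_configs:
      - channel: '#k8s-alerts'
  - name: pagerduty-oncall
    pagerduty_configs:
      - routing_key: <pagerduty-integration-key>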
Access control can be problematic when many developers are working on a Kubernetes cluster. It’s crucial to perform regular audits to make sure the principle of least privilege is being followed, which, in turn, helps avoid unnecessarily elevated permissions.
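Before reaching for dedicated tooling, a couple of standard kubectl commands already go a long way for spot checks (the username and namespace here are placeholders):
kubectl auth can-i --list --as=jane.doe -n production (to see everything a given user may do in a namespace)
kubectl get clusterrolebindings -o wide (to review who is bound to cluster-wide roles)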
There are tools to help with this. Komodor provides you with a centralized, cross-cluster view of what access has been given to what users. Its out-of-the-box policies and roles allow you to govern permissions to make sure users are granted the proper access.
None of the above matters if you don’t have all logs and events in one place, correlated, and presented in a single dashboard. With so much data, you need a way to make sense of it and identify what matters most; this is where an aggregated and correlated view helps.
When you’re running production workloads in Kubernetes, you have to be sure that nothing goes wrong in terms of security and compliance. You might accidentally expose a service to the public, or schedule every replica of a service on a single node that then goes down and takes your app with it. Issues like these lead to improperly elevated permissions, reduced resiliency, and more.
When done properly, auditing can help you avoid such problems and achieve a safer production environment. Establishing a proper auditing practice around logs and events ensures that no issue gets lost—and haunts you later on. To do this, developers must give the same attention to security and other Kubernetes-related issues as they do to their normal deliveries.
Komodor can help developers implement an effective auditing habit. It provides a centralized place to aggregate and correlate logs, identify action items, and alert the appropriate teams. Explore how Komodor can complement your Kubernetes strategy and book a demo today.