Prometheus is a set of open-source tools for monitoring and alerting in containerized and microservices-based environments. It is most commonly used with Kubernetes but also supports other cloud native environments. With Kubernetes Prometheus monitoring, you can configure live notification feeds and run flexible monitoring queries. It provides visibility into containerized applications, APIs, and workloads, which are typically difficult to observe given their complex, distributed nature.
Prometheus can also help you implement security for cloud-native applications by detecting unusual traffic and behaviors that may indicate a threat or escalate into a cyberattack.
This is part of a series of articles about Kubernetes monitoring.
Prometheus uses a pull model to retrieve information. It sends HTTP scrape requests to its targets according to the configuration defined in the deployment file, then parses the responses and stores them alongside the relevant Kubernetes metrics. Prometheus keeps this data in its own time series database, which allows it to handle large volumes of data. This design lets a single server monitor thousands of virtual machines at the same time.
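To make the scrape step more concrete, here is a minimal Python sketch of what parsing a scrape response might look like. It handles only a simplified subset of the Prometheus text exposition format (no escape handling, timestamps ignored), and the names `parse_scrape` and `EXAMPLE_BODY` are illustrative assumptions, not part of Prometheus itself.

```python
import re

# Matches lines such as: http_requests_total{method="get",code="200"} 1027
SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)'  # metric name
    r'(?:\{(?P<labels>.*)\})?'               # optional label set
    r'\s+(?P<value>\S+)'                     # sample value
)
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_scrape(body):
    """Parse a simplified subset of the text format into (name, labels, value) tuples."""
    samples = []
    for line in body.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blank, HELP and TYPE lines
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue  # ignore anything this simplified parser does not understand
        labels = dict(LABEL_RE.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


# Hand-written example of what a target might return for a scrape request.
EXAMPLE_BODY = """\
# HELP http_requests_total Total HTTP requests.
# TYPE http_requests_total counter
http_requests_total{method="get",code="200"} 1027
http_requests_total{method="post",code="500"} 3
process_cpu_seconds_total 12.5
"""

for name, labels, value in parse_scrape(EXAMPLE_BODY):
    print(name, labels, value)
```

In a real deployment the body would come from an HTTP GET against a target's metrics endpoint, and Prometheus would attach a timestamp and target labels before storing each sample.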
Before Prometheus can collect data, you need to make sure that the data is correctly formatted and exposed. Prometheus can retrieve data directly from an application that is instrumented with a client library, or through an exporter. An exporter is a software component that sits alongside an application you cannot fully control and exposes that application's data on its behalf. Exporters accept the HTTP requests that Prometheus sends, ensure that the data is in a Prometheus-supported format, and return the relevant data to the central server.
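As a rough illustration of the exporter idea, the following Python sketch uses only the standard library to answer scrape requests on `/metrics`. The `read_app_stats` function, the `myapp_` prefix, and port 9000 are hypothetical placeholders; a production exporter would normally be built on an official client library instead.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def read_app_stats():
    """Hypothetical stand-in for querying the application being monitored."""
    return {"queue_depth": 7, "active_connections": 42}


def render_metrics(stats):
    """Render the stats as gauges in the Prometheus text exposition format."""
    lines = []
    for name, value in sorted(stats.items()):
        metric = f"myapp_{name}"  # the "myapp_" prefix is an illustrative choice
        lines.append(f"# TYPE {metric} gauge")
        lines.append(f"{metric} {value}")
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(read_app_stats()).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=9000):
    """Block forever, answering scrape requests on the given port."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

Calling `serve()` starts the endpoint. Prometheus would then need to be told, through its scrape configuration or service discovery, that this target exists.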
With exporters attached, each application can return data to Prometheus, but you still need to tell Prometheus where to find the data. Prometheus uses service discovery to identify targets for scraping data.
A Kubernetes cluster should already have labels and tags, making it easier to keep track of each element's status and any changes. The Kubernetes API enables Prometheus to discover data targets, including nodes, services, pods, endpoints, and ingresses.
Prometheus retrieves machine and application metrics separately. It is therefore necessary to use a node exporter to expose information such as CPU, memory, bandwidth, and disk metrics. You also need to expose metrics related to cgroups. The easiest option for this is cAdvisor, a node-level exporter that is built into the kubelet on each Kubernetes node.
After Prometheus has collected the data, you can use PromQL to query it. The results can be displayed in a graphical interface like Grafana or used in alerting rules that send notifications through Alertmanager.
Itiel Shwartz
Co-Founder & CTO
In my experience, here are tips that can help you maximize the effectiveness of Prometheus for monitoring Kubernetes:
Use Prometheus federation to scale your monitoring setup by aggregating data from multiple Prometheus instances. This allows for centralized monitoring of large environments and ensures that data collection remains efficient and manageable.
Set appropriate data retention policies to balance the need for historical data with storage costs. Use remote storage solutions for long-term retention if necessary, to offload the main Prometheus server and improve performance.
Deploy Prometheus in a high-availability configuration to ensure continuous monitoring even during server failures. This typically involves running multiple Prometheus instances with the same configuration, each scraping the same targets, and placing a load balancer in front of them to distribute queries.
Define recording rules in Prometheus to pre-compute frequently queried metrics. This reduces query load and speeds up dashboard rendering, especially for complex or resource-intensive queries.
Set up monitoring for your Prometheus servers to track their health and performance. Use alerting rules to notify you of any issues such as high CPU usage, large query execution times, or storage bottlenecks.
Among the advantages of Prometheus are:
Among the disadvantages of Prometheus are:
Here are some best practices to make the most of Prometheus to monitor Kubernetes.
Labels allow you to specify the data and context for your metrics. However, each set of labels takes up resources, such as CPU, RAM, bandwidth, and disk space. While insignificant at a small scale, the resources consumed can build up for large Kubernetes projects, driving up costs.
It is best to keep the number of labels on each metric below 10, and most metrics need no labels at all. If some of your metrics have too many labels, you might benefit from using a dedicated analysis tool instead.
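The reason labels get expensive is that every distinct combination of label values becomes its own time series. The short Python sketch below estimates an upper bound on the series count for one metric; the label names and value counts are made-up numbers for illustration.

```python
from math import prod


def series_count(label_values):
    """Upper bound on time series for one metric.

    `label_values` maps each label name to its number of distinct values;
    the bound is the product of those counts (1 for a metric with no labels).
    """
    return prod(label_values.values())


# Hypothetical metric with three bounded labels.
bounded = {"method": 5, "status": 10, "pod": 200}
print(series_count(bounded))

# Adding one unbounded label (for example a user ID) multiplies the total.
unbounded = dict(bounded, user_id=50_000)
print(series_count(unbounded))
```

The jump between the two results shows why labels with unbounded values, such as user IDs or request IDs, are usually better kept out of metrics.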
A common mistake is to export the time elapsed since an event occurred. When tracking the timing of events, you should instead export only the timestamp that marks when each event happened. This approach removes the need for logic that keeps updating the value and minimizes errors. You can then determine the time elapsed since the event by calculating:
time() - my_timestamp_metric
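The same idea can be sketched on the instrumentation side in Python. The `LastSuccess` class below is a hypothetical stand-in for a gauge, not a real client-library type: the application records the Unix time of the event once, and the elapsed time is derived only when someone asks for it.

```python
import time
from typing import Optional


class LastSuccess:
    """Hypothetical gauge-like holder for when an event last happened."""

    def __init__(self) -> None:
        self.timestamp_seconds = 0.0

    def mark(self, now: Optional[float] = None) -> None:
        # Store the Unix time of the event itself, not a running "seconds since" value.
        self.timestamp_seconds = time.time() if now is None else now


def seconds_since(metric: LastSuccess, now: Optional[float] = None) -> float:
    """Python mirror of the PromQL expression time() - my_timestamp_metric."""
    current = time.time() if now is None else now
    return current - metric.timestamp_seconds


backup = LastSuccess()
backup.mark(now=1_000.0)                  # event happened at t = 1000 s
print(seconds_since(backup, now=1_060.0))  # elapsed time computed at read time
```

Because the stored value never changes between events, nothing has to keep running in the application to keep the metric current.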
You should limit the number of metrics updated in critical or frequently called code (for example, code called more than 100k times per second). Incrementing a counter in a Java application usually takes 12-17ns, which adds up to a measurable cost at that call rate. By limiting the number of metrics updated in inner loops (and using fewer labels), you can prevent such issues.
When you do need labels, you can cache the label results to minimize their impact. It is also important to be careful when using time and duration metrics because these measurements require syscalls.
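Caching label results can be sketched as follows. `LabeledCounter` and `Child` are toy classes written for this example, not a real Prometheus API, although client libraries such as the official Python client offer a similar `labels()` call that returns a reusable child object.

```python
class Child:
    """Holds the value for one specific combination of label values."""

    def __init__(self):
        self.value = 0.0

    def inc(self, amount=1.0):
        self.value += amount


class LabeledCounter:
    """Toy stand-in for a client-library counter with labels."""

    def __init__(self, label_names):
        self.label_names = tuple(label_names)
        self.children = {}

    def labels(self, **label_values):
        # Each call builds a key tuple and does a dict lookup.
        key = tuple(label_values[name] for name in self.label_names)
        child = self.children.get(key)
        if child is None:
            child = self.children[key] = Child()
        return child


def process(items, counter):
    # Resolve the label set once, outside the hot loop, and reuse the child.
    processed = counter.labels(stage="parse")
    for _ in items:
        processed.inc()


requests_processed = LabeledCounter(["stage"])
process(range(1000), requests_processed)
print(requests_processed.labels(stage="parse").value)
```

The loop body now performs only the increment, so the cost of resolving labels is paid once per call to `process` rather than once per item.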
There are four key types of metric to use in Prometheus: counters, gauges, histograms, and summaries. It is important to know when to use each type to ensure the most accurate, complete insights.
Prometheus has many advantages; the biggest are that it's free, relatively easy to understand, and provides good observability into your Kubernetes stack. On the other hand, you have to invest time in manually managing Prometheus as well as money in hosting it. You'll also have to spend considerable engineering resources to scale Prometheus beyond a certain point, and with scale comes complexity, which makes breakdowns more likely.
This is where Komodor comes in as a native K8s platform, helping you monitor your entire K8s stack, identify issues, uncover their root cause and understand the necessary action to troubleshoot efficiently and independently. To learn more about how Komodor can empower you and your teams to troubleshoot K8s, sign up for our free trial.