Alerts are everywhere: we get them from messaging apps, email, social accounts, and even streaming services. It can feel like we spend a huge amount of time triaging them, deciding what to read and what to answer. At some point, being bombarded by alerts, many of them unnecessary or unimportant, desensitizes you. And once you suffer from alert fatigue, you are liable to miss the one urgent alert that matters, and the consequences can be detrimental.
Nowhere is this truer than for DevOps engineers and developers, who are strained by an excess of alerts: dozens of IT alerts constantly pouring in from multiple sources, on top of the day-to-day notifications they already receive. More alerts pile up every year as the number of software services deployed by companies keeps growing. Between 2016 and 2019, for example, companies increased their software services by 68% on average, reaching about 120 services per company, according to an analysis by Okta.
Alert fatigue is a symptom of technology use, but both its origin and its solution belong to the domains of management, organizational development, and psychology.
My startup, Komodor, is a vibrant, fast-growing young company with no legacy systems or out-of-date code. Still, there came a point when some of our teams’ coding failures went unnoticed, and we came close to suffering real damage by ignoring one of the many alerts we had received. Being so young and working hard to keep our code “clean,” we assumed we had calibrated our alerts properly. But if we missed such an important one, something wasn’t quite right.
Coding failures are a natural part of the development process in every tech company, but an excess of alerts happens because teams don’t calibrate them properly. By making sure, from the get-go, that alerts only fire when they should, you can prevent alert fatigue and most of the collateral damage that comes with it.
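To make “only when they should” concrete, here is a minimal sketch in Python (the names and thresholds are hypothetical, not any particular tool’s API) of the kind of calibration that keeps an alert quiet on transient spikes: page only when a measured error rate stays above a threshold for a sustained window.

```python
from datetime import datetime, timedelta

# Illustrative calibration values -- tune per service, don't copy blindly.
ERROR_RATE_THRESHOLD = 0.05             # page only above a 5% error rate...
SUSTAIN_WINDOW = timedelta(minutes=10)  # ...and only if it persists for 10 minutes

def should_page(samples, now=None):
    """samples: list of (timestamp, error_rate) tuples, newest last.

    Returns True only if every sample inside the sustain window breaches
    the threshold, i.e. the problem is real and persistent, not a blip.
    """
    now = now or datetime.utcnow()
    window_start = now - SUSTAIN_WINDOW
    recent = [(ts, rate) for ts, rate in samples if ts >= window_start]
    if not recent:
        # No data in the window: don't page here (missing data deserves its own alert).
        return False
    return all(rate > ERROR_RATE_THRESHOLD for _, rate in recent)
```

Most alerting systems express the same idea declaratively; in Prometheus-style rules, for example, the `for` field makes an alert wait until its condition has held for a set duration before it fires.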
Alert fatigue can cause enormous damage to companies. To cope with it properly, we must first understand the different contributing factors.
DevOps teams often deal with alert fatigue by way of a mix of actions and procedures. These can include implementing unified alert management systems, increasing the staff that manages alerts, and adopting more proactive and qualitative management of on-call engineers.
However, to cope with alert fatigue, I recommend first and foremost building a preemptive strategy. The strategy should be based on a thorough, yet simple, alert auditing process. Such a process should address mainly cultural and human factors, and be reinforced with simple, off-the-shelf steps.
An example visualization of alert data
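As an illustration of what such an audit might surface, here is a small, hypothetical Python sketch that scores each alert rule by how often its firings were actually acted on. The log format and the `acted_on` field are assumptions for the example; the point is that rules which fire constantly but are rarely actionable are prime candidates for re-calibration or removal.

```python
from collections import defaultdict

def audit_alerts(alert_log):
    """alert_log: iterable of dicts like {"rule": "HighErrorRate", "acted_on": True},
    one entry per alert fired during the audit period.

    Returns {rule_name: (times_fired, actionable_ratio)} so the noisiest,
    least actionable rules float to the top of the review.
    """
    fired = defaultdict(int)
    acted = defaultdict(int)
    for entry in alert_log:
        fired[entry["rule"]] += 1
        if entry.get("acted_on"):
            acted[entry["rule"]] += 1
    return {rule: (count, acted[rule] / count) for rule, count in fired.items()}

# Example: a rule that fired often but was almost never acted on is noise, not signal.
report = audit_alerts([
    {"rule": "HighErrorRate", "acted_on": True},
    {"rule": "PodRestart", "acted_on": False},
    {"rule": "PodRestart", "acted_on": False},
])
for rule, (count, ratio) in sorted(report.items(), key=lambda kv: kv[1][1]):
    print(f"{rule}: fired {count}x, actionable {ratio:.0%}")
```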
Komodor tracks changes throughout your entire system, correlates and analyzes its complex dependencies, and helps you discover the source of an alert and its effects. This lets you kill two birds with one stone: reduce the fear of a meltdown before releasing a new feature, and sort out malfunctions in a calm, controlled environment. Komodor also allows you to estimate the time needed to resolve issues, addressing the problem of lost engineering hours.
As I’ve discussed here, most alerts originate from human error: either an engineer introduced a code malfunction, or the alert wasn’t calibrated properly. For these reasons, it’s important to address these problems with better organizational processes. At the same time, it’s also important to use technology that provides peace of mind and builds confidence by handling serious alerts according to a defined process, alleviating the psychological effects of alert fatigue such as burnout and fear of failure.