Alerting with Alertmanager

Collecting metrics is great, but when things go south, or ideally BEFORE things go south, you want to get notified.

This is where Alertmanager by Prometheus comes into the picture.

The Alertmanager handles alerts sent by client applications such as the Prometheus server. It takes care of deduplicating, grouping, and routing them to the correct receiver integration such as email, PagerDuty, or OpsGenie. It also takes care of silencing and inhibition of alerts.

How Alertmanager works

  • Prometheus servers evaluate their alerting rules, and when a rule fires, the resulting alert is pushed to the Alertmanager.
  • The Alertmanager then manages those alerts: it applies silencing, inhibition, and grouping, and sends out notifications through email and other notification services.
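
To make the push step concrete, here is a minimal Python sketch of what a client sends to the Alertmanager. It assumes an Alertmanager listening on localhost:9093 and uses the v2 HTTP API endpoint /api/v2/alerts; the alert name, labels, and summary are made-up values for illustration. The request is only built here, not sent.

```python
import json
from datetime import datetime, timezone
from urllib import request

# Assumed address of a local Alertmanager; adjust for your setup.
ALERTMANAGER_URL = "http://localhost:9093/api/v2/alerts"


def build_alert(name, severity, instance, summary, starts_at=None):
    """Return one alert as a dict with labels, annotations, and a start time."""
    starts_at = starts_at or datetime.now(timezone.utc)
    return {
        # Labels identify the alert and drive routing, grouping, and silencing.
        "labels": {"alertname": name, "severity": severity, "instance": instance},
        # Annotations carry human-readable details.
        "annotations": {"summary": summary},
        "startsAt": starts_at.isoformat(),
    }


def build_request(alerts, url=ALERTMANAGER_URL):
    """Prepare (but do not send) the POST request carrying a list of alerts."""
    body = json.dumps(alerts).encode("utf-8")
    return request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical example alert.
alert = build_alert("InstanceDown", "critical", "node-1:9100", "node-1 is unreachable")
req = build_request([alert])
# To actually send it to a running Alertmanager: request.urlopen(req)
```

In a normal setup you would not write this yourself, since the Prometheus server does the pushing. The sketch only shows that an alert is essentially a set of labels plus annotations and timestamps.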

Core concepts implemented by Alertmanager

  • Silences: A silence mutes alerts for a given time. Incoming alerts are checked against the matchers of all active silences, and if an alert matches, no notification is sent for it.
  • Inhibition: An inhibition rule suppresses notifications for certain alerts while certain other alerts are already firing.
  • Grouping: Alerts of a similar nature are grouped into a single notification. This is very useful when many systems fail at once and thousands of alerts may be firing simultaneously.
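
The three concepts above can be sketched in a few lines of Python. This is a simplified model, not Alertmanager's actual implementation: alerts are plain label dictionaries, matchers only support exact equality (real matchers also support regular expressions and negation), and all alert names and label values are hypothetical.

```python
from collections import defaultdict

# Hypothetical alerts, each represented only by its label set.
alerts = [
    {"alertname": "InstanceDown", "severity": "critical", "cluster": "prod", "instance": "node-1"},
    {"alertname": "InstanceDown", "severity": "critical", "cluster": "prod", "instance": "node-2"},
    {"alertname": "HighLatency", "severity": "warning", "cluster": "prod", "instance": "node-1"},
    {"alertname": "DiskFull", "severity": "warning", "cluster": "staging", "instance": "node-9"},
]


def matches(labels, matchers):
    """True if every matcher label has the same value on the alert."""
    return all(labels.get(name) == value for name, value in matchers.items())


def apply_silences(alerts, silences):
    """Silences: drop every alert matched by at least one active silence."""
    return [a for a in alerts if not any(matches(a, s) for s in silences)]


def apply_inhibition(alerts, source, target, equal):
    """Inhibition: drop target alerts while a source alert fires that has
    the same values for all labels listed in `equal`."""
    sources = [a for a in alerts if matches(a, source)]

    def inhibited(alert):
        if not matches(alert, target):
            return False
        return any(all(alert.get(l) == s.get(l) for l in equal) for s in sources)

    return [a for a in alerts if not inhibited(a)]


def group_alerts(alerts, group_by):
    """Grouping: bucket alerts by the values of the `group_by` labels."""
    groups = defaultdict(list)
    for a in alerts:
        groups[tuple(a.get(l) for l in group_by)].append(a)
    return dict(groups)


# 1. Silence everything coming from the staging cluster.
remaining = apply_silences(alerts, silences=[{"cluster": "staging"}])
# 2. A critical alert mutes warnings on the same cluster and instance.
remaining = apply_inhibition(
    remaining,
    source={"severity": "critical"},
    target={"severity": "warning"},
    equal=["cluster", "instance"],
)
# 3. One notification per (alertname, cluster) pair.
notifications = group_alerts(remaining, group_by=["alertname", "cluster"])
```

With this sample data, the DiskFull alert is silenced, the HighLatency warning on node-1 is inhibited by the critical InstanceDown alert on the same instance, and the two remaining InstanceDown alerts end up in a single group, so only one notification would go out.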

Why is Alerting necessary?

Automated alerts are essential to monitoring. They allow you to spot problems anywhere in your infrastructure so that you can rapidly identify their causes and minimize service degradation and disruption.

If metrics and other measurements facilitate observability, then alerts draw human attention to the particular systems that require observation, inspection, and intervention.

Alertmanager in DataVision

The Alertmanager dashboard in DataVision shows all active alerts, including alerts for any nodes that are down, so you can monitor your cluster and its alerts in one place.

For each alert, the dashboard shows the following fields:

  • Time: When the alert was generated
  • Alertname: Name of the alert
  • Device, namespace, instance, pod: Which resource has generated the alert
  • Severity: The severity of the alert, where 1 – Info, 2 – Warning, 3 – Critical
  • Description: The description of the alert
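
As a rough illustration of these fields, the sketch below models one dashboard row in Python and maps the numeric severity to its name. The class, field names, and sample rows are assumptions made for this example; they are not DataVision's actual schema or API.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Severity codes as listed above: 1 = Info, 2 = Warning, 3 = Critical.
SEVERITY_NAMES = {1: "Info", 2: "Warning", 3: "Critical"}


@dataclass
class AlertRow:
    """One row of the alerts table (hypothetical structure)."""
    time: datetime      # when the alert was generated
    alertname: str      # name of the alert
    instance: str       # resource that generated the alert
    severity: int       # 1, 2, or 3
    description: str    # description of the alert

    @property
    def severity_name(self) -> str:
        return SEVERITY_NAMES.get(self.severity, "Unknown")


# Made-up sample rows.
rows = [
    AlertRow(datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc),
             "HighLatency", "node-1", 2, "Request latency above threshold"),
    AlertRow(datetime(2024, 1, 1, 9, 5, tzinfo=timezone.utc),
             "InstanceDown", "node-2", 3, "Node is unreachable"),
    AlertRow(datetime(2024, 1, 1, 9, 7, tzinfo=timezone.utc),
             "ConfigReloaded", "node-3", 1, "Configuration was reloaded"),
]

# Show the most severe alerts first.
most_severe_first = sorted(rows, key=lambda r: r.severity, reverse=True)
```

Sorting by the numeric severity puts Critical alerts at the top, which is usually the order you want when triaging.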

We’ve not only incorporated Alertmanager into DataVision but have also provided a dashboard for it, which makes it easier to manage your cluster.
