
About Detectors and Alerts

The Infrastructure Monitoring data model includes metadata that helps users select and use data. Infrastructure Monitoring uses SignalFlow programs to generate charts and detector alerts. When users design charts or detectors, Infrastructure Monitoring automatically creates the necessary SignalFlow programs. Customers can also modify these programs.

Detectors

In Infrastructure Monitoring, detectors evaluate metric time series (MTS) against a specific condition over a period of time. MTS can contain raw data or the output of an analytics function.

Alerts

When data in an input MTS matches a condition, the detector generates a trigger event and an alert that has a specific severity level. You can configure an alert to send a notification using third-party systems such as PagerDuty or Slack. You can also configure an alert to display notifications in the Infrastructure Monitoring user interface.
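The detector behavior described above can be sketched as a SignalFlow program. This is a minimal illustration, not a program from this document; the metric name and alert label are assumptions.

```
# Query MTS for a metric (cpu.utilization is illustrative).
A = data('cpu.utilization')

# detect() generates a trigger event when the condition is met;
# publish() makes the resulting alert available under the given label.
detect(when(A > 80)).publish('CPU utilization above 80')
```

The severity level and any third-party notifications (such as PagerDuty or Slack) are configured on the detector's rules rather than in the SignalFlow program itself.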

Using metadata in detectors

The metadata associated with MTS can be used to make detector definitions simpler, more compact, and more resilient to change.

For example, if you have a group of 30 virtual machines that provide a clustered service like Kafka, you would typically include the dimension service:kafka on all of the metrics coming from those virtual machines.

In this case, if you want to track whether cpu.utilization remains below 80 for each of those virtual machines, you can create a single detector that queries for cpu.utilization metrics that include the service:kafka dimension and evaluates them against a threshold of 80. This detector triggers an individual alert for each virtual machine whose cpu.utilization exceeds the threshold, as if you had 30 separate detectors, but you only need to create one.
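A SignalFlow sketch of this single cluster-wide detector might look like the following. The lasting duration is an assumption added for illustration; the metric, dimension, and threshold come from the example above.

```
# Query cpu.utilization only from MTS carrying the service:kafka dimension.
# One detector then alerts individually per matching virtual machine.
A = data('cpu.utilization', filter=filter('service', 'kafka'))

# Trigger when a VM stays above 80 for a sustained period (duration assumed).
detect(when(A > 80, lasting='5m')).publish('Kafka VM CPU above 80')
```

Because the query is dimension-based rather than host-based, newly added VMs that carry service:kafka are picked up automatically.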

In addition, if the population changes (for example, because the cluster has grown to 40 virtual machines), you do not need to make any changes to your detector. Provided the newly added virtual machines include the service:kafka dimension, the existing detector's query finds them and includes them in the threshold evaluation.

Dynamic threshold conditions

Setting static values for detector conditions can lead to noisy alerting, because a value that is appropriate for one service or time of day may not be suitable for another. For example, if your applications or services run on elastic infrastructure, such as Docker containers or EC2 Auto Scaling, the appropriate alert values might vary by time of day.

You can define dynamic thresholds to account for changes in streaming data. For example, if your metric exhibits cyclical behavior, you can define a threshold that is a one-week timeshifted version of the same metric. If the relevant basis of comparison for your data is the behavior of a population, such as a clustered service, you can define your threshold as a value that reflects that behavior: for example, the 90th percentile of the metric across the entire cluster over a moving 15-minute window.
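Both kinds of dynamic threshold can be sketched in SignalFlow. Treat this as an illustration under assumed metric names; check the SignalFlow reference for exact function signatures.

```
A = data('cpu.utilization', filter=filter('service', 'kafka'))

# Cyclical data: compare the metric against itself, shifted back one week.
B = A.timeshift('1w')
detect(when(A > B * 1.2)).publish('Above last week')

# Population-based: threshold at the cluster-wide 90th percentile,
# smoothed over a moving 15-minute window.
T = A.percentile(pct=90).mean(over='15m')
detect(when(A > T)).publish('Above cluster p90')
```

In both cases the threshold stream is computed from the data itself, so it adapts as the cluster or the daily cycle changes, instead of relying on a hand-picked static value.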

For more information

For information about creating and using detectors, see Detectors and Alerts.