Throttling and false-positive protection using min/max/avg

If you use throttling on state items, you may encounter false positive alerts. Throttling makes Zabbix discard identical consecutive values, so the history no longer contains one value per check, and the min, max or avg functions cannot be used to evaluate several consecutive values.

Description of the issue

For state items such as Windows service statuses, you would most likely want to filter out repeated identical values (so-called throttling). You may have noticed, however, that throttling cannot be combined with the default trigger, which requires 3 consecutive values: once the repeated values are discarded, the trigger never fires. There is a way to work around this. It lets you save a significant amount of resources in large-scale systems by not storing irrelevant data. The instructions are based on the standard template for Windows services and serve as a general example.

For example, if you have 1000 Windows servers, each running 100 services that are checked every minute, then without this enhancement about 1667 values are collected every second on average (1000 x 100 / 60). With this small configuration change, that drops to an estimated 0-100 values per minute, depending on how often service states change. These numbers are only an example and may vary in your environment.
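The per-second figure in the example can be checked with a short sketch. The function name valuesPerSecond is hypothetical and used only for illustration; the inputs are the numbers from the example, which work out to about 1667 values per second.

```javascript
// Hypothetical helper: average number of values collected per second
// when every check result is stored (no throttling).
function valuesPerSecond(hosts, itemsPerHost, intervalSeconds) {
  return (hosts * itemsPerHost) / intervalSeconds;
}

// 1000 servers, 100 services each, one check per 60 seconds.
const rate = valuesPerSecond(1000, 100, 60);
console.log(Math.round(rate)); // 1667
```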

Simple solution

You can use preprocessing to solve this issue. The idea is to modify the error value (by adding a timestamp) so that it is different on every check. The error value therefore always passes through throttling (Discard unchanged), while the value indicating the correct state repeats and is discarded. Configure the preprocessing steps as follows:

  • The first step adds a timestamp to the error value
  • The second step discards unchanged values
  • The third step extracts the original error value from the timestamped value

Adding a timestamp is easy:

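// Zabbix passes the value to JavaScript preprocessing as a string, so convert it first
var state = Number(value);
if (state === 0) {
  // OK state: return it unchanged so that repeats are discarded by throttling
  return state;
} else {
  // Error state: add the seconds elapsed since a fixed offset to make the value unique
  return (Math.floor(Date.now() / 1000) - 1707000000) * 1000 + state;
}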

After throttling, you get back the original value using the expression:

return value % 1000;

The above example works when the OK state has the value 0 and the error states are in the range 1-999.
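The three steps can be tried outside Zabbix with a small simulation. This is only a sketch under stated assumptions: makeThrottle is a simplified stand-in for the Discard unchanged step (it drops a value equal to the previous one), the check time is passed in explicitly instead of being read from Date.now(), and all function names are hypothetical.

```javascript
// Offset used in the timestamp step above; it only keeps the numbers small.
const EPOCH_OFFSET = 1707000000;

// Step 1: leave the OK state (0) as it is, tag error states with the check time.
function addTimestamp(value, nowSeconds) {
  const state = Number(value);
  return state === 0 ? 0 : (nowSeconds - EPOCH_OFFSET) * 1000 + state;
}

// Step 2: simplified stand-in for "Discard unchanged" (an assumption, not
// Zabbix code): a value identical to the previous one is dropped.
function makeThrottle() {
  let last;
  return function (value) {
    if (value === last) return undefined;
    last = value;
    return value;
  };
}

// Step 3: recover the original state from the tagged value.
function extractState(value) {
  return value % 1000;
}

// Runs a sequence of check results through all three steps and returns
// the values that would end up in history.
function simulate(states, startSeconds, intervalSeconds) {
  const throttle = makeThrottle();
  const stored = [];
  states.forEach(function (state, i) {
    const tagged = addTimestamp(state, startSeconds + i * intervalSeconds);
    const kept = throttle(tagged);
    if (kept !== undefined) stored.push(extractState(kept));
  });
  return stored;
}

// Three OK checks, three checks in state 6, then two OK checks, one minute apart.
const stored = simulate([0, 0, 0, 6, 6, 6, 0, 0], 1707000060, 60);
console.log(stored); // [ 0, 6, 6, 6, 0 ]
```

Of the eight checks, only five values are stored: the first OK value, every error value, and the first OK value after recovery.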

With the above preprocessing you can still use a trigger definition with the min/max/avg functions and it will work as expected. This is because error values are now stored at every check interval:

min(//service.info["{#SERVICE.NAME}",state],#3)<>0
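To see why the stored history matters for this expression, the following sketch mimics the check on a plain array of stored values. The function triggerFires is hypothetical and only approximates min(...,#3)<>0; how Zabbix evaluates a history with fewer than three values is not modeled here.

```javascript
// Returns true when the smallest of the last `count` stored values is non-zero,
// which is a rough approximation of the expression min(...,#3)<>0.
function triggerFires(history, count) {
  const recent = history.slice(-count);
  return Math.min(...recent) !== 0;
}

// With the timestamp trick, every failed check is stored.
console.log(triggerFires([0, 6, 6, 6], 3)); // true: three failed checks in a row
console.log(triggerFires([0, 6, 0], 3));    // false: a single failed check
console.log(triggerFires([6, 6, 6, 0], 3)); // false: the service has recovered

// With plain throttling, only state changes are stored, so the last three
// values always contain an OK value, even while the service is down.
console.log(triggerFires([6, 0, 6], 3));    // false
```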