[Supported since Kublr KCP 1.22]
By default, Kublr provides a set of custom alerts that work out of the box. They are described on the "Default AlertManager rules in Kublr" page. If you need to add your own alerts or modify the existing ones, you can do so through the KCP cluster specification.
There are several ways to add new rules or override existing ones:
- Add new rules for Alerts
- Override specific Alerts
- Disable all default Kublr rules for Alerts with some exceptions

Add new rules for alerts:

spec:
  features:
    ...
    monitoring:
      ...
      values:
        alertmanager:
          alerts:
            - alert: SAMPLE1
              annotations:
                description: Count UP
                summary: Test Alert for example
              expr: count(up) > 0
              for: 10s
              labels:
                feature: pod
                level: pod
                severity: critical

Override specific alerts:

spec:
  features:
    ...
    monitoring:
      ...
      values:
        prometheus:
          rules:
            overrides:
              - alert: SSLCertExpiredWarning # name of the alert to override
                annotations:
                  description: SSL certificate for host {{ $labels.host }} in cluster {{ $labels.kublr_cluster }} expires in less than 7 days!
                  summary: SSL Certificate expiration issue - This rule is overridden & you can change its settings as well
                expr: (nginx_ingress_controller_ssl_expire_time_seconds{} - time()) < 86400 * 7
                for: 60m
                labels:
                  feature: ingress
                  level: cluster
                  severity: warning
                  silence_group: SSLCertExpired

Disable all default Kublr rules with some exceptions:

prometheus:
  rules:
    overrides:
      # disable all default Kublr rules except one
      - _disable: true
        _match:
          alertNot: RabbitmqUnactiveExchange

The full list of possible override settings is shown below:

prometheus:
  rules:
    overrides: # can be an array or an object: [] | {}
      override1:
        _match:
          group: "group name"                       # optional, can be string or array of strings
          groupNot: "group name"                    # optional, can be string or array of strings
          groupRegexp: "group name regexp"          # optional, can be string or array of strings
          groupRegexpNot: "group name regexp"       # optional, can be string or array of strings
          alert: "alert name"                       # optional, can be string or array of strings
          alertNot: "alert name"                    # optional, can be string or array of strings
          alertRegexp: "alert name regexp"          # optional, can be string or array of strings
          alertRegexpNot: "alert name regexp"       # optional, can be string or array of strings
          rule: "rule name"                         # optional, can be string or array of strings
          ruleNot: "rule name"                      # optional, can be string or array of strings
          ruleRegexp: "rule name regexp"            # optional, can be string or array of strings
          ruleRegexpNot: "rule name regexp"         # optional, can be string or array of strings
          template: "go template to calculate match"
          templateNot: "go template to calculate match"
        _disable: false  # set to true to disable the matching rules
        _template: false # if true, treat the override values as templates (the context includes the standard values
                         # like .Values, .Release etc., and the parsed original rules: .RulesFile, .RulesGroup, .Rule)
        _rule: {...}     # override rule, can be object or string (yaml)
        ...              # in most cases the override properties can be specified directly here,
                         # but if a property name conflict arises in the future (e.g. a "_template" property
                         # needs to be used in the rule), or if the rule is better specified as a string,
                         # the override can be placed in the "_rule" property

Additional information about passing parameters to Prometheus and AlertManager can be found here.
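
As an illustration of the `_match` and `_disable` settings listed above, the fragment below is a minimal sketch of two overrides in array form. It uses only keys that appear in the list (`alertRegexp`, `group`, `_disable`). The regular expression and the group name `example-group` are hypothetical and not taken from the Kublr defaults. Whether regular expressions are anchored, and how several keys under one `_match` are combined, is not stated on this page, so each entry uses a single match key. As in the examples above, the fragment is assumed to sit under `spec.features.monitoring.values`.

```yaml
prometheus:
  rules:
    overrides:
      # Assumption: disables every default alert whose name matches the regexp.
      # The pattern is only an example; check how patterns are anchored before relying on it.
      - _disable: true
        _match:
          alertRegexp: "^Rabbitmq.*"
      # Assumption: disables all rules in one rule group.
      # "example-group" is a hypothetical group name; replace it with a real one.
      - _disable: true
        _match:
          group: "example-group"
```

After applying a change like this, it is worth checking the resulting rule set in Prometheus to confirm that only the intended rules were disabled.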