[Supported since Kublr KCP 1.22]


By default, Kublr provides a set of custom alerts that work out of the box. They are described on the "Default AlertManager rules in Kublr" page. If you need to add your own rules or modify the existing ones, you can do so through the KCP cluster specification.

There are a few different ways to add new rules or override the existing ones:

  • Add new rules for Alerts
  • Override specific Alerts
  • Disable all default Kublr rules for Alerts with some exceptions


Add new rules for alerts:

spec:
  features:
  ...
    monitoring:
    ...
      values:
        alertmanager:
          alerts:
            - alert: SAMPLE1
              annotations:
                description: Count UP
                summary: Test Alert for example
              expr: count(up) > 0
              for: 10s
              labels:
                feature: pod
                level: pod
                severity: critical
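
Because alerts is a YAML list, several custom alerts can presumably be declared side by side. The following is a minimal sketch under that assumption; the alert names TargetDown and TooManyTargetsDown, their thresholds, and their durations are hypothetical, and the sibling keys elided with "..." in the example above are left out here:

```yaml
spec:
  features:
    monitoring:
      values:
        alertmanager:
          alerts:
            # Hypothetical alert: fires when any scrape target is unreachable.
            - alert: TargetDown
              annotations:
                description: A scrape target has been unreachable for more than 5 minutes
                summary: Scrape target is down
              expr: up == 0
              for: 5m
              labels:
                feature: pod
                level: pod
                severity: warning
            # Hypothetical alert: fires when more than 5 targets are down at once.
            - alert: TooManyTargetsDown
              annotations:
                description: More than 5 scrape targets are unreachable
                summary: Many scrape targets are down
              expr: count(up == 0) > 5
              for: 10m
              labels:
                feature: pod
                level: cluster
                severity: critical
```

The for field sets how long the expression must stay true before the alert fires, so a longer duration reduces noise from short scrape failures.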


Override specific alerts:

spec:
  features:
  ...
    monitoring:
    ...
      values:
        prometheus:
          rules:
            overrides:
              - alert: SSLCertExpiredWarning # name of the alert to override
                annotations:
                  description: SSL certificate for host {{ $labels.host }} in cluster {{ $labels.kublr_cluster }} expires in less than 7 days!
                  summary: SSL Certificate expiration issue - This rule is overridden, and you can change its settings as well
                expr: (nginx_ingress_controller_ssl_expire_time_seconds{} - time()) < 86400 * 7
                for: 60m
                labels:
                  feature: ingress
                  level: cluster
                  severity: warning
                  silence_group: SSLCertExpired


Disable all default Kublr rules with some exceptions (as above, under spec.features.monitoring.values):

prometheus:
  rules:
    overrides:

      # disable all default Kublr rules except one
      - _disable: true
        _match:
          alertNot: RabbitmqUnactiveExchange
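
The opposite case, disabling only a few named alerts and keeping every other default rule, can be sketched with the alert match key, which is documented as accepting a string or an array of strings. This is an illustration rather than a tested configuration; the two alert names are taken from the examples on this page, and it is an assumption that an array matches any of the listed names:

```yaml
prometheus:
  rules:
    overrides:

      # disable only the listed alerts; all other default Kublr rules stay enabled
      - _disable: true
        _match:
          alert:
            - RabbitmqUnactiveExchange
            - SSLCertExpiredWarning
```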


The full list of possible override settings is shown below: 

prometheus:
  rules:
    overrides: # Can be array or object - [] | {}
      override1:
        _match:
          group: "group name" # optional, can be string or array of strings
          groupNot: "group name" # optional, can be string or array of strings
          groupRegexp: "group name regexp" # optional, can be string or array of strings
          groupRegexpNot: "group name regexp" # optional, can be string or array of strings
          alert: "alert name" # optional, can be string or array of strings
          alertNot: "alert name" # optional, can be string or array of strings
          alertRegexp: "alert name regexp" # optional, can be string or array of strings
          alertRegexpNot: "alert name regexp" # optional, can be string or array of strings
          rule: "rule name" # optional, can be string or array of strings
          ruleNot: "rule name" # optional, can be string or array of strings
          ruleRegexp: "rule name regexp" # optional, can be string or array of strings
          ruleRegexpNot: "rule name regexp" # optional, can be string or array of strings
          template: "Go template to calculate match"
          templateNot: "Go template to calculate match"
        _disable: false  # use to disable matching rules
        _template: false # if true, treat the override values as templates (the context includes the standard values
                         # like .Values, .Release etc., and the parsed original rules: .RulesFile, .RulesGroup, .Rule)
        _rule: {...} # override rule, can be object or string (yaml)
        ... # in most cases the override properties can be specified directly here,
            # but if for some reason in the future there is a property name conflict (e.g. "_template" property
            # needs to be used in the rule), or if the rule is better specified as string, the override can be placed in the "_rule" property
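
As an illustration of the object form of overrides and of the regexp match keys, the sketch below disables rules by name pattern and by group. The override keys (disableRabbitmqAlerts, disableExampleGroups), the group names, and the regular expression are hypothetical, and the exact regexp dialect is an assumption (Go-style syntax is used here):

```yaml
prometheus:
  rules:
    overrides:
      # Hypothetical override key; in the object form the key only labels the override.
      disableRabbitmqAlerts:
        _disable: true
        _match:
          alertRegexp: "^Rabbitmq.*" # assumed Go-style regexp on the alert name
      # Hypothetical override key and group names, using the array form of "group".
      disableExampleGroups:
        _disable: true
        _match:
          group:
            - example-group-a
            - example-group-b
```

Each override here uses a single match key, because how several keys inside one _match combine is not stated on this page.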

Additional information about adding parameters to Prometheus and AlertManager can be found here.