Description
The newly introduced postgres-<instance>-critical-op-pdb is created with the required number of healthy pods set to the full instance count, but with a selector that matches no pods during normal operation. With kube-prometheus-stack in its default configuration, this raises an alert because the PDB never has any healthy pods:
PDB does not have enough healthy pods.
PDB /keycloak/postgres-keycloak-pg-critical-op-pdb expects 3 more healthy pods. The desired number of healthy pods has not been met for at least 15m.
The alert rule is defined here: templates/prometheus/rules-1.14/kubernetes-apps.yaml#L568-L601
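To illustrate why the alert fires permanently, here is a minimal Python sketch of the condition. It assumes the rule compares the PDB's desired healthy count with its current healthy count and fires when the difference is positive, which matches the "expects 3 more healthy pods" message above. The `PdbStatus` class and helper names are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass


@dataclass
class PdbStatus:
    """Hypothetical stand-in for the two PDB status values the alert compares."""
    name: str
    desired_healthy: int   # pods the PDB requires to be healthy
    current_healthy: int   # healthy pods currently matched by the PDB selector


def missing_healthy_pods(pdb: PdbStatus) -> int:
    """Number of healthy pods the PDB is short of (assumed alert value)."""
    return max(pdb.desired_healthy - pdb.current_healthy, 0)


def alert_fires(pdb: PdbStatus) -> bool:
    """Assumed alert condition: desired minus current is greater than zero."""
    return missing_healthy_pods(pdb) > 0


# Normal operation: the selector matches no pods, so current_healthy stays 0.
idle = PdbStatus("postgres-keycloak-pg-critical-op-pdb", desired_healthy=3, current_healthy=0)

# During a critical operation the selector would match the 3 instances.
active = PdbStatus("postgres-keycloak-pg-critical-op-pdb", desired_healthy=3, current_healthy=3)

for pdb in (idle, active):
    print(pdb.name, "fires:", alert_fires(pdb), "missing:", missing_healthy_pods(pdb))
```

Under these assumptions the PDB only stops alerting while a critical operation is in progress, which is the opposite of what is useful.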
Once the initial "what's going on here?!?" has passed, it is clear that this is not critical, but it causes a lot of alerting noise. The available mitigations are to
- disable PDBs for postgres-operator,
- disable PDB monitoring, or
- introduce silences that mute the postgres-<instance>-critical-op-pdb alerts.
None of these is ideal: in each case either something is missing or alerts are not seen when it is important to see them.
A better solution would be, for example, to create those PDBs only on demand and not to let them stick around all the time. That way the alerts would be meaningful and not noisy while retaining the use case those PDBs fulfill.