Newly introduced `critical-op` PDB causes tons of alerts with kube-prometheus-stack #3020

The newly introduced `postgres-<instance>-critical-op-pdb` is created requiring the maximum number of healthy instances, but with a selector that does not match any instances during normal operation. With kube-prometheus-stack in its default configuration, this raises an alert because the PDB never has enough healthy pods:

> PDB does not have enough healthy pods.
> PDB /keycloak/postgres-keycloak-pg-critical-op-pdb expects 3 more healthy pods. The desired number of healthy pods has not been met for at least 15m.

The alert rule is defined here: [`templates/prometheus/rules-1.14/kubernetes-apps.yaml#L568-L601`](https://github.com/prometheus-community/helm-charts/blob/4f332b4147dcb1890bd2ca0ea9c33ae09dd6bfdc/charts/kube-prometheus-stack/templates/prometheus/rules-1.14/kubernetes-apps.yaml#L568-L601)
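To illustrate why the message reports "3 more healthy pods", here is a minimal sketch of the comparison the alert appears to make. It assumes the rule compares the PDB's desired healthy count with its current healthy count, as the alert text suggests; `PdbStatus`, `missing_healthy_pods` and `would_alert` are hypothetical names for illustration, not part of kube-prometheus-stack or postgres-operator.

```python
from dataclasses import dataclass


@dataclass
class PdbStatus:
    """Hypothetical stand-in for the two PDB status values the alert text refers to."""
    desired_healthy: int  # pods the PDB requires to be healthy
    current_healthy: int  # healthy pods currently matched by the selector


def missing_healthy_pods(status: PdbStatus) -> int:
    """Number of healthy pods the PDB is short of (never negative)."""
    return max(status.desired_healthy - status.current_healthy, 0)


def would_alert(status: PdbStatus) -> bool:
    """Assumed alert condition: fewer healthy pods than desired."""
    return missing_healthy_pods(status) > 0


# A critical-op PDB for a 3-instance cluster whose selector matches no pods.
critical_op = PdbStatus(desired_healthy=3, current_healthy=0)

# A PDB whose selector matches the pods it is meant to protect.
satisfied = PdbStatus(desired_healthy=1, current_healthy=1)

print(missing_healthy_pods(critical_op), would_alert(critical_op))
print(missing_healthy_pods(satisfied), would_alert(satisfied))
```

Under this assumption, a PDB whose selector matches nothing is permanently short by its full desired count, so the alert fires for as long as the PDB exists.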

Although this is not critical once the initial "what's going on here?!?" has passed, it causes a lot of alerting noise, and the available mitigations are:

- disable PDBs for postgres-operator
- disable PDB monitoring
- introduce silences that mute the `postgres-<instance>-critical-op-pdb` alerts

None of these is ideal: in every case either protection or monitoring is lost, or alerts are not seen when it's important to see them.
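For the silence option, the matcher would have to target only the critical-op PDBs. The sketch below checks which PDB names such a pattern would mute. It assumes the alert carries the PDB name in a label (taken here to be `poddisruptionbudget`) and that the silence uses a fully anchored regex; the pattern and the example names other than `postgres-keycloak-pg-critical-op-pdb` are hypothetical.

```python
import re

# Hypothetical silence pattern for the assumed `poddisruptionbudget` label.
# re.fullmatch is used to mimic a fully anchored regex matcher.
CRITICAL_OP_PDB = re.compile(r"postgres-.+-critical-op-pdb")


def is_muted(pdb_name: str) -> bool:
    """Return True if a silence with this pattern would mute alerts for the PDB."""
    return CRITICAL_OP_PDB.fullmatch(pdb_name) is not None


for name in (
    "postgres-keycloak-pg-critical-op-pdb",  # name from the alert in this issue
    "postgres-keycloak-pg-pdb",              # hypothetical regular PDB name
    "some-other-app-pdb",                    # hypothetical unrelated PDB
):
    print(name, is_muted(name))
```

Such a silence also hides the alert during an actual critical operation, which is exactly when the PDB state would be worth seeing.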

A better solution would be, for example, to create those PDBs only on demand and not to let them stick around all the time. That way the alerts would be meaningful and not noisy while retaining the use case those PDBs fulfill.
