fix(observability): 🐛 disable false-positive control-plane alerts and fix empty cluster_environment label
The hub defaultRules groups kubernetesSystemControllerManager, kubeScheduler, and kubernetesSystemScheduler used the wrong key, 'enabled: false'; the chart expects 'create: false'. As a result the groups were never disabled, and KubeControllerManagerDown/KubeSchedulerDown fired as false positives, because OTC CCE managed Kubernetes does not expose the control plane for scraping.

The dev local vmagent had empty externalLabels, so the backup-alert rules evaluated by the local vmalert saw no cluster_environment label on kube_job_status_failed metrics. Add cluster_environment=dev to match what the vm-client-stack vmagent adds when shipping to the hub.
parent 32e998df5b
commit 0316eefa43
2 changed files with 5 additions and 4 deletions
@@ -708,7 +708,8 @@ vmagent:
 port: "8429"
 selectAllByDefault: true
 scrapeInterval: 20s
-externalLabels: {}
+externalLabels:
+  cluster_environment: "dev"
 # For multi-cluster setups it is useful to use "cluster" label to identify the metrics source.
 # For example:
 # cluster: cluster-name
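To illustrate why the empty externalLabels mattered, here is a minimal sketch of the kind of backup alert rule the commit message describes. Only the metric kube_job_status_failed and the label cluster_environment come from the commit message; the rule name, group name, threshold, duration, and severity are hypothetical and not taken from this repository. Assuming the real rule selects on or reports cluster_environment, series written by a vmagent without external labels would not carry that label, and the rule would either match nothing or render an empty value.

```yaml
# Hypothetical VMRule for illustration only; names and expression are assumptions.
apiVersion: operator.victoriametrics.com/v1beta1
kind: VMRule
metadata:
  name: backup-alerts-example        # hypothetical name
spec:
  groups:
    - name: backup.example           # hypothetical group name
      rules:
        - alert: BackupJobFailed     # hypothetical alert name
          # Matches only series that carry the label added through externalLabels.
          expr: kube_job_status_failed{cluster_environment="dev"} > 0
          for: 5m
          labels:
            severity: warning
          annotations:
            summary: "Job {{ $labels.job_name }} failed in {{ $labels.cluster_environment }}"
```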
@@ -201,13 +201,13 @@ defaultRules:
   enabled: true
   rules: {}
 kubernetesSystemControllerManager:
-  enabled: false
+  create: false
   rules: {}
 kubeScheduler:
-  enabled: false
+  create: false
   rules: {}
 kubernetesSystemScheduler:
-  enabled: false
+  create: false
   rules: {}
 kubeStateMetrics:
   enabled: true
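For orientation, the hunk above corresponds roughly to the values layout sketched below. This assumes the hub uses the victoria-metrics-k8s-stack chart, where rule groups sit under defaultRules.groups and are toggled with create; the nesting under groups is inferred from the commit message and is not visible in the diff itself.

```yaml
# Sketch of the resulting hub values; the defaultRules.groups nesting is an assumption.
defaultRules:
  groups:
    # Control-plane groups are disabled because the managed control plane
    # cannot be scraped, so the "Down" alerts would always fire.
    kubernetesSystemControllerManager:
      create: false
      rules: {}
    kubeScheduler:
      create: false
      rules: {}
    kubernetesSystemScheduler:
      create: false
      rules: {}
```

Helm generally ignores values keys that no template reads unless the chart ships a strict values schema, which would explain why 'enabled: false' had no effect and raised no error.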