fix(observability): 🐛 disable false-positive control-plane alerts and fix empty cluster_environment label
Hub defaultRules groups kubernetesSystemControllerManager, kubeScheduler, and kubernetesSystemScheduler used wrong key 'enabled: false' — chart expects 'create: false'. This caused KubeControllerManagerDown/KubeSchedulerDown to fire as false positives because OTC CCE managed k8s does not expose control plane for scraping. Dev local vmagent had empty externalLabels, so backup-alert rules evaluated by local vmalert had no cluster_environment label on kube_job_status_failed metrics. Added cluster_environment=dev to match what the vm-client-stack vmagent adds for hub shipping.
This commit is contained in:
parent
32e998df5b
commit
0316eefa43
2 changed files with 5 additions and 4 deletions
|
|
@ -708,7 +708,8 @@ vmagent:
|
|||
port: "8429"
|
||||
selectAllByDefault: true
|
||||
scrapeInterval: 20s
|
||||
externalLabels: {}
|
||||
externalLabels:
|
||||
cluster_environment: "dev"
|
||||
# For multi-cluster setups it is useful to use "cluster" label to identify the metrics source.
|
||||
# For example:
|
||||
# cluster: cluster-name
|
||||
|
|
|
|||
Loading…
Add table
Add a link
Reference in a new issue