Hub defaultRules groups kubernetesSystemControllerManager, kubeScheduler, and
kubernetesSystemScheduler used wrong key 'enabled: false' — chart expects 'create: false'.
This caused KubeControllerManagerDown/KubeSchedulerDown to fire as false positives
because OTC CCE managed k8s does not expose control plane for scraping.
Dev local vmagent had empty externalLabels, so backup-alert rules evaluated by local
vmalert had no cluster_environment label on kube_job_status_failed metrics. Added
cluster_environment=dev to match what the vm-client-stack vmagent adds for hub shipping.
- Enable VMAgent (was disabled → no metrics scraped)
- Remove disable_login from Grafana config; add security block so operator can auth via API
- Disable VMAuth (invalid trailing-dot hostname o12y.observability.; not needed on dev)
Pre-upgrade cleanup hook uses bitnami/kubectl and spawns on every ArgoCD
sync. Dev cluster nodes are at 99% CPU / pod limit — hook pod cannot be
scheduled, blocking the entire sync indefinitely.
Disabling cleanup.enabled prevents the hook Job from being created.
CRD cleanup is safe to skip on a fresh bootstrap where no old CRDs exist.