5 commits

Author SHA1 Message Date
7a6f96a8b4
feat(observability): add cluster heartbeat dead-man switch alerts
ClusterMetricsSilent: fires if no kubelet metrics for >10m (catches vmagent outages).
ClusterAPIServerDown: fires if apiserver scrape fails for >5m.
Replaces silenced KubeControllerManagerDown/KubeSchedulerDown which never fire on managed K8s.
2026-06-22 11:05:48 +02:00
3141b7bd6c
feat(observability): comprehensive platform alert rules
Replace ad-hoc forgejo/disk alerts with structured VMRule covering:
- platform-health: ForgejoDown, IngressHighErrorRate, NodeNotReady, PodCrashLooping
- storage: PVCUsageHigh (>80%), PVCUsageCritical (>90%)
- resources: NodeCPUHigh (>85%), NodeMemoryHigh (>90%)
2026-06-19 16:43:28 +02:00
Automated pipeline 464a9eb22e Automated upload for observability.buildth.ing 2026-03-04 09:55:46 +00:00
Automated pipeline 2820a37e00 Automated upload for observability.buildth.ing 2025-08-12 12:40:19 +00:00
Automated pipeline 625f2e0005 Initial upload 2025-07-21 12:52:28 +00:00
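
The conditions named in commits 7a6f96a8b4 and 3141b7bd6c can be read as simple threshold checks. The sketch below illustrates only that logic; it is not the VMRule expressions from the repository, which this log does not show. The helper names `metrics_silent` and `pvc_alert` are hypothetical, and returning only the most severe PVC alert is an assumption made for brevity (the real rules may fire both at once).

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Thresholds quoted from the commit messages above.
METRICS_SILENT_AFTER = timedelta(minutes=10)  # ClusterMetricsSilent: >10m
PVC_HIGH = 0.80                               # PVCUsageHigh: >80%
PVC_CRITICAL = 0.90                           # PVCUsageCritical: >90%


def metrics_silent(last_sample: Optional[datetime], now: datetime) -> bool:
    """Dead-man switch: fire when no sample arrived for more than 10 minutes.

    Hypothetical helper; a missing sample (None) is treated as silent.
    """
    if last_sample is None:
        return True
    return now - last_sample > METRICS_SILENT_AFTER


def pvc_alert(used_fraction: float) -> Optional[str]:
    """Return the alert name for a PVC usage fraction (0.0 to 1.0), or None.

    Assumption: only the most severe matching alert is reported.
    """
    if used_fraction > PVC_CRITICAL:
        return "PVCUsageCritical"
    if used_fraction > PVC_HIGH:
        return "PVCUsageHigh"
    return None


if __name__ == "__main__":
    now = datetime(2026, 6, 22, 12, 0, tzinfo=timezone.utc)
    print(metrics_silent(now - timedelta(minutes=11), now))
    print(pvc_alert(0.85))
```

A dead-man switch inverts the usual alert direction: it fires on the absence of data, which is why it can catch a scraper outage that threshold alerts on the same metrics would miss.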