Commit graph

370 commits

Author SHA1 Message Date
81b721bb5a
fix(secrets-backup): 🔥 remove client-side openssl encryption
OBS bucket has server-side KMS encryption. Client-side openssl was
redundant and caused failures (Alpine CDN unreachable at 03:30 UTC).

Changes:
- Dockerfile: remove openssl apk install (no longer needed)
- CronJob: remove openssl enc step, upload .tar.gz directly
- CronJob: remove secrets-backup-config Secret (encryption passphrase)
- CronJob: remove ENCRYPTION_PASSPHRASE env var
- Bump image tag to 1.0.1, update workflow and manifest reference

Flow: kubectl export → tar.gz → upload to OBS (SSE-KMS handles rest)

Ref: IPCEICIS-9317
2026-06-12 13:02:11 +02:00
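The flow in this commit (export, then tar.gz, then upload) could be sketched roughly as below. This is only an illustration in Python, not the CronJob's actual script: `build_archive` and `list_archive` are hypothetical helpers, the manifest text is made up, and the upload step is omitted because the log does not show which client performs it.

```python
import io
import tarfile


def build_archive(manifests: dict[str, str]) -> bytes:
    """Pack exported manifests (file name -> YAML text) into a .tar.gz blob."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        for name, text in sorted(manifests.items()):
            data = text.encode("utf-8")
            info = tarfile.TarInfo(name=name)
            info.size = len(data)  # size must be set before adding the member
            archive.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


def list_archive(blob: bytes) -> list[str]:
    """Return the member names of a .tar.gz blob (used to verify the archive)."""
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r:gz") as archive:
        return archive.getnames()


# Hypothetical input standing in for the kubectl export output.
blob = build_archive({"argocd.yaml": "kind: Secret\n"})
# The blob would be uploaded as-is; encryption is left to the bucket (SSE-KMS).
```

With no client-side encryption step, the archive is the final artifact, so the image needs no openssl package.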
6b29aa3916
fix(garm): ⬆️ bump image tag to v0.1.7-forgejo-24
v0.1.7-forgejo-23 had exec format error on amd64 nodes.
-24 is a rebuild from the same commit to produce correct multi-arch manifests.
2026-06-12 11:06:29 +02:00
5f4032bea6
fix(garm): 📌 pin to v0.1.7-forgejo-22 — -23 has wrong arch
v0.1.7-forgejo-23 produces exec format error on amd64 nodes.
Temporary pin until -24 is built correctly.
2026-06-12 10:23:51 +02:00
1f6e91b6ac
fix(secrets-backup): 🐛 add openssl install + upgrade image to 1.32.0
alpine/k8s:1.28.0 does not ship openssl. Script calls openssl enc
on line 116 causing exit 127 on every run.

Fix:
- apk add --no-cache openssl at script start (defensive, idempotent)
- upgrade image 1.28.0 -> 1.32.0 (kubectl client 5 minor versions behind
  cluster v1.33, outside supported skew of +/-1)
2026-06-12 09:33:20 +02:00
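The skew argument in this commit can be checked with a few lines of arithmetic. A minimal sketch in Python, assuming both versions share the same major number; `minor_version` and `within_supported_skew` are hypothetical helper names, not part of the backup script.

```python
import re


def minor_version(version: str) -> int:
    """Extract the minor number from strings such as '1.28.0' or 'v1.33'."""
    match = re.match(r"v?(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version: {version!r}")
    return int(match.group(2))


def within_supported_skew(client: str, server: str, allowed: int = 1) -> bool:
    """True when the client is at most `allowed` minor versions from the server.

    Assumes client and server have the same major version.
    """
    return abs(minor_version(client) - minor_version(server)) <= allowed


old_ok = within_supported_skew("1.28.0", "v1.33")  # image before this commit
new_ok = within_supported_skew("1.32.0", "v1.33")  # image after this commit
```

Under these assumptions the old image is five minor versions behind the cluster, while the new one is within the allowed difference of one.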
053acd7596
feat(observability): 📊 add backup failure alerting rules
VMRule alerts for forgejo-s3-backup and secrets-backup CronJobs:
- BackupCronJobNotScheduled (>26h since last run)
- BackupCronJobNeverScheduled (never ran)
- BackupJobFailed (job failed)
- BackupJobTooSlow (running >5min)

Ref: IPCEICIS-9313
Ref: IPCEICIS-2810
2026-06-08 15:07:14 +02:00
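The four alert conditions listed in this commit amount to the decision logic sketched below. This is a Python illustration of the thresholds only; the real rules are VMRule expressions that this log does not show, and `firing_alerts` with its parameters is a hypothetical name.

```python
from typing import Optional

NOT_SCHEDULED_AFTER_HOURS = 26  # threshold quoted in the commit message
TOO_SLOW_AFTER_MINUTES = 5      # threshold quoted in the commit message


def firing_alerts(
    hours_since_last_schedule: Optional[float],
    job_failed: bool,
    running_minutes: float,
) -> list[str]:
    """Return the alert names that would fire for one CronJob's state.

    `hours_since_last_schedule` is None when the CronJob has never run.
    """
    alerts = []
    if hours_since_last_schedule is None:
        alerts.append("BackupCronJobNeverScheduled")
    elif hours_since_last_schedule > NOT_SCHEDULED_AFTER_HOURS:
        alerts.append("BackupCronJobNotScheduled")
    if job_failed:
        alerts.append("BackupJobFailed")
    if running_minutes > TOO_SLOW_AFTER_MINUTES:
        alerts.append("BackupJobTooSlow")
    return alerts
```

A 26-hour window leaves a two-hour margin over a daily schedule, so a run that starts slightly late does not trigger the alert.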
b087dac0f1
fix(core): 🐛 remove template vars from secrets-backup — use K8s secrets directly
The deploy workflow does not have BACKUP_ENCRYPTION_KEY/BACKUP_BUCKET/OBS_ENDPOINT
env vars. Redesigned to reference existing forgejo-cloud-credentials K8s secret
and hardcode OBS endpoint, matching the pattern of forgejo-s3-backup-cronjob.

Ref: IPCEICIS-9317
2026-06-08 14:02:04 +02:00
863bcd4883
feat(core): add secrets-backup CronJob as ArgoCD Application
Backs up critical K8s secrets (argocd, cert-manager, external-secrets)
to OBS. Uses template variables for environment-specific values.

Ref: IPCEICIS-9317
2026-06-08 13:12:18 +02:00
02308cf633
chore: bump garm image to v0.1.7-forgejo-23 (OOM detection) 2026-05-19 16:14:31 +02:00
Martin McCaffery
aaf9e6eade
bump garm-helm to v0.0.16 (RBAC fix)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-05-19 09:51:38 +02:00
Martin McCaffery
d857f155a8
feat(dex): add ci-sizer OIDC static client to template 2026-05-18 17:18:44 +02:00
Martin McCaffery
707a7b933a
feat(registry): add ci-sizer registry template 2026-05-18 16:27:56 +02:00
a8ce4c5c38
fix(sizer): 🐛 use internal K8s service URL for GARM connection
Switch GARM conditional from explicit GARM_URL env var to DOMAIN_GITEA
presence check. When Forgejo is deployed, GARM is always available at
its cluster-internal service (http://garm.garm.svc:80). Hardcode admin
user since GARM always uses that. GitLab-only deploys skip the block.

Ref: IPCEICIS-6886
2026-05-18 10:22:26 +02:00
32665ff620
fix(ci-sizer): 🐛 use safe map access for optional GARM_URL env var
`index .Env "GARM_URL"` returns empty string for missing keys instead
of panicking with "map has no entry for key".

Ref: IPCEICIS-6886
2026-05-18 10:15:58 +02:00
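The difference between the failing and the fixed lookup can be illustrated with a Python analogy. This is not the template code itself: `strict_lookup`, `safe_lookup`, and `render_garm_block` are hypothetical names, and the rendered line format is invented for the example.

```python
def strict_lookup(env: dict[str, str], key: str) -> str:
    """Analogue of the failing access: a missing key raises an error."""
    return env[key]  # KeyError when the key is absent


def safe_lookup(env: dict[str, str], key: str) -> str:
    """Analogue of the fixed access: a missing key yields an empty string."""
    return env.get(key, "")


def render_garm_block(env: dict[str, str]) -> str:
    """Emit the optional GARM setting only when GARM_URL is present."""
    url = safe_lookup(env, "GARM_URL")
    return f"GARM_URL={url}\n" if url else ""
```

Because the safe lookup returns an empty string, the surrounding conditional can simply skip the block on clusters where the variable is not set.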
2a12a568ce
docs(stacks): 📝 add clarifying comments to stack templates
Document which components are required vs opt-in for deployment modes.

Ref: IPCEICIS-6886
2026-05-15 16:35:02 +02:00
d161b8ea4d
docs(ci-sizer): 📝 add opt-in comment to gitlab webhook app
Clarifies that the GitLab webhook ArgoCD app is optional and should
only be hydrated for clusters running GitLab Runner.

Ref: IPCEICIS-6886
2026-05-15 16:33:52 +02:00
fe51e8588c
feat(ci-sizer): add gitlab-webhook ArgoCD app to stacks template
Adds the mutating webhook deployment as a managed ArgoCD application
alongside the existing sizer-receiver. Includes deployment, service,
RBAC, cert-manager certificates, and webhook configuration.

Ref: IPCEICIS-6886
2026-05-15 16:30:42 +02:00
adf7f23685
fix(sizer): 🐛 make GARM env vars conditional in receiver deployment
Clusters without GARM lack the garm-fixed-credentials secret, causing
pod crash loops. The receiver already handles empty GARM_URL gracefully.

Ref: IPCEICIS-6886
2026-05-15 16:30:42 +02:00
Daniel.Sy
1f4489bd70
fix(ci-sizer): use getenv with default for SIZER_ALLOWED_ORG
Prevents gomplate crash when SIZER_ALLOWED_ORG is not set in environment.
Falls back to DevFW-CICD as default org.
2026-05-13 10:18:43 +00:00
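The fallback described in this commit behaves roughly like the Python sketch below. It is an analogy, not the gomplate template: `getenv_with_default` is a hypothetical helper, and treating an empty value the same as an unset one is an assumption.

```python
import os


def getenv_with_default(env: dict[str, str], name: str, default: str) -> str:
    """Return env[name], or `default` when the variable is unset or empty."""
    value = env.get(name, "")
    return value if value else default


# Reads the real process environment; falls back to the org named in the commit.
allowed_org = getenv_with_default(dict(os.environ), "SIZER_ALLOWED_ORG", "DevFW-CICD")
```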
5eaf4a761a
fix: increase S3 backup disk size 2026-05-07 17:48:17 +02:00
manuel.ganter
4f04de2543
Update template/stacks/forgejo/forgejo-server/manifests/forgejo-ingress.yaml
Requested so that larger images can be pushed.
https://teams.microsoft.com/l/message/19:8cbad0f19e894c9296838715ef5ce72a@thread.v2/1777969188676?context=%7B%22contextType%22%3A%22chat%22%7D
2026-05-05 08:27:38 +00:00
52cb25a6f9
refactor(stacks): 🚚 migrate sizer-receiver from garm to ci-sizer namespace
Move sizer-receiver ArgoCD app and manifests from stacks/garm/ to
stacks/ci-sizer/. The sizer is provider-agnostic and no longer
belongs in the GARM-specific stack.

- destination namespace: garm → ci-sizer
- ArgoCD source path: stacks/garm/ → stacks/ci-sizer/
- ingress namespace: garm → ci-sizer
- GARM_URL unchanged (garm.garm.svc.cluster.local) — GARM server stays in its namespace
- Secrets (sizer-tokens, sizer-oidc-client, garm-fixed-credentials) must exist in ci-sizer namespace
2026-04-29 10:16:45 +02:00
54dfd0831d
chore: ⬆️ bump garm image to v0.1.7-forgejo-22 2026-04-28 10:11:09 +02:00
44fc9ace56
chore(garm): ⬆️ bump garm-forgejo to v0.1.7-forgejo-21
Fix orphaned runner pods — instance name mismatch resolved.
2026-04-24 15:47:15 +02:00
2185d7962a
chore(garm): ⬆️ bump garm-forgejo to v0.1.7-forgejo-20
Remove activeDeadlineSeconds — was killing legitimate long CI jobs.
2026-04-24 14:51:43 +02:00
864494ffad
chore(garm): ⬆️ bump garm-forgejo to v0.1.7-forgejo-19
Includes provider v2.0.41 with ci-sizer v0.0.71:
- Fix workflow/job detail view showing all historical runs
- CI workflow fixes (Forgejo action URLs, SBOM skip)
- REUSE compliance
2026-04-24 13:41:50 +02:00
cbf8ab891d
chore: bump garm image to v0.1.7-forgejo-18 2026-04-22 13:19:10 +02:00
cfacb67789
chore: bump garm to v0.1.7-forgejo-17 (activeDeadlineSeconds) 2026-04-21 17:15:26 +02:00
ec1c1bec74
chore(garm): ⬆️ bump garm-helm to v0.0.15 (startup probe fix) 2026-04-21 16:27:25 +02:00
db76e7a517
chore(garm): ⬆️ bump garm-helm chart to v0.0.14 2026-04-21 16:03:06 +02:00
3c5c9ecbbc
chore: bump garm image to v0.1.7-forgejo-16 2026-04-21 15:53:40 +02:00
0dbd286615
chore(sizer): 🔧 rename forgejo-runner-sizer to ci-sizer in deployment configs
- Update container image names to ci-sizer-{receiver,collector}
- Update Dex OIDC client ID and name to ci-sizer
- Template allowed-org as SIZER_ALLOWED_ORG variable
2026-04-21 14:16:38 +02:00
1b3bb0061e
feat(ci-sizer): 🚀 add ci-sizer subdomain to sizer-receiver
Ref: IPCEICIS-8516
2026-04-21 14:06:52 +02:00
336796995d
chore(garm): ⬆️ bump garm to v0.1.7-forgejo-15 2026-04-20 17:31:44 +02:00
50c62b2ce0
chore(garm): ⬆️ bump garm to v0.1.7-forgejo-14, add sizing policy env vars 2026-04-20 16:08:27 +02:00
a2c635ae6e
fix(garm): 🔧 sync sizer-receiver template with production config and bump garm tag to v0.1.7-forgejo-13 2026-04-16 15:11:54 +02:00
Martin McCaffery
7bf72a39d8
Enforce MFA for all admin users 2026-03-17 14:06:06 +01:00
Martin McCaffery
7eed0cd5f8
Rename optimiser to sizer 2026-03-10 10:08:11 +01:00
fb7c64ab2f
refactor: Rename optimiser-receiver to sizer-receiver and update related configurations 2026-03-06 14:03:03 +01:00
martin.mccaffery
1de5edd974
Pin GARM image version 2026-03-04 16:43:30 +00:00
Martin McCaffery
426b8cd5b2
Update garm-helm version to v0.0.7 2026-03-04 17:04:53 +01:00
d522461bc1
chore(config): ⬆️ Bump Forgejo Helm chart to v16.2.0
Updates the Helm chart version to incorporate the latest features,
improvements, and bug fixes from upstream. Ensures deployment uses a
more recent and supported release.
2026-03-03 17:29:34 +01:00
aa8ab8c63f
chore(core): ⬆️ Bump Argo CD version to 9.4.6
Updates the Argo CD deployment to use a newer version, improving compatibility and potentially resolving issues tied to older releases.

Relates to ongoing maintenance and upstream bug tracking.
2026-03-03 11:52:40 +01:00
Martin McCaffery
b36613ae87
Point Garm to new fixed-credentials secret 2026-02-26 14:17:28 +01:00
Martin McCaffery
9bff9bd628
Add new images for static forgejo runners 2026-02-17 13:28:02 +01:00
Martin McCaffery
cd49aadaa5
Fix argocd: stop cloudnative-pg creating too-long annotation 2026-02-17 09:48:31 +01:00
Martin McCaffery
ceca7d4e82
Add ingress for optimiser-receiver 2026-02-13 09:30:12 +01:00
Martin McCaffery
f7bab3b2c6
Add optimiser deployment to garm stack 2026-02-12 16:33:21 +01:00
Martin McCaffery
19b1c120e2
Add empty stacks file for cloudnative-pg 2026-01-30 11:47:22 +01:00
Martin McCaffery
fa93ba9163
Add more provider config to GARM helm values 2026-01-30 11:24:55 +01:00
Martin McCaffery
7eb0cdff9d
Re-enable dex 2026-01-29 15:57:06 +01:00