Commit graph

89 commits

Author SHA1 Message Date
57ee5afa62
feat(observability): add VMServiceScrapes + migrate VLogs → VLSingle
- Migrate VLogs CRD to VLSingle (operator.victoriametrics.com/v1beta1)
- Add VMServiceScrape for Forgejo (gitea ns, port http, /metrics)
- Add VMServiceScrape for ArgoCD (argocd ns, port http-metrics)
- Add VMServiceScrape for GARM (garm ns, port metrics)
- Add VMServiceScrape for CoreDNS (kube-system ns, k8s-app: kube-dns)

Ref: IPCEICIS-4618, IPCEICIS-5066
2026-06-15 21:05:22 +02:00
7949cabb29
fix(garm): ⬆️ update to v0.1.7-forgejo-24 (fresh multi-arch build)
Build completed successfully. Fixes exec format error from -23.
Dropped stale NOTE warning — image is clean amd64.
2026-06-12 13:42:23 +02:00
8939b4f32b
fix(secrets-backup): 🔄 sync simplified manifest from template
Remove client-side openssl encryption. OBS SSE-KMS handles encryption at rest.
Updated: no apk add openssl, no openssl enc step, no secrets-backup-config Secret,
upload .tar.gz directly. Image tag bumped to 1.0.1 (built without openssl).

Ref: IPCEICIS-9317
2026-06-12 13:12:20 +02:00
900c1f6c80
fix(dev): 🐛 revert automated-upload damage — restore working image pins + OIDC secrets
Automated upload (95deeef) overwrote 5 manually-pinned values:

- forgejo-server: restore workflow-webhook-20260305 (DB has v15a/v15b
  migrations; rolling back to 14.0.2-edp1-rootless WILL break the DB)
- garm: restore v0.1.7-forgejo-22 (v0.1.7-forgejo-23 has exec format
  error — wrong arch build, crashes on OTC CCE amd64 nodes)
- sizer-receiver/secret.yaml: re-add sizer-oidc-client secret (deleted
  by upload; causes OIDC auth failure on every sizer-receiver login)
- dex/manifests/dex-sizer-client.yaml: re-add (deleted by upload;
  dex cannot resolve sizer OIDC client without this secret)
- dex.yaml: restore manifests source block (removed by upload;
  without it ArgoCD never deploys the dex/manifests/ directory)

backup-alerts.yaml (new VMRule from automated upload) is kept as-is.
2026-06-12 10:11:00 +02:00
Automated pipeline
95deeef6a0 Automated upload for dev.t09.de 2026-06-12 07:46:00 +00:00
9bbcf4efca
fix(secrets-backup): 🐛 add openssl install + upgrade image to 1.32.0
alpine/k8s:1.28.0 does not ship openssl. Script calls openssl enc
on line 116 causing exit 127 on every run since initial deploy.

Fix:
- apk add --no-cache openssl at script start (defensive, idempotent)
- upgrade image 1.28.0 -> 1.32.0 (kubectl client was 5 minor versions
  behind cluster v1.33, outside supported skew of +/-1)
2026-06-12 09:32:48 +02:00
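The version-skew arithmetic in the commit above (9bbcf4efca) can be checked with a minimal sketch. It assumes both versions share the same major version and that the supported window is one minor version in either direction, as the commit message states. The function names are hypothetical and are not part of the backup script.

```python
import re


def minor_version(version: str) -> int:
    """Extract the minor component from strings like '1.28.0' or 'v1.33'."""
    match = re.match(r"v?(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return int(match.group(2))


def minor_skew(client: str, server: str) -> int:
    """Minor versions the client is behind (positive) or ahead of (negative) the server."""
    return minor_version(server) - minor_version(client)


def within_supported_skew(client: str, server: str, allowed: int = 1) -> bool:
    """True when the client is within +/- `allowed` minor versions of the server."""
    return abs(minor_skew(client, server)) <= allowed


if __name__ == "__main__":
    # Old image tag versus the cluster version named in the commit message.
    print(minor_skew("1.28.0", "v1.33"))
    print(within_supported_skew("1.28.0", "v1.33"))
    # Upgraded image tag versus the same cluster version.
    print(within_supported_skew("1.32.0", "v1.33"))
```

With the values from the commit message, the old 1.28.0 client falls outside the window and the upgraded 1.32.0 client falls inside it.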
cf8271fd86
revert(ci-sizer): 🔥 revert image pin — no versioned images in registry
GoReleaser config uses 'dockers_v2' (invalid key, should be 'dockers')
so versioned container images were never pushed. Only :latest exists.
Reverting to :latest until CI pipeline is fixed to publish version tags.

Refs: IPCEICIS-9326
2026-06-08 18:12:56 +02:00
f4aa470894
fix(ci-sizer): 📌 pin sizer-receiver to v0.8.1 for dev
v0.8.2 does not exist — tags go v0.8.1 → v0.8.3.
v0.8.3 introduced RequireOrgMatch middleware that breaks dev env where
repos are under giteaAdmin but OIDC org resolves differently.
Pin to v0.8.1 until IPCEICIS-9326 fixes multi-env org support.
2026-06-08 18:08:04 +02:00
3fdfda9da7
fix(ci-sizer): 📌 pin sizer-receiver to v0.8.2 for dev
v0.8.3 introduced RequireOrgMatch middleware that breaks dev env where
repos are under giteaAdmin but OIDC org resolves differently.
Pin to v0.8.2 until IPCEICIS-9326 fixes multi-env org support.
2026-06-08 18:06:00 +02:00
69839f767b
fix(ci-sizer): 🐛 set RECEIVER_ALLOWED_ORG=giteaAdmin for dev
Dev Forgejo repos live under giteaAdmin user, not DevFW org.
Prod will use DevFW-CICD (template default). Dev needs explicit override.
2026-06-08 18:00:47 +02:00
925c7416b3
fix(ci-sizer): 🐛 revert RECEIVER_ALLOWED_ORG to DevFW for dev env
Template default is DevFW-CICD (prod), but dev Forgejo uses DevFW org.
Hydration overwrote the correct value today.
2026-06-08 17:51:14 +02:00
bd82384eb1
fix(dex): 🔐 correct sizer client secret to match sizer-oidc-client
The deploy hydration created dex-sizer-client with wrong value.
Reverting to the original shared secret that sizer expects
(73eda906... - active for 81 days before hydration overwrote it).

Changes:
- sizer-oidc-client: restore correct shared secret
- dex-sizer-client: add managed manifest to prevent future drift
- dex.yaml: add manifests source for ArgoCD to sync the secret

Broken by stacks rehydration pipeline run.
2026-06-08 17:11:10 +02:00
967edf0382
fix(ci-sizer): 🔐 align OIDC client secret with dex config
Secret mismatch caused infinite login loop on sizer.dev.t09.de.
Added sizer-oidc-client secret manifest to GitOps so ArgoCD manages it.
Value now matches dex-runner-sizer-client (dex side).
2026-06-08 17:00:38 +02:00
Daniel.Sy
9a7544418c fix(forgejo): 🐛 use workflow-webhook image matching DB migration level (v15a/v15b)
DB was migrated to v15 schema by this image in March.
The 14.0.2-edp1-rootless image cannot start against it.
Today's automated pipeline sync triggered pod restart, exposing the mismatch.
2026-06-08 14:11:31 +00:00
Daniel.Sy
a047be3aae fix(garm): ⬇️ rollback to v0.1.7-forgejo-22 — -23 has exec format error (wrong arch) 2026-06-08 14:11:05 +00:00
Automated pipeline
422f568c8e Automated upload for dev.t09.de 2026-06-08 12:15:27 +00:00
Martin McCaffery
14873b7941
fix(garm): bump dev+benchmark to garm-helm v0.0.17 (template-robust readToken); drop now-redundant explicit fields on benchmark 2026-06-02 16:21:51 +01:00
a7bc25603c
Added DevFW-CICD users as admins 2026-05-19 14:01:18 +02:00
75e4a2384b
fix(ci-sizer): 🐛 align GARM_URL with template output
Use short service DNS (garm.garm.svc:80) instead of FQDN
(garm.garm.svc.cluster.local:80) to match what the stack template
now generates.

Ref: IPCEICIS-6886
2026-05-18 10:26:23 +02:00
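For the GARM_URL change above (75e4a2384b), the relationship between the two host forms can be sketched as follows. The helper strips an assumed cluster domain of cluster.local from the FQDN form to produce the short form. The helper name and the default domain are assumptions for illustration and are not taken from the stack template.

```python
def short_service_host(host_port: str, cluster_domain: str = "cluster.local") -> str:
    """Reduce 'svc.ns.svc.<cluster_domain>:port' to 'svc.ns.svc:port'.

    Inputs that do not end in the cluster domain are returned unchanged.
    """
    host, sep, port = host_port.partition(":")
    suffix = "." + cluster_domain
    if host.endswith(suffix):
        host = host[: -len(suffix)]
    return host + sep + port


if __name__ == "__main__":
    # FQDN form that the deployment used before this commit.
    print(short_service_host("garm.garm.svc.cluster.local:80"))
    # Short form that the template now generates; already normalised.
    print(short_service_host("garm.garm.svc:80"))
```

Both calls yield the short form, which is the value the commit aligns GARM_URL to.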
manuel.ganter
c5191ea18a Update otc/dev.t09.de/stacks/forgejo/forgejo-server/manifests/forgejo-ingress.yaml 2026-05-05 08:29:25 +00:00
2e90240c81
refactor(stacks-instances): 🚚 migrate sizer-receiver to ci-sizer namespace (dev.t09.de)
Move sizer-receiver from stacks/garm/ to stacks/ci-sizer/ for
dev.t09.de only. edp.buildth.ing stays in garm (not deployed yet).
2026-04-29 10:36:14 +02:00
9d042eee1c
chore: ⬆️ bump garm image to v0.1.7-forgejo-22 on dev.t09.de 2026-04-28 10:11:09 +02:00
bc96d8d7aa
chore(garm): ⬆️ bump garm-forgejo to v0.1.7-forgejo-21 2026-04-24 15:47:23 +02:00
e65abf162e
chore(garm): ⬆️ bump garm-forgejo to v0.1.7-forgejo-20 2026-04-24 14:51:58 +02:00
a9dcf29f7a
chore(garm): ⬆️ bump garm-forgejo to v0.1.7-forgejo-19 2026-04-24 13:41:52 +02:00
b72e2049e3
chore: bump garm image to v0.1.7-forgejo-18 for dev.t09.de 2026-04-22 13:19:32 +02:00
4cea4ffde7
chore: bump garm to v0.1.7-forgejo-17 (activeDeadlineSeconds) 2026-04-21 17:15:41 +02:00
0b13b89640
chore(garm): ⬆️ bump garm-helm to v0.0.15 (startup probe fix) 2026-04-21 16:27:25 +02:00
4aa8973c91
chore(garm): ⬆️ bump garm-helm chart to v0.0.14 2026-04-21 16:03:06 +02:00
c682c48be0
chore: bump garm image to v0.1.7-forgejo-16 2026-04-21 15:53:50 +02:00
61721097d6
chore(sizer): 🔧 rename forgejo-runner-sizer to ci-sizer in deployment configs
- Update container image names to ci-sizer-{receiver,collector}
- Update Dex OIDC client ID and name to ci-sizer
- Template allowed-org as SIZER_ALLOWED_ORG variable
2026-04-21 14:16:39 +02:00
487e1ac15c
chore(garm): ⬆️ bump garm to v0.1.7-forgejo-15 2026-04-20 17:32:22 +02:00
2af607e949
chore(garm): ⬆️ bump garm to v0.1.7-forgejo-14, add CPU sizing mode env vars 2026-04-20 16:08:12 +02:00
f2c885cd84
fix(sizer): 🔧 sync gitops with live deployment — add OIDC config, remove legacy Forgejo tokens 2026-04-16 15:05:53 +02:00
08740eb1da
chore: bump garm image to v0.1.7-forgejo-13 (RunNumber enrichment via WebSocket) 2026-04-16 13:32:12 +02:00
47f99082db
feat(sizer-receiver): add GARM WebSocket event enrichment env vars
Add GARM_URL, GARM_USER, and GARM_PASSWORD environment variables to
the sizer-receiver deployment so it can connect to GARM's WebSocket
event stream for run-status enrichment.

Ref: IPCEICIS-8514
2026-04-15 15:46:55 +02:00
a3bae88ce9
fix(sizer-receiver): 🐛 add fsGroup to pod securityContext for PVC write access
Distroless nonroot container (UID 65534) needs matching fsGroup to write
to the PVC used for SQLite migrations.

Ref: IPCEICIS-8514
2026-04-15 14:45:27 +02:00
9374d90d1f
chore(garm): ⬆️ bump image to v0.1.7-forgejo-12 (ParseExtraSpecs fix)
Pick up double-encoding fix from garm-provider-edge-connect v2.0.30.

Ref: IPCEICIS-8514
2026-04-15 13:50:54 +02:00
e0f74e9ec4
chore(garm): ⬆️ bump image to v0.1.7-forgejo-11 with fixed provider binary
Ref: IPCEICIS-8514
2026-04-15 12:25:37 +02:00
58c694c9d1
chore(garm): 📦 bump image to v0.1.7-forgejo-10 (GitHub Actions cgroup fix)
Provider v2.0.27 fixes CIProvider-aware CGROUP_PROCESS_MAP for GitHub
Actions runner detection, completing multi-provider support.

Ref: IPCEICIS-8514
2026-04-15 10:23:57 +02:00
d1ab2f6c85
chore(garm): 📦 bump image to v0.1.7-forgejo-9 (multi-provider support)
garm-provider-edge-connect v2.0.26 adds GitHub Actions + Forgejo multi-provider support.
2026-04-14 16:58:24 +02:00
246be79659
chore(garm): bump to v0.1.7-forgejo-8 (revert buildkitd wrapper) 2026-04-14 13:01:17 +02:00
6f9a6372f1
chore(garm): bump garm image to v0.1.7-forgejo-7
- Includes provider v2.0.24 with pod cleanup fixes:
  - GetPod returns terminal pods for proper GARM lifecycle
  - ListInstances prefix mismatch fixed
  - ProviderID consistency fix
  - buildkitd SIGTERM graceful shutdown
2026-04-14 11:23:53 +02:00
d116313afe
chore(garm): bump to v0.1.7-forgejo-6 (provider nil map fix) 2026-04-13 18:02:37 +02:00
ee8b2f0e9c
chore(garm): bump helm chart to v0.0.13 for nodes RBAC 2026-04-13 16:35:44 +02:00
dedebf1747
chore(garm): update image to v0.1.7-forgejo-5 and add pending_timeout config 2026-04-13 15:23:48 +02:00
46a1c1aa33
feat(dex): add forgejo-runner-sizer OIDC static client
Register forgejo-runner-sizer as a Dex static client for OIDC
authentication on sizer.dev.t09.de. Adds the client secret env var
injection and the staticClients entry with secretEnv reference.
2026-04-10 13:22:45 +02:00
Automated pipeline
4b11db5668
Automated upload for dev.t09.de 2026-03-17 14:16:23 +01:00
54dc4302f9
bump garm due to bootstrap params 2026-03-12 17:50:20 +01:00
3f739528ae
Set deployment strategy to Recreate for sizer-receiver 2026-03-12 10:35:13 +01:00