Runbook — JWT_SECRET rotation
Plan step: A2.5.
Owner: backend lead.
Stated cadence: 90 days, or immediately on suspected leak
(docs/security/PROTOCOL.md §1).
Blast radius: every issued JWT becomes invalid the moment a new
secret takes effect. Every authenticated user must re-login. There
is no per-user staggered rollout — JWT_SECRET is a single
process-level signing key.
This runbook is the comms-coordinated playbook that the trust portal
roadmap references in cryptography.md and session-management.md.
It does not trigger a rotation; it tells the operator what a
rotation looks like when one is decided.
When to rotate
Scheduled rotations (90 days)
The default cadence. A forced re-login every 90 days is a small cost; stretching the interval further lengthens the window in which a compromised secret stays useful if the leak is discovered late.
Schedule the next rotation before finishing the current
one: set a calendar reminder for last_rotation_date + 90 days so
the cadence does not silently slip.
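The reminder date can be computed rather than counted by hand. A minimal sketch, assuming GNU `date` (the `-d` flag behaves differently on BSD/macOS); the function name `next_rotation_date` is hypothetical and not part of any existing tooling:

```shell
#!/usr/bin/env bash
# Print the date 90 days after the given last rotation date (YYYY-MM-DD).
# Assumes GNU date; -u keeps the arithmetic free of DST shifts.
next_rotation_date() {
  local last_rotation_date="$1"
  date -u -d "${last_rotation_date} +90 days" +%F
}
```

For example, `next_rotation_date 2025-01-01` prints `2025-04-01`.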
Emergency rotations (immediate)
Rotate without waiting for the cadence if any of the following holds:
- A JWT_SECRET value (current or any prior version) was committed to a repo, written to a log, posted in chat, sent in an email, or in any way exited the Secret Manager + on-VM-process boundary.
- A VM-side compromise is suspected (regardless of whether exfiltration is confirmed).
- A backend developer with current access to Secret Manager leaves the company.
- The secret was visible in a screen-share, demo recording, or third-party support session, even momentarily.
Emergency rotations do not require pre-announcement; the user impact (forced re-login) is the same and is preferable to leaving the suspected-compromised secret in place.
Pre-flight
Run all of the following before issuing the new secret version:
- Confirm the trigger. For scheduled rotation: at least 90 days have passed since last_rotation_date. For emergency: identify the leak vector, even if uncertain.
- Identify on-call. A second engineer must be reachable for the next 30 minutes, because rotation hits every authenticated user simultaneously.
- Pick the window. Scheduled rotations land Sunday 06:00–08:00 Europe/Brussels by convention — minimum traffic, support team on standby, US/EU/IL operators all reach the same morning. Emergency rotations override the window.
- Write the user-facing notice. A pre-baked notice template is below. For scheduled rotation, send it 48 hours and again 1 hour before the window. Emergency rotation: send during or immediately after.
- Verify the local stack is healthy and on the current secret version: gcloud secrets versions list claim-guard-jwt-secret --project=train-cvit2 shows the current latest.
- Confirm Secret Manager access. The operator running the rotation must have roles/secretmanager.secretVersionAdder or higher on claim-guard-jwt-secret. The VM service account (claim-guard-vm@…) keeps its existing secretAccessor role; no IAM change is needed at rotation time.
- Generate the new secret value outside shell history. Length floor: 32 bytes of entropy. The current secret is 48 bytes, base64-encoded (64 chars); preserve that shape so log parsers and length-tagged tooling don't notice the change.
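One way to produce a value of that shape without typing it on a command line is sketched below. It is an illustration under assumptions, not the team's prescribed command: it relies on `/dev/urandom` and the coreutils `base64` tool, and the function names `generate_secret` and `secret_shape_ok` are hypothetical.

```shell
#!/usr/bin/env bash
# Generate 48 random bytes and base64-encode them (64 characters).
# The value is captured by command substitution, so it is never typed
# at the prompt and does not land in shell history.
generate_secret() {
  head -c 48 /dev/urandom | base64 | tr -d '\n'
}

# Return 0 only if the candidate is exactly 64 base64 characters,
# matching the shape of the current secret described above.
secret_shape_ok() {
  local candidate="$1"
  [[ "$candidate" =~ ^[A-Za-z0-9+/]{64}$ ]]
}

NEW_SECRET="$(generate_secret)"
secret_shape_ok "$NEW_SECRET" || { echo "unexpected secret shape" >&2; exit 1; }
```

`NEW_SECRET` then feeds the `printf '%s' "$NEW_SECRET"` pipeline in the procedure; running `unset NEW_SECRET` once the version has been added keeps the value from lingering in the session.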
Procedure
All commands run from a workstation with gcloud auth login against
roee@dtectvision.ai and gcloud config set project train-cvit2.
# 1. Add a new secret version. Does NOT yet take effect anywhere
# because the server fetches `versions/latest` only at boot.
printf '%s' "$NEW_SECRET" | gcloud secrets versions add \
claim-guard-jwt-secret \
--data-file=- \
--project=train-cvit2
# 2. Verify the new version is at the head.
gcloud secrets versions list claim-guard-jwt-secret \
--project=train-cvit2 \
--limit=3
# Expected: the just-added version is `ENABLED` and at the top.
# 3. Restart pm2 on the production VM so the new value is read on
# boot via server/src/lib/secrets.js. Do this OVER IAP-SSH only.
gcloud compute ssh claim-guard-app-1 \
--tunnel-through-iap \
--zone=europe-west1-b \
--command='pm2 restart all && pm2 logs --lines 50 --nostream'
# Expected pm2 output: clean boot, no JWT_SECRET error, no
# Secret Manager error, no DATABASE_URL error.
# 4. Smoke-test from outside the cloud project.
curl -fsS https://app.dtectvision.ai/api/health
# Expected: HTTP 200, body { status: "ok", ... }.
# (If A1.3 has not yet landed, hit the current public IP instead.)
After step 3, every previously issued JWT is invalid. Users
will see a 401 carrying either code: TOKEN_EXPIRED or Invalid token,
depending on how the auth middleware maps the verification error.
(jsonwebtoken checks the signature before exp, so a token signed
with the old secret fails as an invalid signature even when its exp
has not been reached.) They must log in again to receive a
JWT signed with the new secret.
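To see why one restart invalidates every outstanding token, it helps to recompute a signature under two different secrets. The sketch below assumes the tokens are signed with HS256 (HMAC-SHA256), which is common for a single shared secret but is not stated in this runbook; the secrets and the signing input are placeholders, and `hs256_signature` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Compute the base64url HMAC-SHA256 signature of a JWT signing input
# ("<header>.<payload>") under a given secret. HS256 is an assumption.
# Demo secrets only: the secret is passed on openssl's command line.
hs256_signature() {
  local secret="$1" signing_input="$2"
  printf '%s' "$signing_input" \
    | openssl dgst -sha256 -hmac "$secret" -binary \
    | base64 | tr '+/' '-_' | tr -d '=\n'
}

# Placeholder signing input and secrets, for illustration only.
signing_input='eyJhbGciOiJIUzI1NiJ9.eyJzdWIiOiJkZW1vIn0'
old_sig="$(hs256_signature 'old-demo-secret' "$signing_input")"
new_sig="$(hs256_signature 'new-demo-secret' "$signing_input")"

if [ "$old_sig" != "$new_sig" ]; then
  echo "signature mismatch: token signed with the old secret is rejected"
fi
```

The header and payload are identical in both cases; only the key differs, which is enough to change the signature and make verification fail for every token issued before the restart.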
Verification
| Check | Command | Expected |
|---|---|---|
| New secret version is latest | gcloud secrets versions list claim-guard-jwt-secret --project=train-cvit2 --limit=1 | the version added in step 1 is ENABLED |
| Server is using the new secret | pm2 logs --lines 200 on the VM (over IAP-SSH) | no "JWT_SECRET environment variable is required" error; no Secret Manager errors |
| Existing tokens are rejected | from a separate session: curl -H "Authorization: Bearer <jwt-issued-before-rotation>" https://app.dtectvision.ai/api/auth/me | HTTP 401 |
| New login succeeds | curl -X POST -d '{"email":"…","password":"…"}' -H "Content-Type: application/json" https://app.dtectvision.ai/api/auth/login | HTTP 200, response includes a token |
| New token works | curl -H "Authorization: Bearer <new-jwt>" https://app.dtectvision.ai/api/auth/me | HTTP 200, returns the user record |
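The token checks in the table can be scripted so the operator reads PASS/FAIL lines instead of raw status codes. This is a sketch under assumptions: `expect_status`, `http_status`, and `verify_rotation` are hypothetical helper names, and `OLD_JWT` / `NEW_JWT` are variables the operator would have to export from a pre-rotation and a post-rotation login.

```shell
#!/usr/bin/env bash
# Compare an observed HTTP status with the expected one and report it.
# Returns non-zero on mismatch so a wrapper script can stop early.
expect_status() {
  local label="$1" expected="$2" actual="$3"
  if [ "$actual" = "$expected" ]; then
    echo "PASS  ${label} (HTTP ${actual})"
  else
    echo "FAIL  ${label}: expected HTTP ${expected}, got HTTP ${actual}"
    return 1
  fi
}

# Fetch only the status code of a request; extra arguments go to curl.
http_status() {
  curl -s -o /dev/null -w '%{http_code}' "$@"
}

# Run the two token checks from the table. OLD_JWT and NEW_JWT are
# hypothetical variables the operator exports beforehand.
verify_rotation() {
  local url='https://app.dtectvision.ai/api/auth/me'
  expect_status 'old token rejected' 401 \
    "$(http_status -H "Authorization: Bearer ${OLD_JWT}" "$url")" || return 1
  expect_status 'new token accepted' 200 \
    "$(http_status -H "Authorization: Bearer ${NEW_JWT}" "$url")"
}
```

Nothing touches the network until `verify_rotation` is called, so the helpers can be sourced ahead of the window and invoked right after the smoke test.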
Recovery
If pm2 fails to boot after the rotation (e.g. Secret Manager throws, or the new secret was generated with a non-printable character that broke the env-var pipeline), the working escape hatch is:
# 1. Disable the broken version. Run this from the operator
#    workstation: the VM service account holds only secretAccessor,
#    which does not allow disabling versions.
gcloud secrets versions disable <new-version-id> \
--secret=claim-guard-jwt-secret \
--project=train-cvit2
# 2. On the VM, over IAP-SSH: restart so the previous version is now `latest`.
pm2 restart all
pm2 logs --lines 100 --nostream
The previous version remains ENABLED automatically, so disabling
the new one re-elevates it. Do not count on existing sessions
surviving: unless the server tracks sessions separately, a token
signed with the original secret should verify again once that secret
is back in effect, but clients that dropped their token after a
failed request, and anyone who logged in under the new secret, must
log in one more time once the recovery is done. There is no version
of "rollback that preserves sessions"; a JWT signing-key rotation
cannot be transparent.
If the issue is "new secret is fine, the rotation still went bad
somewhere," do not delete versions. disable is reversible;
destroy is not.
Post-rotation
- Update the calendar reminder. Next scheduled rotation = today + 90 days.
- Add a journal entry to the top of docs/security/HARDENING-LOG.md under today's date. Include: trigger (scheduled vs. emergency + reason), version IDs (old → new), pm2 restart timestamp, the smoke-test response, and the user-impact window. Use the heading ### A2.5 — JWT_SECRET rotation [done <date>] so future runbook iterations can grep for it.
- After 7 days, destroy old versions that are no longer the most recent two:
  gcloud secrets versions destroy <old-version-id> \
  --secret=claim-guard-jwt-secret --project=train-cvit2
  Keep latest and latest - 1 enabled as the recovery floor. Older versions are not consulted by the server (it always reads versions/latest) and only widen the leak surface if Secret Manager itself were ever compromised.
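Picking the destroy candidates by eye is easy to get wrong once several versions exist. The sketch below selects everything outside the newest two from a list of version identifiers; `versions_to_destroy` is a hypothetical helper, and it deliberately does not enforce the 7-day wait, which stays a manual check.

```shell
#!/usr/bin/env bash
# Read secret version identifiers on stdin (one per line, either bare
# numbers or full resource names ending in /versions/<n>) and print the
# ones that fall outside the newest two, i.e. the destroy candidates.
versions_to_destroy() {
  sed 's#.*/##' | sort -rn | tail -n +3
}
```

For example, `printf '3\n5\n2\n4\n' | versions_to_destroy` prints `3` and `2`, leaving `5` and `4` as the recovery floor. Feeding it from `gcloud secrets versions list claim-guard-jwt-secret --project=train-cvit2 --filter='state!=DESTROYED' --format='value(name)'` assumes the generic `--filter` and `--format` list flags behave for this command as they do elsewhere in gcloud; review the output before passing any ID to `destroy`.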
User-facing notice templates
Scheduled (T-48h and T-1h)
Scheduled maintenance: Sunday \<DATE> 06:00–08:00 Brussels time (CET/CEST)
ClaimGuard will perform scheduled security maintenance during the window above. You will be logged out as part of the maintenance and asked to log in again. No action is required in advance; bookmarked links and saved data are unaffected. If you have trouble logging in after \<DATE 09:00 Brussels time>, please contact support at security@dtectvision.ai.
Emergency (during / immediately after)
You have been logged out as part of an unscheduled security action. Your data is unaffected. Please log in again. If you have any concerns or questions, contact security@dtectvision.ai.
Do not include the words "rotation," "compromise," or "JWT" in the user-facing copy. They invite questions you can answer in a follow-up if asked, but should not lead the message.
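The wording rule above is simple to check mechanically before a notice goes out. A minimal sketch, assuming the draft is available as plain text on stdin; `lint_notice` is a hypothetical helper name:

```shell
#!/usr/bin/env bash
# Return non-zero if the notice text on stdin contains any of the words
# the runbook asks to keep out of user-facing copy (case-insensitive).
lint_notice() {
  if grep -Eiq 'rotation|compromise|jwt'; then
    echo "notice contains a discouraged word" >&2
    return 1
  fi
}
```

Piping a draft through it, for example `lint_notice < notice.txt` with a hypothetical draft file, gives a non-zero exit status when one of the three words slipped in. The match is on substrings, so "compromised" is flagged as well.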
Cross-references
- docs/security/PROTOCOL.md §1 Secrets: 90-day cadence rule.
- Cryptography: public-facing claim about rotation.
- Session management: public-facing claim about the rotation's user impact.
- server/src/lib/secrets.js: Secret Manager bootstrap. Reads versions/latest at boot.
- server/src/middleware/auth.js: JWT_SECRET is mandatory at boot; signature verification happens here.
- docs/security/HARDENING-LOG.md: the journal in which each completed rotation is recorded.