Ed25519 signing-key rotation
Emergency-action tokens (playbook runs, isolation commands, agent self-update wrappers) are Ed25519-signed by the hub. The agent verifies them against a pinned public key. This page describes how to rotate that key with zero downtime.
The problem
A naive rotation generates a new key and has all agents verify against it immediately, so in-flight tokens signed with the old key suddenly become invalid. On compromise you WANT that (instant revocation), but a planned rotation needs a grace period.
How it works
The hub keeps a set of hub_signing_keys per tenant. Each key has an is_active flag and, optionally, an expires_at timestamp. On rotation:
- Generate a new Ed25519 keypair
- Public part: INSERT into hub_signing_keys with is_active=true, expires_at=NULL
- Existing active keys get expires_at = NOW() + grace_days × INTERVAL '1 day'
- The private key is shown ONCE in the response, then never again
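The rotation steps above can be sketched against an in-memory key set. This is a minimal illustration, not the hub's implementation: SigningKey and rotate are hypothetical names, keypair generation is left out (the new public key is passed in), and the rule that an existing expiry is only ever shortened, never extended, is an assumption of this sketch.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List, Optional


@dataclass
class SigningKey:
    """Hypothetical in-memory stand-in for a hub_signing_keys row."""
    public_hex: str
    is_active: bool = True
    expires_at: Optional[datetime] = None  # None means "no expiry"


def rotate(keys: List[SigningKey], new_public_hex: str,
           grace_days: int, now: datetime) -> SigningKey:
    """Apply the rotation steps to `keys` and return the new key."""
    grace_end = now + timedelta(days=grace_days)
    for key in keys:
        # Existing active keys get expires_at = now + grace_days.
        # Assumption: an expiry that is already earlier is left alone.
        if key.is_active and (key.expires_at is None or key.expires_at > grace_end):
            key.expires_at = grace_end
    # The new public key is stored as active with no expiry.
    new_key = SigningKey(public_hex=new_public_hex)
    keys.append(new_key)
    return new_key
```

Calling rotate with grace_days=0 sets the expiry of the old keys to now, which matches the compromise case described further down.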
The agent periodically fetches GET /api/v1/agents/:id/signing-keys/active (only non-expired actives). During grace the agent therefore holds BOTH keys in its trust set; tokens signed with either the old or the new key validate.
After expires_at of the old key, it falls out of the trust set automatically.
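On the agent side, the trust-set behaviour might look roughly like the sketch below. The names trust_set and token_is_valid are hypothetical, the key list is modelled as (public_hex, expires_at) pairs, and the Ed25519 check itself is injected as a verify callable because the agent's actual crypto library is not specified here.

```python
from datetime import datetime, timezone
from typing import Callable, Iterable, List, Optional, Tuple

# (public_hex, expires_at) pairs as the agent might cache them; None = no expiry.
KeyEntry = Tuple[str, Optional[datetime]]


def trust_set(keys: Iterable[KeyEntry], now: datetime) -> List[str]:
    """Keep only keys that have no expiry or have not yet expired."""
    return [pub for pub, expires_at in keys
            if expires_at is None or expires_at > now]


def token_is_valid(message: bytes, signature: bytes, trusted: Iterable[str],
                   verify: Callable[[str, bytes, bytes], bool]) -> bool:
    """Accept the token if any trusted key verifies the signature.

    `verify(public_hex, message, signature)` stands in for a real Ed25519
    verification call from whatever crypto library the agent uses.
    """
    return any(verify(pub, message, signature) for pub in trusted)
```

During grace both keys pass the filter, so a token signed with either one validates. Once the old key's expires_at has passed, the filter drops it without any extra revocation step.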
How you do it in the UI
/settings → Signing keys tab:
- Give a reason (“annual rotation”, “suspected compromise”, …)
- Pick grace days (default 7, max 90)
- Click Rotate now
- Copy the shown private_hex immediately — not stored, shown once
- Paste it into the hub deployment config (env var MONSYS_EMERGENCY_PRIVATE_KEY_HEX or secrets manager)
- Restart the hub. From that moment the hub signs with the new key; old tokens stay valid for grace_days.
The table below the rotate button shows all keys: ACTIVE (no expiry), EXPIRES <date> (in grace), or RETIRED.
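The three labels in that table could be derived from is_active and expires_at along these lines. How the UI actually decides that a key is RETIRED is not documented here, so the rule below (inactive, or past its expiry) and the date format are assumptions, and key_status is a hypothetical helper.

```python
from datetime import datetime, timezone
from typing import Optional


def key_status(is_active: bool, expires_at: Optional[datetime],
               now: datetime) -> str:
    """Map a key row to the label shown in the keys table."""
    # Assumption: a key is retired once it is inactive or past its expiry.
    if not is_active or (expires_at is not None and expires_at <= now):
        return "RETIRED"
    if expires_at is None:
        return "ACTIVE"  # no expiry set
    # Still inside its grace period.
    return f"EXPIRES {expires_at.date().isoformat()}"
```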
Compromise scenario
On suspected compromise:
- Rotate with grace_days=0 (shortens grace to ~now)
- Paste the new private key into the deploy
- Restart the hub
- All old tokens become invalid immediately
Side effect: in-flight playbook runs that the agent has not yet received can fail. Use this ONLY for a REAL compromise, not for planned maintenance.
API
GET /api/v1/signing-keys (admin only)
POST /api/v1/signing-keys/rotate (admin, rate-limit 5/h)
GET /api/v1/agents/:id/signing-keys/active (agent-auth)
Body POST /rotate:
{
  "reason": "annual rotation 2026",
  "grace_days": 7
}
Response (ONE TIME):
{
  "id": "uuid",
  "public_hex": "abc…64",
  "private_hex": "def…128",
  "expires_grace_days": 7,
  "expires_at": "2026-05-17T...Z",
  "warning": "Save the private key now — it is shown only this once."
}
Audit
Every rotation logs to audit_log with resource_type='signing_key', resource_id=<new_key_id>, IP, user, reason. Reviewable via /audit?resource_type=signing_key.
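For scripted rotations, the POST /api/v1/signing-keys/rotate call from the API section could be prepared as in the sketch below. It only builds the request. The hub URL, the Bearer-token header, and the assumption that the API enforces the same 0 to 90 day range as the UI are all illustrative and not confirmed here; build_rotate_request is a hypothetical helper.

```python
import json
import urllib.request

MAX_GRACE_DAYS = 90  # upper bound taken from the UI description (assumed for the API)


def build_rotate_request(hub_url: str, admin_token: str, reason: str,
                         grace_days: int = 7) -> urllib.request.Request:
    """Build (but do not send) the POST /rotate request."""
    if not 0 <= grace_days <= MAX_GRACE_DAYS:
        raise ValueError("grace_days must be between 0 and 90")
    body = json.dumps({"reason": reason, "grace_days": grace_days}).encode("utf-8")
    return urllib.request.Request(
        url=hub_url.rstrip("/") + "/api/v1/signing-keys/rotate",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Bearer auth is an assumption; the source only says "admin".
            "Authorization": f"Bearer {admin_token}",
        },
    )
```

Sending it with urllib.request.urlopen(req) would return the one-time response, so the caller has to store private_hex straight away. Because the endpoint is rate-limited to 5/h, blind retries are best avoided.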