MSP — cross-tenant triage and delivery

1. Monday morning cross-tenant triage

In the dashboard

Log in as MSP admin → Sidebar → MSP cockpit
All tenants sorted by urgency descending
Click top urgency tenant → opens in tenant context
Filter ‘urgency > 50’ to isolate your morning’s work

/msp/cockpit is one page listing every tenant you manage, sorted by the urgency composite:

urgency = (open_critical × 10)
        + (open_high     × 3)
        + (sla_breach_minutes / 60)
        + (kev_open_cves)
        + (trust_score_delta_24h × -2)   -- a drop is urgent
        + (overdue_eats × 5)

Workflow:

Top 3 tenants by urgency → click through → fix
Per tenant you see a mini Trust Score (current + 24h delta) + top open alert + top open kernel CVE
One click on “Open in tenant context” uses RBAC impersonation (see §8), so no per-tenant re-login is needed

Filter: urgency > 50 typically shows 2-5 tenants. The rest sit below 50 and can wait until later in the week.
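The composite and the > 50 filter can be reproduced outside the dashboard. The following is a minimal sketch, not the hub's implementation: the field names mirror the formula above, while the `TenantStats` class, the `morning_queue` helper, and the sample numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TenantStats:
    """Per-tenant inputs to the urgency composite (names mirror the formula)."""
    name: str
    open_critical: int = 0
    open_high: int = 0
    sla_breach_minutes: float = 0.0
    kev_open_cves: int = 0
    trust_score_delta_24h: float = 0.0
    overdue_eats: int = 0


def urgency(t: TenantStats) -> float:
    """Evaluate the composite term by term, as written in the formula."""
    return (
        t.open_critical * 10
        + t.open_high * 3
        + t.sla_breach_minutes / 60
        + t.kev_open_cves
        + t.trust_score_delta_24h * -2  # a negative delta (a drop) raises urgency
        + t.overdue_eats * 5
    )


def morning_queue(tenants, threshold=50):
    """Return (name, score) pairs above the threshold, most urgent first."""
    scored = [(t.name, urgency(t)) for t in tenants]
    above = [pair for pair in scored if pair[1] > threshold]
    return sorted(above, key=lambda pair: pair[1], reverse=True)


# Hypothetical sample data for two tenants.
tenants = [
    TenantStats("acme", open_critical=4, open_high=2, sla_breach_minutes=120,
                kev_open_cves=1, trust_score_delta_24h=-3, overdue_eats=1),
    TenantStats("globex", open_high=5, kev_open_cves=2),
]
print(morning_queue(tenants))
```

With these sample numbers acme scores 60 and globex 17, so only acme lands in the morning queue.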

2. Tenant handover report — what we did this month

In the dashboard

Sidebar → Audit Packs → pick the client’s month
Click ‘Download for handover’ → filter actor LIKE '%@yourMSP.com'
Resulting PDF + .sig are signed by the hub
Email to client — they verify offline with monsys-verify-eat CLI

The client asks every month: “what did your team do for me?”

The Monthly Audit Pack already covers this:

2026-04.jsonl.gz — all EATs executed on behalf of the client
2026-04.pdf — aggregated: Trust Score evolution, kernel updates executed, CVEs fixed, alerts handled, sessions opened

MSP-specific: filter the PDF to only actions by your own team (u.email LIKE '%@yourMSP.com'):

Or via API (advanced — for automation)

curl 'https://app.monsys.ai/api/v1/audit-packs/<pack_id>/download?format=pdf&actor_filter=@yourMSP.com' \
  -H "Authorization: Bearer $TOKEN" -o handover-acme-2026-04.pdf

Send this PDF + .sig directly to the client. They can verify for themselves that it came from the monsys runtime, not from your editor:

./monsys-verify-eat-linux-x64 verify-pack \
  --pack handover-acme-2026-04.pdf  \
  --sig  handover-acme-2026-04.sig  \
  --pubkey  https://transparency.monsys.ai/pubkeys/hub.pub
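For a quick local look at which actions your own team performed, the .jsonl.gz file can be filtered the same way. This is only a sketch under assumptions: the record layout of the pack is not documented here, so the `actor` field name and the sample records are hypothetical. A locally filtered copy is for inspection only; the signed handover PDF still has to come from the hub.

```python
import gzip
import json
import os
import tempfile


def filter_by_actor(pack_path, domain):
    """Return records whose actor email ends with @domain.

    Assumption: every JSONL line is an object with an "actor" email field.
    The comparison is case-insensitive, unlike a plain SQL LIKE.
    """
    suffix = "@" + domain.lower()
    kept = []
    with gzip.open(pack_path, "rt", encoding="utf-8") as fh:
        for line in fh:
            if not line.strip():
                continue  # tolerate blank lines
            record = json.loads(line)
            if str(record.get("actor", "")).lower().endswith(suffix):
                kept.append(record)
    return kept


# Demo with a throwaway pack; real packs come from the hub.
sample = [
    {"actor": "alice@yourMSP.com", "action": "kernel_update"},
    {"actor": "it@acme.com", "action": "session_opened"},
    {"actor": "bob@yourmsp.com", "action": "cve_fix"},
]
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "2026-04.jsonl.gz")
    with gzip.open(path, "wt", encoding="utf-8") as fh:
        for rec in sample:
            fh.write(json.dumps(rec) + "\n")
    ours = filter_by_actor(path, "yourMSP.com")

print([r["action"] for r in ours])
```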

3. White-label branding per tenant

In the dashboard

Sidebar → Settings → Tenant branding (per client)
Upload logo + color + product name + custom_domain
Client opens custom_domain → sees own brand, your MSP as ‘powered by’
Management stays in your RBAC scope; the client is read-only

Since schema 31, per-tenant branding is stored in the hub. The client portal shows the client’s logo, colour, and domain instead of your monsys.ai brand.

Or via API (advanced — for automation)

curl -X PUT https://app.monsys.ai/api/v1/tenants/<id>/branding \
  -H "Authorization: Bearer $MSP_ADMIN_TOKEN" \
  -F 'logo=@acme-logo.svg' \
  -F 'primary_color=#1e3a8a' \
  -F 'product_name=AcmeOps' \
  -F 'custom_domain=ops.acme.com'

The client opens ops.acme.com and sees the AcmeOps brand, with your MSP credited in the footer as “powered by”. The client cannot perform admin actions (those stay in your RBAC scope), but has read-only visibility into what happens and can download their own audit evidence.

4. Auto-group agents per tenant via tag

In the dashboard

Sidebar → Groups → ‘New group’ (in tenant context)
Rule: all_of [tag=production, tag=eu-west-1]
Add runbook markdown in ‘Runbook’ field
GroupMembershipWorker (5min tick) updates membership automatically

A client has 40 hosts spread across dev/staging/prod. Maintaining static groups by hand is a maintenance burden; a dynamic group defined by a tag rule avoids it:

Or via API (advanced — for automation)

curl -X POST https://app.monsys.ai/api/v1/groups \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "tenant_id": "<acme_uuid>",
    "name":      "production-eu",
    "rule":      {
      "all_of": [
        {"tag": "production"},
        {"tag": "eu-west-1"}
      ]
    },
    "runbook_md": "# Production EU runbook\n\n…"
  }'

GroupMembershipWorker (every 5 min) hashes the set and updates membership. A new host that registers with the production and eu-west-1 tags automatically lands in this group and inherits the runbook, SLA, and on-call rotation.
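To make the all_of rule concrete, here is a small sketch of how one worker pass could evaluate it. It is an illustration, not the GroupMembershipWorker source: the function names are hypothetical, and using a SHA-256 digest of the sorted member list to detect changes is an assumption about what “hashes the set” means.

```python
import hashlib


def matches(rule, host_tags):
    """True when the host carries every tag listed under "all_of"."""
    required = {clause["tag"] for clause in rule.get("all_of", [])}
    # Design choice in this sketch: an empty rule matches nothing,
    # so a malformed rule cannot enrol every host.
    return bool(required) and required <= set(host_tags)


def membership_digest(host_ids):
    """Order-independent SHA-256 over the member set."""
    canonical = "\n".join(sorted(host_ids))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def tick(rule, hosts, previous_digest):
    """One worker pass: returns (members, digest, changed)."""
    members = sorted(h for h, tags in hosts.items() if matches(rule, tags))
    digest = membership_digest(members)
    return members, digest, digest != previous_digest


rule = {"all_of": [{"tag": "production"}, {"tag": "eu-west-1"}]}
hosts = {
    "web-01": ["production", "eu-west-1"],
    "web-02": ["staging", "eu-west-1"],
}
members, digest, changed = tick(rule, hosts, previous_digest=None)

# A new host registers with both tags; the next pass picks it up.
hosts["web-03"] = ["production", "eu-west-1", "canary"]
members2, digest2, changed2 = tick(rule, hosts, previous_digest=digest)
print(members, members2, changed2)
```

Comparing digests lets a pass skip the membership write when nothing changed, which is one plausible reason to hash the set at all.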

5. Pre-issued EATs for off-hours emergency

In the dashboard

Sidebar → Playbooks → pick ‘Isolate network’
‘Pre-issue for agent’ button → pick host + valid window
Condition (heartbeat lost / critical alert) + TOTP
Agent receives EAT via WS, activates itself when condition matches

Problem: a client has a 2am incident. Your on-call engineer is awake but must first authenticate to the hub, pass TOTP, and issue an Ed25519 EAT. That costs 5 extra minutes when seconds matter.

Solution (migration 091): pre-issued playbook EATs, issued to a specific agent with a short TTL and a condition that only an on-call context can trigger.

Or via API (advanced — for automation)

curl -X POST https://app.monsys.ai/api/v1/agents/<id>/pre-issued-eats \
  -H "Authorization: Bearer $TOKEN" \
  -H "X-TOTP-Code: 123456" \
  -H "Content-Type: application/json" \
  -d '{
    "playbook_id":   "<isolate-network-playbook_id>",
    "valid_from":    "2026-05-19T18:00:00Z",
    "valid_until":   "2026-05-20T08:00:00Z",
    "conditions": {
      "heartbeat_lost_minutes": 5,
      "or_severity_critical":   true
    },
    "reason": "After-hours coverage for ACME, Tuesday night"
  }'

During the window, if the agent detects that it has been unable to send a heartbeat for more than 5 minutes OR a critical alert is open, it can execute the pre-issued EAT itself. This works exactly once: the single-use nonce is consumed. Audit evidence is identical to a normal EAT.

Use cases:

Network isolation when a ransomware pattern is detected
Restart of a specific app without operator input
Quarantine of a suspicious file
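The activation rule (inside the validity window, heartbeat lost for more than 5 minutes OR a critical alert open, usable once) can be sketched as follows. This is an illustration under assumptions, not agent code: the `PreIssuedEat` class and `try_activate` function are hypothetical, and the `consumed` flag stands in for the single-use nonce.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class PreIssuedEat:
    valid_from: datetime
    valid_until: datetime
    heartbeat_lost_minutes: int
    or_severity_critical: bool
    consumed: bool = False  # models the single-use nonce


def try_activate(eat, now, last_heartbeat_ok, critical_alert_open):
    """Consume the EAT and return True only if the window and a condition match."""
    if eat.consumed or not (eat.valid_from <= now <= eat.valid_until):
        return False
    lost_limit = timedelta(minutes=eat.heartbeat_lost_minutes)
    heartbeat_lost = now - last_heartbeat_ok > lost_limit
    critical = eat.or_severity_critical and critical_alert_open
    if not (heartbeat_lost or critical):
        return False
    eat.consumed = True
    return True


utc = timezone.utc
eat = PreIssuedEat(
    valid_from=datetime(2026, 5, 19, 18, 0, tzinfo=utc),
    valid_until=datetime(2026, 5, 20, 8, 0, tzinfo=utc),
    heartbeat_lost_minutes=5,
    or_severity_critical=True,
)
now = datetime(2026, 5, 20, 2, 0, tzinfo=utc)
last_ok = now - timedelta(minutes=10)  # no successful heartbeat for 10 minutes

first = try_activate(eat, now, last_ok, critical_alert_open=False)
second = try_activate(eat, now, last_ok, critical_alert_open=True)
print(first, second)
```

The first call succeeds because the heartbeat has been lost for longer than the limit; the second is refused because the EAT has already been consumed.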

6. Multi-party signing for irreversible actions

In the dashboard

Sidebar → Emergency → ‘New Level 3 EAT’
Enter action + reason + required_approvers = 2
Other admins get push on mobile PWA with ‘Approve’/‘Reject’
On N approvals (each their own TOTP, split-control) → EAT fires

Some actions are so destructive that a single TOTP isn’t enough (production DB restore, CEO laptop kernel update, fleet-wide secrets rotation). Level 3 EATs require a quorum.

Or via API (advanced — for automation)

curl -X POST https://app.monsys.ai/api/v1/emergency/quorum \
  -H "Authorization: Bearer $TOKEN" \
  -H "X-TOTP-Code: 123456" \
  -H "Content-Type: application/json" \
  -d '{
    "agent_id": "<id>",
    "actions": [{ "kind": "run_playbook", "id": "db-restore" }],
    "reason": "Restore from 2026-04-15 snapshot per ticket TKT-9001",
    "required_approvers": 2
  }'

The hub sends an ntfy notification to every other MSP engineer with the admin role. The mobile PWA shows “Pending approval: DB restore on ACME”:

ACME / db-prod-01
RunPlaybook: db-restore
Requested by alice@yourMSP.com at 14:23
Reason: Restore from 2026-04-15 snapshot per ticket TKT-9001
[Approve with TOTP] [Reject]

The EAT fires only when N approvals are in and each came through a different TOTP flow (split-control: no single engineer can approve twice). Quorum proof lands in audit_log:

SELECT event_type, event_data
  FROM audit_log
 WHERE event_type = 'emergency_quorum_approved'
   AND event_data->>'nonce' = '<nonce>';

In the /audit-packs PDF, the tenant sees that the action was executed via a 2-of-2 quorum, which is strong evidence for SOC 2 separation of duties.
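The split-control rule boils down to counting distinct, TOTP-verified approvers. The sketch below shows that counting only; it is not the hub's logic. The `quorum_reached` function is hypothetical, and excluding the requester from the count is an assumption drawn from the fact that only the other admins are notified.

```python
def quorum_reached(requester, approvals, required_approvers):
    """approvals: iterable of (approver_email, totp_verified) pairs."""
    requester = requester.lower()
    distinct = {
        email.lower()
        for email, totp_verified in approvals
        # Assumption: the requester cannot approve their own request.
        if totp_verified and email.lower() != requester
    }
    # A set collapses repeat approvals, so nobody counts twice.
    return len(distinct) >= required_approvers


approvals = [
    ("bob@yourMSP.com", True),
    ("bob@yourMSP.com", True),    # second approval by the same engineer
    ("alice@yourMSP.com", True),  # the requester herself
]
before = quorum_reached("alice@yourMSP.com", approvals, required_approvers=2)

approvals.append(("carol@yourMSP.com", True))
after = quorum_reached("alice@yourMSP.com", approvals, required_approvers=2)
print(before, after)
```

Bob approving twice and Alice approving her own request still leave only one valid approver, so the quorum is reached only once Carol approves.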

7. MSP billing — aggregate all tenants into one invoice

In the dashboard

Sidebar → Billing → ‘Cross-tenant overview’ tab
Table: active agents per tenant + billable (after 5 free)
Total monthly_eur per client
‘Export for invoice’ button → CSV per month

For clients where you pay the bill (then rebill them):

Or via API (advanced — for automation)

-- Monthly invoice overview: one row per tenant owned by this MSP ($1).
WITH per_tenant AS (
  SELECT t.id, t.name,
         COUNT(*) FILTER (WHERE a.is_active = true)     AS active_agents,
         COUNT(*) FILTER (WHERE a.is_active = true) - 5 AS billable  -- first 5 are free
    FROM tenants t
    JOIN agents a ON a.tenant_id = t.id
   WHERE t.msp_owner = $1::UUID
     AND t.created_at < date_trunc('month', NOW())  -- skip tenants created this month
   GROUP BY t.id
)
SELECT name,
       active_agents,
       GREATEST(billable, 0)       AS billable_agents,  -- never negative
       GREATEST(billable, 0) * 3.0 AS monthly_eur       -- 3.0 EUR per billable agent
  FROM per_tenant
 ORDER BY monthly_eur DESC;

The first 5 agents are free per tenant, not per MSP. The msp_owner column on tenants is what ties each tenant to your MSP.
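The same arithmetic is easy to check by hand or in a script before sending an invoice. This is a minimal sketch: the free allowance of 5 and the 3.0 multiplier come from the query above, while the `monthly_invoice` function and the sample agent counts are hypothetical.

```python
FREE_AGENTS_PER_TENANT = 5
EUR_PER_BILLABLE_AGENT = 3.0  # multiplier used in the query above


def monthly_invoice(active_agents_by_tenant):
    """Return (tenant, active, billable, monthly_eur) rows, highest amount first."""
    rows = []
    for name, active in active_agents_by_tenant.items():
        # The free allowance applies per tenant and never goes negative.
        billable = max(active - FREE_AGENTS_PER_TENANT, 0)
        rows.append((name, active, billable, billable * EUR_PER_BILLABLE_AGENT))
    rows.sort(key=lambda row: row[3], reverse=True)
    return rows


rows = monthly_invoice({"acme": 40, "globex": 3, "initech": 12})
total = sum(row[3] for row in rows)
for row in rows:
    print(row)
print("total_eur", total)
```

With these sample counts acme has 35 billable agents, initech 7, and globex none, because its 3 agents fall inside the free allowance.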

8. RBAC impersonation for cross-tenant support

In the dashboard

Tenant switcher top-right → pick client + ‘Impersonate’
Enter reason + duration 60min + TOTP
Work in client context — actions get dual actor in audit_log
Client sees impersonation_started event in own Audit Pack

Engineer Alice (MSP admin) wants to take an action in ACME’s context. Instead of a fresh login: impersonate.

Or via API (advanced — for automation)

curl -X POST https://app.monsys.ai/api/v1/auth/impersonate \
  -H "Authorization: Bearer $MSP_ADMIN_TOKEN" \
  -H "X-TOTP-Code: 123456" \
  -H "Content-Type: application/json" \
  -d '{
    "tenant_id": "<acme_uuid>",
    "duration_minutes": 60,
    "reason": "Investigating alert #847 on web-03"
  }'
# returns scoped token

The resulting token has tenant=ACME and all queries run in ACME’s RLS context. The audit log shows impersonation_started plus all actions with a dual actor: your user_id (your email) AND tenant=ACME. The client sees in their own audit pack that the MSP impersonated, so the audit trail holds no surprises.

The token auto-expires after 60 min. End the session early via POST /api/v1/auth/impersonate/end.
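The session lifecycle (dual actor on every action, expiry after the requested duration, optional early end) can be modelled in a few lines. This is a sketch under assumptions and says nothing about how the hub stores sessions: the `ImpersonationSession` class and the shape of the audit entry are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class ImpersonationSession:
    actor_email: str
    tenant: str
    reason: str
    started_at: datetime
    duration_minutes: int = 60
    ended_at: Optional[datetime] = None

    def is_active(self, now):
        """Active until the duration elapses or the session is ended early."""
        if self.ended_at is not None and now >= self.ended_at:
            return False
        return now < self.started_at + timedelta(minutes=self.duration_minutes)

    def end(self, now):
        self.ended_at = now

    def audit_entry(self, event_type, now):
        """Build an audit record carrying both the real actor and the tenant."""
        if not self.is_active(now):
            raise PermissionError("impersonation session expired or ended")
        return {
            "event_type": event_type,
            "actor": self.actor_email,  # who really acted
            "tenant": self.tenant,      # in whose context
            "reason": self.reason,
            "at": now.isoformat(),
        }


start = datetime(2026, 5, 19, 14, 23, tzinfo=timezone.utc)
session = ImpersonationSession(
    actor_email="alice@yourMSP.com",
    tenant="ACME",
    reason="Investigating alert #847 on web-03",
    started_at=start,
)
entry = session.audit_entry("impersonation_started", start)
print(entry["actor"], entry["tenant"])
print(session.is_active(start + timedelta(minutes=61)))
```

Refusing to build an audit entry once the session is inactive mirrors the idea that an expired scoped token can no longer act in the client's context.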