Data model — what we keep and don't
copilot_seats — active seats
Each pull cycle replaces the entire snapshot for that org.
| Field | Source | Hashed? |
|---|---|---|
| user_login_hash | SHA256(lowercase login) | ✓ |
| user_login_prefix | First 4 chars + '…' | partial |
| plan_type | "business" / "enterprise" | no |
| assignee_team | Name of the team that assigned the seat | no |
| last_activity_at | RFC3339 timestamp | no |
| last_activity_editor | "VSCode" / "JetBrains" / … | no |
| pending_cancellation_date | DATE or NULL | no |
| snapshot_at | When we pulled | no |
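
The two derived login columns can be illustrated with a short Go sketch. It is not the worker's actual code: the function names hashLogin and loginPrefix are hypothetical, and hex encoding, the absence of a salt, and the handling of logins shorter than 4 characters are all assumptions.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// hashLogin returns the SHA-256 of the lowercased login.
// Assumption: the digest is hex-encoded and no salt is mixed in.
func hashLogin(login string) string {
	sum := sha256.Sum256([]byte(strings.ToLower(login)))
	return hex.EncodeToString(sum[:])
}

// loginPrefix returns the first 4 characters of the login followed by "…".
// Assumption: logins shorter than 4 characters are kept whole.
func loginPrefix(login string) string {
	r := []rune(login)
	if len(r) > 4 {
		r = r[:4]
	}
	return string(r) + "…"
}

func main() {
	// Lowercasing first makes the hash case-insensitive.
	fmt.Println(hashLogin("Octocat") == hashLogin("octocat")) // true
	fmt.Println(loginPrefix("octocat"))                       // octo…
}
```

Because the login is lowercased before hashing, the same account always produces the same hash regardless of how GitHub reports its casing.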
copilot_events — audit-log entries
Append-only. Idempotent on (connection_id, github_event_id).
| Field | Source | Hashed? |
|---|---|---|
| github_event_id | GitHub’s _document_id | no |
| event_type | action field | no |
| actor_login_hash | SHA256(lowercase actor.login) | ✓ |
| actor_login_prefix | First 4 chars + '…' | partial |
| target_login_hash | SHA256(lowercase user.login) | ✓ |
| target_login_prefix | First 4 chars + '…' | partial |
| occurred_at | @timestamp field | no |
| payload_json | Rest of GitHub’s payload, with PII filtered | partial |
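
The "append-only, idempotent on (connection_id, github_event_id)" rule can be sketched with an in-memory stand-in for the table. Everything here is hypothetical (the eventLog type, the string-typed IDs, the example values); a real store would more likely enforce the same rule with a unique constraint on the two columns.

```go
package main

import "fmt"

// eventKey is the idempotency key from the text: (connection_id, github_event_id).
type eventKey struct {
	ConnectionID  string
	GitHubEventID string
}

// Event holds a few copilot_events columns; the Go types are assumptions.
type Event struct {
	ConnectionID  string
	GitHubEventID string
	EventType     string
}

// eventLog is a hypothetical in-memory stand-in for the copilot_events table.
type eventLog struct {
	seen   map[eventKey]struct{}
	events []Event
}

func newEventLog() *eventLog {
	return &eventLog{seen: make(map[eventKey]struct{})}
}

// Append stores e unless its key was already recorded, and reports whether
// a row was added. Existing rows are never modified or removed.
func (l *eventLog) Append(e Event) bool {
	k := eventKey{e.ConnectionID, e.GitHubEventID}
	if _, dup := l.seen[k]; dup {
		return false
	}
	l.seen[k] = struct{}{}
	l.events = append(l.events, e)
	return true
}

func main() {
	el := newEventLog()
	e := Event{ConnectionID: "conn-1", GitHubEventID: "doc-123", EventType: "example.action"}
	fmt.Println(el.Append(e))   // true: first delivery is stored
	fmt.Println(el.Append(e))   // false: a replay of the same event is ignored
	fmt.Println(len(el.events)) // 1
}
```

Keying on both columns means the same GitHub event ID arriving through two different connections is stored once per connection.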
Filtered payload fields
These fields from GitHub audit-log entries are dropped before
storage (see isPIIField in hub/api/handlers/copilot_worker.go):
- actor, user, actor_login, user_login: already stored as hashes
- actor_id, user_id: internal GitHub user IDs
- actor_email, user_email, emails
- name, full_name: display names
Everything else (action codes, org metadata, repository names, business
names) goes into payload_json unchanged.
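
As a rough illustration of this filtering step, the sketch below drops a set of keys from a decoded payload before it would be serialized into payload_json. It is not the isPIIField implementation: filterPayload is a hypothetical name, the dropped set in main is only a subset of the fields listed above, and checking top-level keys only is an assumption.

```go
package main

import "fmt"

// filterPayload returns a copy of payload without the keys in dropped.
// Assumption: only top-level keys are checked; nested objects pass through.
func filterPayload(payload map[string]any, dropped map[string]bool) map[string]any {
	out := make(map[string]any, len(payload))
	for k, v := range payload {
		if dropped[k] {
			continue // PII field: never copied into the stored payload
		}
		out[k] = v
	}
	return out
}

func main() {
	// A subset of the dropped field names, for illustration only.
	dropped := map[string]bool{
		"actor": true, "user_id": true, "actor_email": true, "full_name": true,
	}
	payload := map[string]any{
		"action":      "example.action",
		"org":         "example-org",
		"actor":       "jan.peeters",
		"actor_email": "jan@example.com",
	}
	fmt.Println(filterPayload(payload, dropped)) // map[action:example.action org:example-org]
}
```

The function copies into a new map rather than deleting in place, so the caller can still read the original payload to compute the login hashes.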
What we DO NOT have
- Full username, email, display name: dropped at ingest
- Code suggestions, prompts, completions: Copilot keeps those; we have no access
- IP addresses: not in GitHub’s admin API
- Files a dev had open: private to the dev
- Per-dev usage frequency or duration: we only have last_activity_at; GitHub doesn’t expose granular timing
How to map a hash back to a person
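For the user jan.peeters, user_login_prefix shows jan.… (the first 4
characters of the login plus the ellipsis). Match this prefix against your
HR system or GitHub org member list, then confirm a candidate by checking
that the SHA256 of the lowercase login equals the stored hash. We don’t
have the mapping and don’t want it.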
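
On your side, the lookup could look roughly like the sketch below: narrow the member list by prefix, then confirm with the hash. The function names, the example logins, and the unsalted hex SHA-256 are assumptions, not a description of any shipped tool.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// hashLogin mirrors the stored hash: SHA-256 of the lowercased login.
// Assumption: hex-encoded, unsalted.
func hashLogin(login string) string {
	sum := sha256.Sum256([]byte(strings.ToLower(login)))
	return hex.EncodeToString(sum[:])
}

// resolve returns the member whose login starts with the stored prefix
// (without its trailing "…") and whose hash equals storedHash.
func resolve(members []string, prefix, storedHash string) (string, bool) {
	stem := strings.ToLower(strings.TrimSuffix(prefix, "…"))
	for _, m := range members {
		lm := strings.ToLower(m)
		if strings.HasPrefix(lm, stem) && hashLogin(lm) == storedHash {
			return m, true
		}
	}
	return "", false
}

func main() {
	// members would come from your own HR export or GitHub org member list.
	members := []string{"hubot", "octodog", "octocat"}
	stored := hashLogin("octocat") // stands in for a hash read from the table

	login, ok := resolve(members, "octo…", stored)
	fmt.Println(login, ok) // octocat true
}
```

The prefix only narrows the candidates (here both octodog and octocat share it); the hash comparison is what identifies the exact login.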