Data model — what we keep and don't
openai_users
Replaced wholesale per pull cycle.
| Field | Source | Hashed? |
|---|---|---|
| openai_user_id | user_... | no |
| email_hash | SHA256(lowercase email) | ✓ |
| email_prefix | First 4 chars before @ + "…" | partial |
| role | "owner" / "reader" | no |
| added_at | Unix → TIMESTAMPTZ | no |
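
A minimal Python sketch of how the email_hash and email_prefix columns could be derived. The function names are hypothetical, and the hex digest encoding, the UTF-8 encoding, and the absence of whitespace trimming are assumptions; the table only specifies SHA256 over the lowercased email and the first 4 characters before the @.

```python
import hashlib


def hash_email(email: str) -> str:
    """SHA-256 of the lowercased email, as a hex string (hex is an assumption)."""
    return hashlib.sha256(email.lower().encode("utf-8")).hexdigest()


def email_prefix(email: str, length: int = 4) -> str:
    """First `length` characters before the @, followed by an ellipsis."""
    local_part = email.split("@", 1)[0]
    return local_part[:length] + "…"


# Hypothetical address, for illustration only.
row = {
    "email_hash": hash_email("Alice.Smith@example.com"),
    "email_prefix": email_prefix("Alice.Smith@example.com".lower()),
}
```

Lowercasing before hashing means the same mailbox produces the same hash regardless of how the address was capitalised upstream.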
openai_projects
| Field | Source | Hashed? |
|---|---|---|
| openai_project_id | proj_... | no |
| name | Operator-supplied label | no |
| status | "active" / "archived" | no |
| created_at_openai | Unix → TIMESTAMPTZ | no |
| archived_at | Unix → TIMESTAMPTZ | no |
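
The "Unix → TIMESTAMPTZ" conversion used throughout these tables can be sketched as below. This assumes the Unix values are whole seconds and that timestamps are normalised to UTC before being written; `unix_to_utc` is a hypothetical helper name. The `None` pass-through mirrors the "or NULL" noted for last_used_at; whether other columns such as archived_at can be NULL is an assumption.

```python
from datetime import datetime, timezone
from typing import Optional


def unix_to_utc(ts: Optional[int]) -> Optional[datetime]:
    """Convert Unix seconds to a timezone-aware UTC datetime; keep None as None."""
    if ts is None:
        return None
    return datetime.fromtimestamp(ts, tz=timezone.utc)


created_at = unix_to_utc(1700000000)
print(created_at.isoformat())  # 2023-11-14T22:13:20+00:00
```

A timezone-aware value is what a TIMESTAMPTZ column expects; a naive datetime would be interpreted in the database session's time zone instead.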
openai_api_keys
| Field | Source | Hashed? |
|---|---|---|
| openai_key_id | key_... | no |
| name | Operator label | no |
| type | "user" / "service_account" | no |
| redacted_value | sk-...xxxx (last 4 chars) | partial |
| owner_user_id | For user-keys: user_... | no |
| created_at_openai | Unix → TIMESTAMPTZ | no |
| last_used_at | Unix → TIMESTAMPTZ or NULL | no |
openai_events
Append-only. Idempotent on (connection_id, openai_event_id).
| Field | Source | Hashed? |
|---|---|---|
| openai_event_id | OpenAI id (audit_log_…) | no |
| event_type | OpenAI type field | no |
| actor_email_hash | SHA256(lowercase actor.session.user.email) | ✓ |
| actor_email_prefix | First 4 chars + "…" | partial |
| target_id | project.id (if event is project-scoped) | no |
| occurred_at | OpenAI effective_at (Unix) | no |
| payload_json | Rest of payload, PII fields dropped | partial |
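
One way to get "append-only, idempotent on (connection_id, openai_event_id)" is a composite key plus an insert that skips conflicts. The sketch below uses SQLite from the Python standard library purely for illustration; the real store is not specified here, the schema is trimmed to a few columns, and `ingest` is a hypothetical helper. The event ids in the usage lines are made up.

```python
import sqlite3

SCHEMA = """
CREATE TABLE openai_events (
    connection_id   TEXT    NOT NULL,
    openai_event_id TEXT    NOT NULL,
    event_type      TEXT    NOT NULL,
    occurred_at     INTEGER NOT NULL,
    PRIMARY KEY (connection_id, openai_event_id)
)
"""

INSERT = """
INSERT INTO openai_events (connection_id, openai_event_id, event_type, occurred_at)
VALUES (?, ?, ?, ?)
ON CONFLICT (connection_id, openai_event_id) DO NOTHING
"""


def ingest(conn: sqlite3.Connection, connection_id: str, events: list) -> None:
    """Append events; rows whose key already exists are skipped, never updated."""
    rows = [(connection_id, e["id"], e["type"], e["effective_at"]) for e in events]
    with conn:  # one transaction per batch
        conn.executemany(INSERT, rows)


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
batch = [
    {"id": "audit_log_1", "type": "project.created", "effective_at": 1700000000},
    {"id": "audit_log_2", "type": "project.archived", "effective_at": 1700000500},
]
ingest(conn, "conn_a", batch)
ingest(conn, "conn_a", batch)  # re-pulling the same window adds nothing
```

Because conflicting rows are skipped rather than updated, pulling an overlapping time window twice leaves existing events untouched.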
Filtered payload fields
These keys are dropped from the raw payload before storage:
- actor, user, email, actor_email, session: already stored as hashes
- name, full_name: display names
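
A minimal sketch of the key filter, using the key names listed above. Dropping the keys at every nesting level (not just the top level) is an assumption, and `strip_pii` is a hypothetical name.

```python
from typing import Any

# Keys listed in this section as removed before storage.
DROPPED_KEYS = frozenset(
    {"actor", "user", "email", "actor_email", "session", "name", "full_name"}
)


def strip_pii(value: Any) -> Any:
    """Return a copy of the payload without the dropped keys (recursive: assumption)."""
    if isinstance(value, dict):
        return {k: strip_pii(v) for k, v in value.items() if k not in DROPPED_KEYS}
    if isinstance(value, list):
        return [strip_pii(item) for item in value]
    return value


# Made-up payload for illustration.
raw = {
    "id": "audit_log_1",
    "actor": {"session": {"user": {"email": "someone@example.com"}}},
    "project": {"id": "proj_1", "name": "Internal tools"},
}
payload_json = strip_pii(raw)
```

The function builds a new structure instead of mutating the input, so the raw payload can still be used to compute the hash columns before it is discarded.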
What we DO NOT have
- Full email addresses: dropped at ingest
- Full API key values: only the redacted sk-...xxxx form (last 4 chars) is stored; the real value lives only at OpenAI and we don’t receive it via the API
- What an API key was used for: last_used_at is the only usage signal; OpenAI doesn’t expose per-key request logs
- Prompt/completion content: not in the admin API
- ChatGPT browser conversations: requires the Compliance API (Q4 2026)
Mapping a hash back to a person
The email_prefix shows jan.… for jan.peeters@company.be (the first 4 characters before the @). Match this
prefix against your HR system or LDAP/Azure AD. We don’t have the
mapping and don’t want it.
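
On the operator's side, the match could be done roughly as follows: narrow the directory by the stored prefix, then confirm the candidate by recomputing the hash. This is a sketch that would run against your own directory export, not something we run; `resolve` is a hypothetical helper, and the hex digest encoding is an assumption.

```python
import hashlib
from typing import Iterable, Optional


def hash_email(email: str) -> str:
    """SHA-256 of the lowercased email as hex (hex encoding is an assumption)."""
    return hashlib.sha256(email.lower().encode("utf-8")).hexdigest()


def resolve(email_hash: str, email_prefix: str, directory: Iterable[str]) -> Optional[str]:
    """Find the directory address matching a stored prefix and hash, if any."""
    stem = email_prefix.rstrip("…").lower()
    for email in directory:
        # The prefix filter is cheap; the hash comparison confirms the match.
        if email.lower().startswith(stem) and hash_email(email) == email_hash:
            return email
    return None


directory = ["jan.peeters@company.be", "janne.maes@company.be"]
stored_hash = hash_email("jan.peeters@company.be")
match = resolve(stored_hash, "jan.…", directory)
```

Several people can share the same prefix, so the hash comparison is what makes the match unambiguous.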