Data model — what we keep and don't

openai_users

Replaced wholesale per pull cycle.

| Field | Source | Hashed? |
| --- | --- | --- |
| openai_user_id | user_... | no |
| email_hash | SHA256(lowercase email) | yes |
| email_prefix | First 4 chars before @ + "…" | partial |
| role | "owner" / "reader" | no |
| added_at | Unix → TIMESTAMPTZ | no |
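
As a rough illustration of the two derived email fields, here is a minimal Python sketch. The function names are hypothetical, and several details are assumptions the table does not state: the digest is hex-encoded, the email is encoded as UTF-8, and no salt or whitespace trimming is applied.

```python
import hashlib


def email_hash(email: str) -> str:
    """SHA-256 of the lowercased email (assumed: UTF-8 input, hex output, no salt)."""
    return hashlib.sha256(email.lower().encode("utf-8")).hexdigest()


def email_prefix(email: str) -> str:
    """First 4 characters of the part before the @, followed by an ellipsis."""
    local_part = email.split("@", 1)[0]
    return local_part[:4] + "…"
```

Because the email is lowercased before hashing, two spellings that differ only in case map to the same email_hash.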

openai_projects

| Field | Source | Hashed? |
| --- | --- | --- |
| openai_project_id | proj_... | no |
| name | Operator-supplied label | no |
| status | "active" / "archived" | no |
| created_at_openai | Unix → TIMESTAMPTZ | no |
| archived_at | Unix → TIMESTAMPTZ | no |
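
The "Unix → TIMESTAMPTZ" conversion used throughout these tables can be sketched as follows. This is an illustration only: it assumes the Unix values are whole seconds, that timestamps are normalised to UTC, and that a missing value is stored as NULL. The function name is hypothetical.

```python
from datetime import datetime, timezone
from typing import Optional


def unix_to_utc(ts: Optional[int]) -> Optional[datetime]:
    """Convert Unix seconds to a timezone-aware UTC datetime.

    None is passed through unchanged so that it can be stored as NULL
    (an assumption for fields such as archived_at on active projects).
    """
    if ts is None:
        return None
    return datetime.fromtimestamp(ts, tz=timezone.utc)
```

A timezone-aware datetime is what most PostgreSQL drivers expect when writing to a TIMESTAMPTZ column.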

openai_api_keys

| Field | Source | Hashed? |
| --- | --- | --- |
| openai_key_id | key_... | no |
| name | Operator label | no |
| type | "user" / "service_account" | no |
| redacted_value | sk-...xxxx (last 4 chars) | partial |
| owner_user_id | For user keys: user_... | no |
| created_at_openai | Unix → TIMESTAMPTZ | no |
| last_used_at | Unix → TIMESTAMPTZ or NULL | no |

openai_events

Append-only. Idempotent on (connection_id, openai_event_id).

| Field | Source | Hashed? |
| --- | --- | --- |
| openai_event_id | OpenAI id (audit_log_…) | no |
| event_type | OpenAI type field | no |
| actor_email_hash | SHA256(lowercase actor.session.user.email) | yes |
| actor_email_prefix | First 4 chars + "…" | partial |
| target_id | project.id (if event is project-scoped) | no |
| occurred_at | OpenAI effective_at (Unix) | no |
| payload_json | Rest of payload, PII fields dropped | partial |
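
One way to get append-only, idempotent writes on (connection_id, openai_event_id) is a unique constraint plus an insert that ignores conflicts. The sketch below uses an in-memory SQLite database purely so it runs anywhere; the production store, the reduced column set, and the record_event helper are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE openai_events (
        connection_id   TEXT NOT NULL,
        openai_event_id TEXT NOT NULL,
        event_type      TEXT NOT NULL,
        occurred_at     INTEGER NOT NULL,
        UNIQUE (connection_id, openai_event_id)
    )
    """
)


def record_event(conn: sqlite3.Connection, connection_id: str, event: dict) -> bool:
    """Insert an event once; return True if a new row was written.

    A repeated (connection_id, openai_event_id) pair is silently skipped,
    so re-pulling the same window never duplicates or rewrites rows.
    """
    cur = conn.execute(
        "INSERT OR IGNORE INTO openai_events VALUES (?, ?, ?, ?)",
        (connection_id, event["id"], event["type"], event["effective_at"]),
    )
    return cur.rowcount == 1
```

In PostgreSQL the same effect is usually written as INSERT ... ON CONFLICT (connection_id, openai_event_id) DO NOTHING.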

Filtered payload fields

These keys are dropped from the raw payload before storage:

  • actor, user, email, actor_email, session — already stored as hashes
  • name, full_name — display names
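
The key filter above could look roughly like this in Python. The key list comes from the bullets; the function name is hypothetical, and applying the filter recursively to nested objects and lists is an assumption, since the list does not say whether only top-level keys are dropped.

```python
# Keys removed before the payload is stored (taken from the list above).
DROPPED_KEYS = {
    "actor", "user", "email", "actor_email", "session",  # stored as hashes instead
    "name", "full_name",                                  # display names
}


def filter_payload(value):
    """Return a copy of the payload with the dropped keys removed.

    Assumption: the filter is applied at every nesting level.
    """
    if isinstance(value, dict):
        return {
            key: filter_payload(inner)
            for key, inner in value.items()
            if key not in DROPPED_KEYS
        }
    if isinstance(value, list):
        return [filter_payload(item) for item in value]
    return value
```

The function builds a new structure instead of mutating its input, so the raw payload can still be used for hashing the actor email before it is discarded.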

What we DO NOT have

  • Full email addresses: dropped at ingest
  • Full API key values: only the redacted sk-...xxxx form (last 4 chars) is stored; the real value lives only at OpenAI and we don’t receive it via the API
  • What an API key called: last_used_at is the only usage signal; OpenAI doesn’t expose per-key request logs
  • Prompt/completion content: not in the admin API
  • ChatGPT browser conversations: requires the Compliance API (Q4 2026)

Mapping a hash back to a person

The email_prefix shows jan.… for jan.peeters@company.be (the first 4 characters before the @, followed by …). Match this prefix against your HR system or LDAP/Azure AD. We don’t have the mapping and don’t want it.
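
On the operator side, the lookup could be sketched as below: narrow the directory by prefix, then confirm the candidate by recomputing the hash. This is an illustration under assumptions, not a provided tool. The resolve function and the directory list are hypothetical, and the hash is assumed to be the hex-encoded, unsalted SHA-256 of the lowercased email.

```python
import hashlib
from typing import Iterable, Optional


def resolve(email_hash_hex: str, prefix: str, directory_emails: Iterable[str]) -> Optional[str]:
    """Find the directory email that matches a stored prefix and hash.

    directory_emails is the operator's own export (HR system, LDAP, Azure AD).
    """
    stem = prefix.rstrip("…").lower()
    for email in directory_emails:
        normalized = email.lower()
        if not normalized.startswith(stem):
            continue  # cheap prefix filter first
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        if digest == email_hash_hex:
            return email  # the hash confirms the exact person
    return None
```

The prefix alone can match several colleagues; the hash comparison is what singles out one address, and it only works for whoever already holds the full directory.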