Audit Trail
Every significant operation in the eduID Wallet Matching Portal is recorded in the audit_event table, an append-only log where records are only ever inserted, never updated or deleted. This design provides three critical capabilities: compliance evidence for GDPR accountability obligations, operational visibility for monitoring and incident investigation, and a complete, tamper-evident history of identity lifecycle events.
The audit trail is designed to be thorough without compromising privacy. Every event captures enough operational context to reconstruct what happened and why, but it never stores plaintext personal data. Subject identifiers are HMAC-hashed before being written to the audit log, and the JSON detail field contains operational metadata (session IDs, provider names, status transitions, error codes) but never names, email addresses, or other PII.
Audit Event Structure
Each audit event is a single row in the audit_event table with the following fields:
| Field | Type | Description |
|---|---|---|
id | TEXT (UUID) | Unique identifier for the event. Generated at the time the event is created. No two events share the same ID. |
tenant_id | TEXT | The tenant that the event belongs to. All monitoring and compliance queries should filter by tenant to maintain multi-tenant isolation. |
event_type | TEXT | Categorizes the event into one of the defined event types (see the complete list below). This field is the primary filter for operational dashboards and compliance reports. |
correlation_id | TEXT | A UUID that links related events across a single user flow. For example, all events generated during one reconciliation session -- from session creation through IDV initiation to binding creation -- share the same correlation ID. This enables end-to-end tracing. |
subject_hash | TEXT | HMAC-hashed identifier of the user (subject) involved in the event. This is not a plaintext identifier. It allows correlation of events for the same subject (same hash means same person) without revealing who the person is. |
client_id | TEXT | The identifier of the API client or application that triggered the event. For external API operations, this is the client ID from the JWT bearer token. For user-initiated operations, this identifies the portal frontend. |
detail | TEXT (JSON) | A structured JSON object containing additional operational context. The exact contents vary by event type but always follow the principle of containing operational metadata only, never PII. Examples include record counts, provider names, status codes, and error descriptions. |
created_at | TEXT (ISO 8601) | The timestamp of when the event occurred, in ISO 8601 format with timezone. This field supports time-range queries for operational monitoring and historical investigation. |
Privacy in the Audit Trail
The privacy design of the audit trail deserves special attention because audit logs are inherently long-lived and broadly accessed.
The subject_hash field contains an HMAC hash of the subject's identifier, computed using the same key (Key A or Key B) used for the main identity tables. This means even the audit trail does not contain PII. An investigator can correlate events for the same subject (same hash equals same person) without knowing who the person is. If identification is required for a specific investigation, the hash can be matched against the identity_match table -- but this requires deliberate action, appropriate authorization, and access to the KMS key.
The detail JSON contains operational context such as session IDs, provider names, status transitions, error codes, and record counts. It never contains names, email addresses, plaintext identifiers, or any other directly identifying information. This ensures that even if audit logs are exported to a SIEM or log aggregation system with broader access controls than the main database, no PII is exposed.
Event Types
The following table lists all audit event types, with descriptions of what each event represents and when it is logged.
| Event Type | Description | When Logged |
|---|---|---|
session_created | An OID4VP wallet authentication session has been initiated. The portal has generated a QR code and is waiting for the wallet to respond. | When POST /auth/oid4vp/sessions is called and a new session is created. |
session_completed | A wallet authentication session has completed successfully. The wallet presented valid credentials and the system processed them. | When POST /auth/oid4vp/sessions/{id}/complete returns a 200 status. |
session_expired | An OID4VP session timed out without being completed. The wallet holder did not present credentials within the session TTL. | When the background session cleanup job detects and removes an expired session. |
session_failed | A wallet authentication session failed. This could be due to an invalid verifiable presentation, an unsupported credential type, or a verification error. | When VP verification fails or the session transitions to a FAILED state. |
idv_initiated | An identity verification (IDV) reconciliation flow has been started. The user has been redirected to the upstream OIDC provider to authenticate. | When POST .../idv/initiate is called and the authorization URL is generated. |
idv_completed | An IDV reconciliation flow has completed successfully. The user authenticated with the upstream provider, and the system received and processed the identity claims. | When the OIDC callback is processed, the token exchange succeeds, and the identity is resolved. |
idv_failed | An IDV reconciliation flow failed. This could be due to a token exchange failure, missing required claims, or an error in the upstream provider's response. | When the token exchange fails, claim extraction encounters an error, or the provider returns an error. |
binding_created | A new identity link binding has been created, linking a wallet holder to an institutional identity for the first time. | When the reconciliation process creates a new identity_link_binding record. |
binding_updated | An existing identity link binding has been refreshed. The user completed re-reconciliation, and the binding's attributes or timestamps were updated. | When the reconciliation process updates an existing binding (e.g., after step-up authentication or attribute refresh). |
binding_soft_deleted | A binding has been marked for deletion (soft-deleted). The record remains in the database for the retention period. | When the inactivity cleanup job or an administrative action soft-deletes a binding. |
binding_hard_deleted | A binding has been permanently removed from the database. This is irreversible. | When the hard-delete cleanup job removes a soft-deleted binding that has exceeded the retention period. |
gdpr_erasure | A complete identity deletion has been performed in response to a GDPR Article 17 (right to erasure) request. All match records, bindings, and auxiliary data for the identity have been permanently deleted. | When DELETE /api/external/v1/reconciliation/{id} is called and the deletion completes. |
external_api_lookup | A third-party system has performed an identity lookup through the external API. | When POST /api/external/v1/reconciliation/lookup is called. |
auxiliary_data_write | Auxiliary data has been stored or updated for an identity. | When PUT .../auxiliary/{category} creates or updates an auxiliary data record. |
auxiliary_data_delete | Auxiliary data has been deleted for an identity. | When DELETE .../auxiliary/{category} removes an auxiliary data record. |
key_migration_started | A key rotation migration has begun. The system is re-hashing or re-encrypting records with a new key version. | When the migration job starts processing and inserts a new key_migration_history record. |
key_migration_completed | A key rotation migration has finished. The detail field includes the final counts (processed, failed, skipped, purged). | When the migration job completes and updates the key_migration_history record to COMPLETED. |
Correlation
The correlation_id field is one of the most powerful features of the audit trail. It is a UUID that links related events across a single user flow, enabling an investigator to reconstruct the complete sequence of operations that occurred during a specific interaction.
Example: Complete Wallet Authentication with Reconciliation
A user scans a QR code, presents their wallet credential, and then completes the reconciliation flow by authenticating with SURF. The following sequence of correlated audit events would be generated:
session_created(correlation_id:abc-123) -- The OID4VP session is created and the QR code is displayed.idv_initiated(correlation_id:abc-123) -- The user clicks "Link with institution account" and is redirected to SURF.idv_completed(correlation_id:abc-123) -- The user authenticates with SURF, the callback is processed, and identity claims are received.binding_created(correlation_id:abc-123) -- A new binding is created linking the wallet holder to their institutional identity.session_completed(correlation_id:abc-123) -- The overall wallet authentication session is marked as completed.
By querying for all events with correlation_id = 'abc-123', an investigator can see the complete flow from start to finish, including the exact timestamps of each step, which provider was used, and whether any errors occurred along the way.
Querying Correlated Events
The named SqlDelight query for retrieving correlated events:
-- Named query: findAuditEventsByCorrelation(tenant_id, correlation_id)
SELECT * FROM audit_event
WHERE tenant_id = :tenant_id
AND correlation_id = :correlation_id
ORDER BY created_at ASC;
The idx_audit_correlation index on the correlation_id column ensures this query executes efficiently even against a large audit table.
Querying by Event Type
For operational monitoring and compliance reporting, events can be queried by type with pagination support:
-- Named query: findAuditEventsByType(tenant_id, event_type, limit, offset)
SELECT * FROM audit_event
WHERE tenant_id = :tenant_id
AND event_type = :event_type
ORDER BY created_at DESC
LIMIT :limit OFFSET :offset;
The idx_audit_type index on (tenant_id, event_type) supports efficient filtering by event type within a tenant.
Example: GDPR Erasure Report
To generate a compliance report of all GDPR erasure events in the last 30 days:
SELECT * FROM audit_event
WHERE tenant_id = 'kw1c'
AND event_type = 'gdpr_erasure'
AND created_at >= '2026-02-25T00:00:00Z'
ORDER BY created_at DESC;
This query uses both the idx_audit_type index (for tenant and event type filtering) and the idx_audit_time index (for the time range predicate) to execute efficiently.
Useful Query Patterns
The following table suggests which event types to query for common operational and compliance tasks:
| Purpose | Event Type Filter |
|---|---|
| Authentication activity overview | session_completed, session_failed |
| Reconciliation activity | idv_initiated, idv_completed, idv_failed |
| Data lifecycle monitoring | binding_created, binding_soft_deleted, binding_hard_deleted |
| GDPR compliance reporting | gdpr_erasure |
| External API usage tracking | external_api_lookup, auxiliary_data_write |
| Key management operations | key_migration_started, key_migration_completed |
| Security incident investigation | session_failed, idv_failed (high volume may indicate an attack) |
Retention
Audit events are retained indefinitely by default. There is no automatic cleanup or expiration mechanism for audit records. This is an intentional design decision based on several considerations:
- Regulatory compliance: Different jurisdictions and regulatory frameworks may require different retention periods. Rather than imposing a specific period, the system retains all events and delegates the retention policy decision to the deploying organization.
- GDPR accountability: Article 5(2) of the GDPR requires controllers to demonstrate compliance with data processing principles. The audit trail provides evidence of lawful processing, and this evidence may need to be produced years after the events occurred.
- Incident investigation: Security incidents may not be discovered until months or years after they occur. A complete audit trail is essential for forensic analysis and determining the scope and impact of a breach.
Organizations should define a retention policy based on their specific regulatory requirements and configure periodic archival or deletion accordingly. A common approach is to archive events older than a specified period (e.g., 5 years) to cold storage, and delete events older than the maximum regulatory requirement (e.g., 10 years).
Note that because audit events contain no plaintext PII (only HMAC-hashed subject identifiers), the GDPR storage limitation principle (Article 5(1)(e)) is less directly applicable. The hashed identifiers cannot be used to identify individuals without access to the HMAC keys, so the privacy impact of long-term retention is minimal.
Immutability
The audit trail is append-only by design. In the application layer, only INSERT operations are ever performed on the audit_event table. The SqlDelight schema defines no UPDATE or DELETE queries for audit events. This application-level guarantee means that under normal operation, once an audit event is written, it cannot be modified or removed.
However, application-level guarantees are insufficient to protect against scenarios where an attacker (or a malicious insider) gains direct database access and executes raw SQL. For production deployments, additional immutability guarantees are strongly recommended.
Database-Level Protection
A PostgreSQL trigger can enforce immutability at the database engine level, preventing UPDATE and DELETE operations regardless of how they are issued:
CREATE OR REPLACE FUNCTION prevent_audit_modification()
RETURNS TRIGGER AS $$
BEGIN
RAISE EXCEPTION 'audit_event table is append-only: % not allowed', TG_OP;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER audit_immutability
BEFORE UPDATE OR DELETE ON audit_event
FOR EACH ROW
EXECUTE FUNCTION prevent_audit_modification();
This trigger causes any UPDATE or DELETE operation on the audit_event table to fail with an exception, regardless of whether the operation originates from the application or from a direct SQL session. Even a database superuser cannot silently modify audit records without first dropping the trigger, which would itself be logged in the PostgreSQL activity log.
External Immutable Storage
For the strongest audit guarantees, forward audit events to an external immutable store in addition to the database. This provides defense-in-depth against scenarios where an attacker gains database superuser access and could bypass even trigger-based protections.
Recommended external immutable storage options:
- Azure Append Blobs: Azure Blob Storage in append-only mode prevents any modification of written data. Each audit event is appended as a new block, and the blob cannot be modified or deleted while the immutability policy is in effect.
- AWS S3 Object Lock: S3 Object Lock in WORM (Write Once Read Many) compliance mode prevents object deletion or modification for a specified retention period. This provides a regulatory-grade immutability guarantee.
- Dedicated SIEM: Security Information and Event Management systems (such as Splunk, Azure Sentinel, or Elastic Security) with immutable log storage provide both immutability and sophisticated querying, alerting, and correlation capabilities.
The external store serves as an independent verification source. If there is any suspicion that the database audit trail has been tampered with, the external store provides an authoritative reference.
Audit Event Example
The following is a complete example of a GDPR erasure audit event as it would appear in the database:
{
"id": "evt-550e8400-e29b-41d4-a716-446655440000",
"tenant_id": "kw1c",
"event_type": "gdpr_erasure",
"correlation_id": "req-661e9500-f39c-52e5-b827-557766551111",
"subject_hash": "zQmXk7f9a2bC3dE4fG5hI6jK7lM8nO9pQ0rS1tU2vW3xY4z",
"client_id": "student-info-system",
"detail": "{\"matches_deleted\":2,\"bindings_deleted\":1,\"auxiliary_deleted\":3,\"requested_by\":\"privacy-officer\"}",
"created_at": "2026-03-27T14:30:00Z"
}
Key observations about this event:
- The
subject_hashis an HMAC hash, not a plaintext identifier. An investigator can search for other events with the same hash to find all operations related to this subject, but cannot determine the subject's identity from the hash alone. - The
detailJSON contains counts (matches_deleted,bindings_deleted,auxiliary_deleted) and operational context (requested_by) but no PII. Therequested_byfield identifies the role that authorized the request, not the individual. - The
client_ididentifies which system initiated the request (student-info-system), enabling tracking of which systems are exercising the erasure API. - The
correlation_idlinks this event to any other events that were generated as part of the same API request, enabling full reconstruction of the erasure operation.
Database Schema
The audit event table and its indexes are defined as follows:
CREATE TABLE IF NOT EXISTS audit_event (
id TEXT NOT NULL PRIMARY KEY,
tenant_id TEXT NOT NULL,
event_type TEXT NOT NULL,
correlation_id TEXT,
subject_hash TEXT,
client_id TEXT,
detail TEXT,
created_at TEXT NOT NULL
);
CREATE INDEX IF NOT EXISTS idx_audit_type
ON audit_event(tenant_id, event_type);
CREATE INDEX IF NOT EXISTS idx_audit_time
ON audit_event(created_at);
CREATE INDEX IF NOT EXISTS idx_audit_correlation
ON audit_event(correlation_id);
Three indexes optimize the most common query patterns:
- idx_audit_type on
(tenant_id, event_type): Supports filtering by event type within a tenant. This is the most common query pattern for operational dashboards and compliance reports (e.g., "show all GDPR erasure events for tenant X in the last 30 days"). - idx_audit_time on
(created_at): Supports time-range queries across all tenants. Used for operational monitoring (e.g., "show all events in the last hour") and historical investigation (e.g., "show all events between March 1 and March 15"). - idx_audit_correlation on
(correlation_id): Supports reconstruction of complete user flows from a single correlation ID. This is the primary tool for incident investigation and debugging.
As the audit table grows over time (it is never cleaned up by default), these indexes will also grow. For very large deployments, consider partitioning the audit_event table by created_at (e.g., monthly partitions) to manage index size and enable efficient archival of historical partitions.