Audit Trail

Every significant operation in the eduID Wallet Matching Portal is recorded in the audit_event table, an append-only log where records are only ever inserted, never updated or deleted. This design provides three critical capabilities: compliance evidence for GDPR accountability obligations, operational visibility for monitoring and incident investigation, and a complete, tamper-evident history of identity lifecycle events.

The audit trail is designed to be thorough without compromising privacy. Every event captures enough operational context to reconstruct what happened and why, but it never stores plaintext personal data. Subject identifiers are HMAC-hashed before being written to the audit log, and the JSON detail field contains operational metadata (session IDs, provider names, status transitions, error codes) but never names, email addresses, or other PII.

Audit Event Structure

Each audit event is a single row in the audit_event table with the following fields:

Field	Type	Description
`id`	TEXT (UUID)	Unique identifier for the event. Generated at the time the event is created. No two events share the same ID.
`tenant_id`	TEXT	The tenant that the event belongs to. All monitoring and compliance queries should filter by tenant to maintain multi-tenant isolation.
`event_type`	TEXT	Categorizes the event into one of the defined event types (see the complete list below). This field is the primary filter for operational dashboards and compliance reports.
`correlation_id`	TEXT	A UUID that links related events across a single user flow. For example, all events generated during one reconciliation session -- from session creation through IDV initiation to binding creation -- share the same correlation ID. This enables end-to-end tracing.
`subject_hash`	TEXT	HMAC-hashed identifier of the user (subject) involved in the event. This is not a plaintext identifier. It allows correlation of events for the same subject (same hash means same person) without revealing who the person is.
`client_id`	TEXT	The identifier of the API client or application that triggered the event. For external API operations, this is the client ID from the JWT bearer token. For user-initiated operations, this identifies the portal frontend.
`detail`	TEXT (JSON)	A structured JSON object containing additional operational context. The exact contents vary by event type but always follow the principle of containing operational metadata only, never PII. Examples include record counts, provider names, status codes, and error descriptions.
`created_at`	TEXT (ISO 8601)	The timestamp of when the event occurred, in ISO 8601 format with timezone. This field supports time-range queries for operational monitoring and historical investigation.

Privacy in the Audit Trail

The privacy design of the audit trail deserves special attention because audit logs are inherently long-lived and broadly accessed.

The subject_hash field contains an HMAC hash of the subject's identifier, computed using the same key (Key A or Key B) used for the main identity tables. This means even the audit trail does not contain PII. An investigator can correlate events for the same subject (same hash equals same person) without knowing who the person is. If identification is required for a specific investigation, the hash can be matched against the identity_match table -- but this requires deliberate action, appropriate authorization, and access to the KMS key.

The detail JSON contains operational context such as session IDs, provider names, status transitions, error codes, and record counts. It never contains names, email addresses, plaintext identifiers, or any other directly identifying information. This ensures that even if audit logs are exported to a SIEM or log aggregation system with broader access controls than the main database, no PII is exposed.

Event Types

The following table lists all audit event types, with descriptions of what each event represents and when it is logged.

Event Type	Description	When Logged
`session_created`	An OID4VP wallet authentication session has been initiated. The portal has generated a QR code and is waiting for the wallet to respond.	When `POST /auth/oid4vp/sessions` is called and a new session is created.
`session_completed`	A wallet authentication session has completed successfully. The wallet presented valid credentials and the system processed them.	When `POST /auth/oid4vp/sessions/{id}/complete` returns a 200 status.
`session_expired`	An OID4VP session timed out without being completed. The wallet holder did not present credentials within the session TTL.	When the background session cleanup job detects and removes an expired session.
`session_failed`	A wallet authentication session failed. This could be due to an invalid verifiable presentation, an unsupported credential type, or a verification error.	When VP verification fails or the session transitions to a FAILED state.
`idv_initiated`	An identity verification (IDV) reconciliation flow has been started. The user has been redirected to the upstream OIDC provider to authenticate.	When `POST .../idv/initiate` is called and the authorization URL is generated.
`idv_completed`	An IDV reconciliation flow has completed successfully. The user authenticated with the upstream provider, and the system received and processed the identity claims.	When the OIDC callback is processed, the token exchange succeeds, and the identity is resolved.
`idv_failed`	An IDV reconciliation flow failed. This could be due to a token exchange failure, missing required claims, or an error in the upstream provider's response.	When the token exchange fails, claim extraction encounters an error, or the provider returns an error.
`binding_created`	A new identity link binding has been created, linking a wallet holder to an institutional identity for the first time.	When the reconciliation process creates a new `identity_link_binding` record.
`binding_updated`	An existing identity link binding has been refreshed. The user completed re-reconciliation, and the binding's attributes or timestamps were updated.	When the reconciliation process updates an existing binding (e.g., after step-up authentication or attribute refresh).
`binding_soft_deleted`	A binding has been marked for deletion (soft-deleted). The record remains in the database for the retention period.	When the inactivity cleanup job or an administrative action soft-deletes a binding.
`binding_hard_deleted`	A binding has been permanently removed from the database. This is irreversible.	When the hard-delete cleanup job removes a soft-deleted binding that has exceeded the retention period.
`gdpr_erasure`	A complete identity deletion has been performed in response to a GDPR Article 17 (right to erasure) request. All match records, bindings, and auxiliary data for the identity have been permanently deleted.	When `DELETE /api/external/v1/reconciliation/{id}` is called and the deletion completes.
`external_api_lookup`	A third-party system has performed an identity lookup through the external API.	When `POST /api/external/v1/reconciliation/lookup` is called.
`auxiliary_data_write`	Auxiliary data has been stored or updated for an identity.	When `PUT .../auxiliary/{category}` creates or updates an auxiliary data record.
`auxiliary_data_delete`	Auxiliary data has been deleted for an identity.	When `DELETE .../auxiliary/{category}` removes an auxiliary data record.
`key_migration_started`	A key rotation migration has begun. The system is re-hashing or re-encrypting records with a new key version.	When the migration job starts processing and inserts a new `key_migration_history` record.
`key_migration_completed`	A key rotation migration has finished. The `detail` field includes the final counts (processed, failed, skipped, purged).	When the migration job completes and updates the `key_migration_history` record to COMPLETED.

Correlation

The correlation_id field is one of the most powerful features of the audit trail. It is a UUID that links related events across a single user flow, enabling an investigator to reconstruct the complete sequence of operations that occurred during a specific interaction.

Example: Complete Wallet Authentication with Reconciliation

A user scans a QR code, presents their wallet credential, and then completes the reconciliation flow by authenticating with SURF. The following sequence of correlated audit events would be generated:

session_created (correlation_id: abc-123) -- The OID4VP session is created and the QR code is displayed.
idv_initiated (correlation_id: abc-123) -- The user clicks "Link with institution account" and is redirected to SURF.
idv_completed (correlation_id: abc-123) -- The user authenticates with SURF, the callback is processed, and identity claims are received.
binding_created (correlation_id: abc-123) -- A new binding is created linking the wallet holder to their institutional identity.
session_completed (correlation_id: abc-123) -- The overall wallet authentication session is marked as completed.

By querying for all events with correlation_id = 'abc-123', an investigator can see the complete flow from start to finish, including the exact timestamps of each step, which provider was used, and whether any errors occurred along the way.

Querying Correlated Events

The named SqlDelight query for retrieving correlated events:

-- Named query: findAuditEventsByCorrelation(tenant_id, correlation_id)
SELECT * FROM audit_event
WHERE tenant_id = :tenant_id
  AND correlation_id = :correlation_id
ORDER BY created_at ASC;

The idx_audit_correlation index on the correlation_id column ensures this query executes efficiently even against a large audit table.

Querying by Event Type

For operational monitoring and compliance reporting, events can be queried by type with pagination support:

-- Named query: findAuditEventsByType(tenant_id, event_type, limit, offset)
SELECT * FROM audit_event
WHERE tenant_id = :tenant_id
  AND event_type = :event_type
ORDER BY created_at DESC
LIMIT :limit OFFSET :offset;

The idx_audit_type index on (tenant_id, event_type) supports efficient filtering by event type within a tenant.

To generate a compliance report of all GDPR erasure events in the last 30 days:

SELECT * FROM audit_event
WHERE tenant_id = 'kw1c'
  AND event_type = 'gdpr_erasure'
  AND created_at >= '2026-02-25T00:00:00Z'
ORDER BY created_at DESC;

This query uses both the idx_audit_type index (for tenant and event type filtering) and the idx_audit_time index (for the time range predicate) to execute efficiently.

Useful Query Patterns

The following table suggests which event types to query for common operational and compliance tasks:

Purpose	Event Type Filter
Authentication activity overview	`session_completed`, `session_failed`
Reconciliation activity	`idv_initiated`, `idv_completed`, `idv_failed`
Data lifecycle monitoring	`binding_created`, `binding_soft_deleted`, `binding_hard_deleted`
GDPR compliance reporting	`gdpr_erasure`
External API usage tracking	`external_api_lookup`, `auxiliary_data_write`
Key management operations	`key_migration_started`, `key_migration_completed`
Security incident investigation	`session_failed`, `idv_failed` (high volume may indicate an attack)

Retention

Audit events are retained indefinitely by default. There is no automatic cleanup or expiration mechanism for audit records. This is an intentional design decision based on several considerations:

Regulatory compliance: Different jurisdictions and regulatory frameworks may require different retention periods. Rather than imposing a specific period, the system retains all events and delegates the retention policy decision to the deploying organization.
GDPR accountability: Article 5(2) of the GDPR requires controllers to demonstrate compliance with data processing principles. The audit trail provides evidence of lawful processing, and this evidence may need to be produced years after the events occurred.
Incident investigation: Security incidents may not be discovered until months or years after they occur. A complete audit trail is essential for forensic analysis and determining the scope and impact of a breach.

Organizations should define a retention policy based on their specific regulatory requirements and configure periodic archival or deletion accordingly. A common approach is to archive events older than a specified period (e.g., 5 years) to cold storage, and delete events older than the maximum regulatory requirement (e.g., 10 years).

Note that because audit events contain no plaintext PII (only HMAC-hashed subject identifiers), the GDPR storage limitation principle (Article 5(1)(e)) is less directly applicable. The hashed identifiers cannot be used to identify individuals without access to the HMAC keys, so the privacy impact of long-term retention is minimal.

Immutability

The audit trail is append-only by design. In the application layer, only INSERT operations are ever performed on the audit_event table. The SqlDelight schema defines no UPDATE or DELETE queries for audit events. This application-level guarantee means that under normal operation, once an audit event is written, it cannot be modified or removed.

However, application-level guarantees are insufficient to protect against scenarios where an attacker (or a malicious insider) gains direct database access and executes raw SQL. For production deployments, additional immutability guarantees are strongly recommended.

Database-Level Protection

A PostgreSQL trigger can enforce immutability at the database engine level, preventing UPDATE and DELETE operations regardless of how they are issued:

CREATE OR REPLACE FUNCTION prevent_audit_modification()
RETURNS TRIGGER AS $$
BEGIN
    RAISE EXCEPTION 'audit_event table is append-only: % not allowed', TG_OP;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER audit_immutability
    BEFORE UPDATE OR DELETE ON audit_event
    FOR EACH ROW
    EXECUTE FUNCTION prevent_audit_modification();

This trigger causes any UPDATE or DELETE operation on the audit_event table to fail with an exception, regardless of whether the operation originates from the application or from a direct SQL session. Even a database superuser cannot silently modify audit records without first dropping the trigger, which would itself be logged in the PostgreSQL activity log.

External Immutable Storage

For the strongest audit guarantees, forward audit events to an external immutable store in addition to the database. This provides defense-in-depth against scenarios where an attacker gains database superuser access and could bypass even trigger-based protections.

Recommended external immutable storage options:

Azure Append Blobs: Azure Blob Storage in append-only mode prevents any modification of written data. Each audit event is appended as a new block, and the blob cannot be modified or deleted while the immutability policy is in effect.
AWS S3 Object Lock: S3 Object Lock in WORM (Write Once Read Many) compliance mode prevents object deletion or modification for a specified retention period. This provides a regulatory-grade immutability guarantee.
Dedicated SIEM: Security Information and Event Management systems (such as Splunk, Azure Sentinel, or Elastic Security) with immutable log storage provide both immutability and sophisticated querying, alerting, and correlation capabilities.

The external store serves as an independent verification source. If there is any suspicion that the database audit trail has been tampered with, the external store provides an authoritative reference.

Audit Event Example

The following is a complete example of a GDPR erasure audit event as it would appear in the database:

{
  "id": "evt-550e8400-e29b-41d4-a716-446655440000",
  "tenant_id": "kw1c",
  "event_type": "gdpr_erasure",
  "correlation_id": "req-661e9500-f39c-52e5-b827-557766551111",
  "subject_hash": "zQmXk7f9a2bC3dE4fG5hI6jK7lM8nO9pQ0rS1tU2vW3xY4z",
  "client_id": "student-info-system",
  "detail": "{\"matches_deleted\":2,\"bindings_deleted\":1,\"auxiliary_deleted\":3,\"requested_by\":\"privacy-officer\"}",
  "created_at": "2026-03-27T14:30:00Z"
}

Key observations about this event:

The subject_hash is an HMAC hash, not a plaintext identifier. An investigator can search for other events with the same hash to find all operations related to this subject, but cannot determine the subject's identity from the hash alone.
The detail JSON contains counts (matches_deleted, bindings_deleted, auxiliary_deleted) and operational context (requested_by) but no PII. The requested_by field identifies the role that authorized the request, not the individual.
The client_id identifies which system initiated the request (student-info-system), enabling tracking of which systems are exercising the erasure API.
The correlation_id links this event to any other events that were generated as part of the same API request, enabling full reconstruction of the erasure operation.

Database Schema

The audit event table and its indexes are defined as follows:

CREATE TABLE IF NOT EXISTS audit_event (
    id              TEXT NOT NULL PRIMARY KEY,
    tenant_id       TEXT NOT NULL,
    event_type      TEXT NOT NULL,
    correlation_id  TEXT,
    subject_hash    TEXT,
    client_id       TEXT,
    detail          TEXT,
    created_at      TEXT NOT NULL
);

CREATE INDEX IF NOT EXISTS idx_audit_type
    ON audit_event(tenant_id, event_type);
CREATE INDEX IF NOT EXISTS idx_audit_time
    ON audit_event(created_at);
CREATE INDEX IF NOT EXISTS idx_audit_correlation
    ON audit_event(correlation_id);

Three indexes optimize the most common query patterns:

idx_audit_type on (tenant_id, event_type): Supports filtering by event type within a tenant. This is the most common query pattern for operational dashboards and compliance reports (e.g., "show all GDPR erasure events for tenant X in the last 30 days").
idx_audit_time on (created_at): Supports time-range queries across all tenants. Used for operational monitoring (e.g., "show all events in the last hour") and historical investigation (e.g., "show all events between March 1 and March 15").
idx_audit_correlation on (correlation_id): Supports reconstruction of complete user flows from a single correlation ID. This is the primary tool for incident investigation and debugging.

As the audit table grows over time (it is never cleaned up by default), these indexes will also grow. For very large deployments, consider partitioning the audit_event table by created_at (e.g., monthly partitions) to manage index size and enable efficient archival of historical partitions.

Audit Event Structure​

Privacy in the Audit Trail​

Event Types​

Correlation​

Example: Complete Wallet Authentication with Reconciliation​

Querying Correlated Events​

Querying by Event Type​

Example: GDPR Erasure Report​

Useful Query Patterns​

Retention​

Immutability​

Database-Level Protection​

External Immutable Storage​

Audit Event Example​

Database Schema​