Audit Logging
The EDK's audit system records a structured trail of every significant action in the application: who did what, on which resource, when, whether it succeeded, and, if it was denied, why. Unlike general-purpose application logging, audit logs are designed for compliance, security review, and forensic analysis.
Audit logging operates independently from the event system. Events are for real-time pub/sub within the application (triggering workflows, updating caches, forwarding to external systems). Audit logs are append-only records optimized for after-the-fact investigation, such as "show me everything user X did in the last 30 days" or "which commands were denied by the policy engine yesterday."
Key Design Principles
Never fail the command. Audit logging is critical but not blocking. If the audit store is unavailable, the command still executes. Sink failures are caught and logged to the application logger, never propagated to the caller. A failing audit system should degrade gracefully, not take down production.
Redact before persisting. Sensitive data (passwords, tokens, API keys, JWTs) is automatically stripped from audit metadata before it reaches any sink. The AuditRedactionPolicy catches common key patterns (password, token, secret, credential, api_key, jwt, authorization, bearer, private_key) and replaces their values with [REDACTED]. Error messages are similarly sanitized to remove embedded credentials.
Enrich from session context. Audit events are enriched with tenant ID, actor ID, and correlation ID from the current session context. Code that records audit events doesn't need to pass this information manually; the enrichment layer fills it in automatically from the request's session.
Append-only storage. The ImmutableAuditStore interface enforces append-only semantics: events can be written and queried, but never updated or deleted. In the PostgreSQL implementation, database triggers guard against modification, providing an additional layer of protection against tampering.
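As a rough illustration of key-based redaction, the sketch below replaces the value of any metadata entry whose key contains one of the patterns listed above. It is not the actual AuditRedactionPolicy: the function names, the case-insensitive substring match, and the recursion into nested maps are all assumptions made for this example.

```kotlin
// Hypothetical sketch: the real AuditRedactionPolicy may match and traverse differently.
private val SENSITIVE_KEY_PARTS = listOf(
    "password", "token", "secret", "credential", "api_key",
    "jwt", "authorization", "bearer", "private_key",
)

const val REDACTED = "[REDACTED]"

/** True when the key contains any sensitive pattern, ignoring case. */
fun isSensitiveKey(key: String): Boolean {
    val lowered = key.lowercase()
    return SENSITIVE_KEY_PARTS.any { it in lowered }
}

/** Returns a copy of [metadata] with sensitive values replaced, recursing into nested maps. */
fun redactMetadata(metadata: Map<String, Any?>): Map<String, Any?> =
    metadata.mapValues { (key, value) ->
        when {
            isSensitiveKey(key) -> REDACTED
            value is Map<*, *> ->
                redactMetadata(value.entries.associate { it.key.toString() to it.value })
            else -> value
        }
    }
```

A plain substring match like this one would miss a key such as apiKey, which has no underscore, so a production policy presumably normalizes keys or uses broader patterns.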
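One way to picture append-only semantics at the interface level is a store type that simply has no update or delete operation. The sketch below is an in-memory stand-in, not the real ImmutableAuditStore: StoredEvent, AppendOnlyStore, and the method signatures are hypothetical, and the real interfaces are suspend-based.

```kotlin
// Hypothetical shapes; the real ImmutableAuditStore and AuditEvent carry more fields.
data class StoredEvent(val sequence: Long, val commandId: String, val result: String)

/** Append and query only: the interface offers no way to update or delete. */
interface AppendOnlyStore {
    fun append(commandId: String, result: String): StoredEvent
    fun query(predicate: (StoredEvent) -> Boolean): List<StoredEvent>
}

class InMemoryAppendOnlyStore : AppendOnlyStore {
    private val events = mutableListOf<StoredEvent>()

    @Synchronized
    override fun append(commandId: String, result: String): StoredEvent {
        // Sequence numbers are assigned by the store, starting at 1.
        val event = StoredEvent(sequence = events.size + 1L, commandId = commandId, result = result)
        events.add(event)
        return event
    }

    // filter returns a new list, so callers cannot mutate the stored history.
    @Synchronized
    override fun query(predicate: (StoredEvent) -> Boolean): List<StoredEvent> =
        events.filter(predicate)
}
```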
What Gets Recorded
Every audit event captures:
| Field | Description |
|---|---|
| commandId | Which command was executed (e.g., kms.keys.generate, party.manager.create) |
| result | Outcome: STARTED, SUCCEEDED, FAILED, POLICY_DENIED, AUTH_FAILURE |
| actorId | Who performed the action (enriched from session principal) |
| tenantId | Which tenant context (enriched from session) |
| resourceId | What was acted on |
| durationMs | How long the command took |
| correlationId | Links related audit events across a request lifecycle |
| traceId / spanId | W3C Trace Context for distributed tracing correlation |
| metadata | Additional key-value pairs (redacted before persistence) |
| errorCode / errorMessage | Failure details (sanitized to remove credentials) |
| transportType / transportScope | Whether the command was local, HTTP, or gRPC |
The result field distinguishes between different failure modes: FAILED means the command threw an error, POLICY_DENIED means the authorization layer blocked it, and AUTH_FAILURE means the caller couldn't be authenticated. This distinction matters for security analysis: a spike in POLICY_DENIED events is a different signal from a spike in FAILED events.
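To show why separate result values are useful, the following sketch counts events per result and flags the two values that point at access control rather than at a fault in the command. The AuditResult enum and both helper functions are hypothetical names that mirror the documented values; the library's own type may look different.

```kotlin
// Hypothetical enum mirroring the documented result values.
enum class AuditResult { STARTED, SUCCEEDED, FAILED, POLICY_DENIED, AUTH_FAILURE }

/** Counts events per result so denial and failure rates can be tracked as separate signals. */
fun countByResult(results: List<AuditResult>): Map<AuditResult, Int> =
    results.groupingBy { it }.eachCount()

/** Results that indicate an access-control rejection rather than a command error. */
fun isSecurityRejection(result: AuditResult): Boolean =
    result == AuditResult.POLICY_DENIED || result == AuditResult.AUTH_FAILURE
```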
Output Formats
Audit events can be formatted in three formats, depending on what your SIEM or log aggregation system expects:
JSON: structured JSON objects. The default format, suitable for Elasticsearch, Splunk, Datadog, and most modern log platforms. Every field is a named key; metadata is a nested object.
CEF (ArcSight Common Event Format): the standard format for ArcSight, QRadar, and legacy SIEM systems. Events are formatted as pipe-delimited strings with severity mapping: STARTED=1, SUCCEEDED=3, FAILED=7, POLICY_DENIED=8, AUTH_FAILURE=9. Custom field labels carry tenant, trace, and correlation IDs.
OCSF (Open Cybersecurity Schema Framework): the emerging standard for security telemetry. Events are mapped to the API Activity class (class_uid=6003) with proper OCSF severity levels and status codes. Includes full product/vendor metadata for automated classification.
All three formatters implement the AuditFormatter interface, so you can register multiple formatters simultaneously. For example, you can send JSON to your log aggregator and CEF to your SIEM.
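As an illustration of the CEF output, the sketch below builds one CEF line using the severity mapping given above and the general CEF header layout (version, vendor, product, device version, signature ID, name, severity, extension). It is a standalone function rather than an AuditFormatter implementation, because that interface's signature is not shown here. CefEvent, the vendor and product strings, and the choice of cs1/cs2 for tenant and correlation IDs are assumptions.

```kotlin
// Hypothetical event shape and vendor/product strings; only the severity mapping comes from the text.
data class CefEvent(
    val commandId: String,
    val result: String,
    val tenantId: String,
    val correlationId: String,
)

private val SEVERITY = mapOf(
    "STARTED" to 1, "SUCCEEDED" to 3, "FAILED" to 7, "POLICY_DENIED" to 8, "AUTH_FAILURE" to 9,
)

// CEF header fields escape backslash and pipe; extension values escape backslash and equals.
private fun escapeHeader(s: String) = s.replace("\\", "\\\\").replace("|", "\\|")
private fun escapeExtension(s: String) = s.replace("\\", "\\\\").replace("=", "\\=")

fun formatCef(event: CefEvent): String {
    val severity = SEVERITY[event.result] ?: error("Unknown result: ${event.result}")
    val header = listOf(
        "CEF:0", "ExampleVendor", "ExampleProduct", "1.0",
        event.commandId, event.result, severity.toString(),
    ).joinToString("|") { escapeHeader(it) }
    // Custom string fields carry the IDs, each paired with a label.
    val extension = listOf(
        "cs1Label=tenantId", "cs1=" + escapeExtension(event.tenantId),
        "cs2Label=correlationId", "cs2=" + escapeExtension(event.correlationId),
    ).joinToString(" ")
    return "$header|$extension"
}
```

Escaping matters because command IDs and tenant IDs come from application data; an unescaped pipe or equals sign would shift field boundaries in the SIEM's parser.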
Tamper Evidence
For regulated environments that require proof that audit logs haven't been altered, the audit system supports two complementary mechanisms:
Hash chaining links each audit event to the previous one with a cryptographic hash. Every event is hashed, and its hash is combined with the previous event's hash. If any event in the chain is modified, the chain breaks and verification fails. The chain starts from a genesis hash and can be verified for any range of events.
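A minimal sketch of the hash-chaining idea follows. It uses SHA-256 from the JDK and makes several assumptions the text does not specify: a genesis hash of 32 zero bytes, events serialized as UTF-8 strings, and a combination rule of SHA-256(previous link hash followed by the event hash). The function names are hypothetical.

```kotlin
import java.security.MessageDigest

// Assumed genesis value: 32 zero bytes. The real genesis hash is not specified here.
val GENESIS_HASH = ByteArray(32)

private fun sha256(vararg parts: ByteArray): ByteArray {
    val digest = MessageDigest.getInstance("SHA-256")
    parts.forEach { digest.update(it) }
    return digest.digest()
}

/** Link hash = SHA-256(previous link hash || SHA-256(event payload)). */
fun linkHash(previous: ByteArray, payload: String): ByteArray =
    sha256(previous, sha256(payload.toByteArray(Charsets.UTF_8)))

/** Builds the list of link hashes for the given serialized events, in order. */
fun buildChain(payloads: List<String>): List<ByteArray> {
    val chain = ArrayList<ByteArray>(payloads.size)
    var previous = GENESIS_HASH
    for (payload in payloads) {
        previous = linkHash(previous, payload)
        chain.add(previous)
    }
    return chain
}

/** Returns the index of the first event whose stored hash does not match, or null if intact. */
fun firstBrokenLink(payloads: List<String>, storedHashes: List<ByteArray>): Int? {
    require(payloads.size == storedHashes.size) { "payloads and hashes must align" }
    var previous = GENESIS_HASH
    for (i in payloads.indices) {
        val expected = linkHash(previous, payloads[i])
        if (!MessageDigest.isEqual(expected, storedHashes[i])) return i
        previous = expected
    }
    return null
}
```

To verify a sub-range instead of the whole chain, the same loop can start from the stored link hash immediately before the range rather than from the genesis hash.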
Signed checkpoints periodically capture a signed digest of the chain. At configurable intervals (default every 1,000 events), a checkpoint is created containing the chain digest, sequence range, timestamp, and a KMS signature. Checkpoints can be verified independently: you can prove that all events between checkpoint A and checkpoint B are unmodified without recomputing the entire chain.
```kotlin
data class TamperEvidenceConfig(
    val hashChainEnabled: Boolean = false,
    val checkpointSigningEnabled: Boolean = false,
    val hashAlgorithm: String = "SHA-256",
    val checkpointIntervalEvents: Int = 1000,
    val signingKeyRef: String = ""
)
```
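The sketch below shows how such a configuration might be enabled and sanity-checked. The data class is re-declared only so the example is self-contained. The validate and isCheckpointDue helpers are hypothetical, as are the rules they encode (signing requires the hash chain and a key reference; sequence numbers are 1-based and a checkpoint falls on every full interval); the library's actual behavior may differ.

```kotlin
// Re-declared from the definition above so the sketch compiles on its own.
data class TamperEvidenceConfig(
    val hashChainEnabled: Boolean = false,
    val checkpointSigningEnabled: Boolean = false,
    val hashAlgorithm: String = "SHA-256",
    val checkpointIntervalEvents: Int = 1000,
    val signingKeyRef: String = "",
)

/** Hypothetical consistency check; returns a list of problems, empty when the config looks usable. */
fun validate(config: TamperEvidenceConfig): List<String> = buildList {
    if (config.checkpointSigningEnabled && !config.hashChainEnabled) {
        add("checkpoint signing needs the hash chain")
    }
    if (config.checkpointSigningEnabled && config.signingKeyRef.isBlank()) {
        add("signingKeyRef must be set")
    }
    if (config.checkpointIntervalEvents <= 0) {
        add("checkpointIntervalEvents must be positive")
    }
}

/** Assumes 1-based sequence numbers and a checkpoint after every full interval. */
fun isCheckpointDue(sequence: Long, config: TamperEvidenceConfig): Boolean =
    config.checkpointSigningEnabled &&
        sequence > 0 &&
        sequence % config.checkpointIntervalEvents == 0L
```

With the default interval, this sketch would place checkpoints at sequence numbers 1000, 2000, 3000, and so on.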
The AuditChainVerifier can verify any range of the chain and any individual checkpoint:
```kotlin
interface AuditChainVerifier {
    suspend fun verifyChain(tenantId: String, fromSequence: Long, toSequence: Long?): IdkResult<Long, IdkError>
    suspend fun verifyCheckpoint(tenantId: String, checkpointId: String): IdkResult<Unit, IdkError>
}
```
Processing Pipeline
When an audit event is recorded, it passes through a three-stage pipeline: enrichment, redaction, and persistence.
Each stage is isolated: enrichment failures don't prevent redaction, and redaction failures don't prevent persistence. The entire pipeline is wrapped in a catch-all exception handler so that audit infrastructure failures never propagate to the caller.
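The stage isolation described above can be sketched roughly as follows. All types here are hypothetical stand-ins (the real enricher, redaction policy, and AuditEventSink have richer, suspend-based APIs). One design choice is an assumption of this sketch alone: if redaction fails, the metadata is dropped rather than persisted unredacted, since the text does not say what the real pipeline stores in that case.

```kotlin
// Hypothetical event shape for illustration only.
data class PipelineEvent(
    val commandId: String,
    val tenantId: String? = null,
    val actorId: String? = null,
    val metadata: Map<String, String> = emptyMap(),
)

class AuditPipeline(
    private val enrich: (PipelineEvent) -> PipelineEvent,
    private val redact: (PipelineEvent) -> PipelineEvent,
    private val sinks: List<(PipelineEvent) -> Unit>,
    private val logFailure: (stage: String, cause: Exception) -> Unit,
) {
    /** Never throws: stage failures are reported to logFailure and the pipeline carries on. */
    fun record(event: PipelineEvent) {
        try {
            // Enrichment failure: continue with the event as recorded.
            val enriched = stage("enrich", fallback = event) { enrich(event) }
            // Redaction failure: continue, but without metadata (assumption of this sketch).
            val redacted = stage("redact", fallback = enriched.copy(metadata = emptyMap())) {
                redact(enriched)
            }
            // One failing sink does not stop the others.
            for (sink in sinks) stage("persist", fallback = Unit) { sink(redacted) }
        } catch (e: Exception) {
            // Catch-all, e.g. when the failure logger itself throws; nothing reaches the caller.
        }
    }

    private fun <T> stage(name: String, fallback: T, block: () -> T): T =
        try {
            block()
        } catch (e: Exception) {
            logFailure(name, e)
            fallback
        }
}
```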
Querying Audit Logs
The AuditQueryService provides structured queries over persisted audit events:
```kotlin
interface AuditQueryService {
    suspend fun query(filter: AuditQueryFilter): IdkResult<List<AuditEvent>, IdkError>
    suspend fun getByCorrelationId(tenantId: String, correlationId: String): IdkResult<List<AuditEvent>, IdkError>
    suspend fun getByTraceId(tenantId: String, traceId: String): IdkResult<List<AuditEvent>, IdkError>
}
```
Queries support filtering by tenant, command, module, actor, trace ID, result type, and time range, with pagination via limit/offset. The getByCorrelationId query is particularly useful for reconstructing the full lifecycle of a request across multiple commands.
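To make the filter semantics concrete, here is an in-memory sketch of filtering with limit/offset pagination. QueryEvent, QueryFilter, and runQuery are hypothetical; the real AuditQueryFilter may name or type its fields differently, and the real service runs against the persisted store rather than a list. The sketch covers only a subset of the documented criteria.

```kotlin
import java.time.Instant

// Hypothetical event and filter shapes, reduced to a few of the documented criteria.
data class QueryEvent(
    val tenantId: String,
    val commandId: String,
    val actorId: String,
    val result: String,
    val timestamp: Instant,
)

data class QueryFilter(
    val tenantId: String,
    val commandPrefix: String? = null,
    val actorId: String? = null,
    val result: String? = null,
    val from: Instant? = null,   // inclusive lower bound
    val until: Instant? = null,  // exclusive upper bound
    val limit: Int = 100,
    val offset: Int = 0,
)

/** A null criterion means "do not filter on this field". */
private fun matches(e: QueryEvent, f: QueryFilter): Boolean =
    e.tenantId == f.tenantId &&
        (f.commandPrefix == null || e.commandId.startsWith(f.commandPrefix)) &&
        (f.actorId == null || e.actorId == f.actorId) &&
        (f.result == null || e.result == f.result) &&
        (f.from == null || !e.timestamp.isBefore(f.from)) &&
        (f.until == null || e.timestamp.isBefore(f.until))

/** Filters, orders by timestamp, then pages with offset and limit. */
fun runQuery(events: List<QueryEvent>, filter: QueryFilter): List<QueryEvent> =
    events.filter { matches(it, filter) }
        .sortedBy { it.timestamp }
        .drop(filter.offset)
        .take(filter.limit)
```

Under these assumptions, a question like "everything user X did in the last 30 days" becomes a filter with actorId set and from set to the current time minus 30 days.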
Modules
| Module | Description |
|---|---|
| lib-audit-public | AuditEvent, AuditLogService, AuditEventSink, formatters, redaction policy, store and query interfaces |
| lib-audit-impl | DefaultAuditLogService, session enricher, JSON/CEF/OCSF formatters, composite sink, tamper evidence |