Tenant Isolation
Multi-tenant deployments stand or fall on isolation. The EDK enforces tenant isolation through four overlapping mechanisms: row-level isolation in the shared database, per-tenant database routing for deployments that need stricter separation, per-tenant signing keys on the KMS, and authorisation scope on every admin and protocol command.
None of these is the single source of isolation; each one closes a different class of failure.
Row-Level Isolation
The default operating mode is one shared PostgreSQL instance with row-level tenant isolation. Every business table carries a tenant_id column; every repository query filters on it; every unique constraint that needs to be tenant-scoped (UNIQUE (tenant_id, ...)) carries the tenant id as part of the key.
The mechanism is enforced at three layers:
- Schema. The tenant_id column is NOT NULL on every business table. Inserts without a tenant id fail at the database level.
- Repository. Every repository operation that returns data accepts the tenant id as a parameter (for AppScope repositories) or reads it from the resolved tenant context (for SessionScope repositories). A query that forgets to filter is a compile error in the typed SQLDelight bindings.
- Service command. Every SessionScope service command runs inside a tenant context; it never accepts a tenant id as a method argument. The SessionExecution provides the resolved tenant, and the command queries through the per-tenant repository binding.
The result is that a tenant cannot, by construction, see another tenant's rows. A cross-tenant access attempt either fails the unique-constraint check (the tenant id does not match), returns an empty result set (the filter is correct), or raises an IllegalAccessError (the resolved tenant does not match the asserted tenant on the call).
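The schema and repository layers can be illustrated with a small, self-contained sketch. It is not EDK code: it uses Python's built-in sqlite3 in place of PostgreSQL and SQLDelight, and the table name credential_offer, its columns other than tenant_id, and the helper functions are hypothetical names chosen for the example.

```python
import sqlite3

# Hypothetical business table. Only the tenant_id column, its NOT NULL
# constraint, and the UNIQUE (tenant_id, ...) shape come from the text.
SCHEMA = """
CREATE TABLE credential_offer (
    tenant_id TEXT NOT NULL,
    offer_ref TEXT NOT NULL,
    payload   TEXT NOT NULL,
    UNIQUE (tenant_id, offer_ref)
)
"""


def open_store() -> sqlite3.Connection:
    """Create an in-memory database with the tenant-scoped table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    return conn


def insert_offer(conn, tenant_id, offer_ref, payload):
    """Schema layer: an insert without a tenant id violates NOT NULL."""
    conn.execute(
        "INSERT INTO credential_offer (tenant_id, offer_ref, payload) "
        "VALUES (?, ?, ?)",
        (tenant_id, offer_ref, payload),
    )


def find_offer(conn, tenant_id, offer_ref):
    """Repository layer: every read filters on the tenant id."""
    row = conn.execute(
        "SELECT payload FROM credential_offer "
        "WHERE tenant_id = ? AND offer_ref = ?",
        (tenant_id, offer_ref),
    ).fetchone()
    return row[0] if row else None
```

Because the unique key includes tenant_id, two tenants can use the same offer_ref without colliding, and a lookup made under the wrong tenant returns nothing rather than the other tenant's row.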
Per-Tenant Database Routing
For deployments that require stricter separation than row-level, the lib-data-store-db-routing-config and lib-data-store-db-routing-pooling modules read a per-tenant database routing table from configuration and route a request bound to a specific tenant to a specific JDBC URL.
The configuration shape:
    database:
      routing:
        default:
          url: jdbc:postgresql://postgres-shared:5432/edk
          username: edk_app
          password: ${secret:vault:edk/postgres/shared/password}
        tenants:
          acme:
            url: jdbc:postgresql://postgres-acme:5432/edk
            username: edk_app_acme
            password: ${secret:vault:edk/postgres/acme/password}
          regulated-customer:
            url: jdbc:postgresql://postgres-regulated:5432/edk
            username: edk_app_regulated
            password: ${secret:vault:edk/postgres/regulated/password}
Per-target connection pooling is via HikariCP. The container code is unchanged; the same repository operations route to whichever target the resolved tenant points at.
Three common patterns:
- Single shared database, row-level isolation. Default. Most deployments. Cheaper, simpler operations, easier backup.
- Per-tenant database, shared Postgres instance. One Postgres instance hosts multiple databases, one per tenant. Provides database-level isolation (a tenant's data is in their database, not in a row of a shared table). Same Postgres ops story; slightly more per-tenant provisioning work.
- Per-tenant Postgres instance. One Postgres per tenant. Maximum isolation. Used for regulated or contractual cases where a tenant's data physically must not share a database server with another tenant's data.
The choice is per-deployment; routing entries can mix and match, so a deployment can default-share most tenants and isolate the regulated ones.
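The mix-and-match behaviour amounts to a lookup with a fallback. The sketch below mirrors the configuration shape shown earlier; the Target class and resolve_target function are hypothetical names, and the fallback to the default entry for tenants without a routing entry is an assumption about how the routing modules behave, not a description of their code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    url: str
    username: str


# Mirrors the routing configuration. Passwords are left out because the
# configuration holds secret references, not values.
ROUTING = {
    "default": Target("jdbc:postgresql://postgres-shared:5432/edk", "edk_app"),
    "tenants": {
        "acme": Target(
            "jdbc:postgresql://postgres-acme:5432/edk", "edk_app_acme"
        ),
        "regulated-customer": Target(
            "jdbc:postgresql://postgres-regulated:5432/edk", "edk_app_regulated"
        ),
    },
}


def resolve_target(routing: dict, tenant_id: str) -> Target:
    """Return the tenant's dedicated target, else the shared default (assumed)."""
    return routing["tenants"].get(tenant_id, routing["default"])
```

With this table, acme and regulated-customer resolve to their own hosts, while any other tenant lands on postgres-shared and relies on row-level isolation.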
The routing table is itself a tenant config source contribution: adding a new tenant routing entry through the admin REST takes effect on the next resolver cache miss and through the cross-replica invalidation channel, without restarting the container.
The TenantProvisioner
TenantProvisioner is the SPI that provisions the per-tenant storage container at registration time. For shared-database row-level isolation, the provisioner is a no-op: the row gets the new tenant_id, no schema work is needed. For per-tenant-database or per-tenant-instance isolation, the provisioner runs the JDBC maintenance:
- Create database. CREATE DATABASE tenant_acme on the target Postgres host.
- Run schema migrations. Each EDK module that ships SQLDelight schemas runs its initial migration against the new database.
- Create application user. Provision the edk_app_acme role with the right grants on the new database.
- Seed deployment-wide defaults. Any seed data that should exist before the first tenant request lands.
JdbcMaintenanceJdbcExecutor is the JVM implementation that runs these via raw JDBC against a maintenance connection that holds CREATE DATABASE rights. NoopMaintenanceJdbcExecutor is the binding for shared-database deployments; the provisioner does nothing.
The TenantIsolationStrategySelector chooses which provisioner runs for a given registration. The default selects per the deployment-wide configuration; a deployment can replace the selector to make the isolation strategy per-tenant (acme gets shared, regulated-customer gets per-database).
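The selection and provisioning steps can be sketched as a function that returns an ordered plan instead of executing anything. All names here (Isolation, select_isolation, maintenance_plan) are hypothetical and do not reflect the actual SPI signatures; the GRANT statement is a placeholder, since the text does not spell out the real grant set.

```python
import re
from enum import Enum
from typing import Dict, List, Tuple


class Isolation(Enum):
    SHARED = "shared"              # row-level isolation in the shared database
    PER_DATABASE = "per-database"  # dedicated database for the tenant


def select_isolation(tenant: str, default: Isolation,
                     overrides: Dict[str, Isolation]) -> Isolation:
    """Hypothetical per-tenant selector: explicit override, else the default."""
    return overrides.get(tenant, default)


_PLAIN_IDENTIFIER = re.compile(r"[a-z][a-z0-9_]*")


def maintenance_plan(tenant: str, isolation: Isolation) -> List[Tuple[str, str]]:
    """Return the ordered (step, action) pairs a provisioner would run."""
    if isolation is Isolation.SHARED:
        return []  # no-op: rows simply carry the new tenant_id
    if not _PLAIN_IDENTIFIER.fullmatch(tenant):
        # The sketch refuses names it cannot splice into SQL safely.
        raise ValueError(f"unsupported tenant name for this sketch: {tenant!r}")
    database, role = f"tenant_{tenant}", f"edk_app_{tenant}"
    return [
        ("create-database", f"CREATE DATABASE {database}"),
        ("run-migrations", f"apply each module's initial migration to {database}"),
        ("create-user", f"CREATE ROLE {role} LOGIN"),
        # Placeholder grant; the real grant set is deployment-specific.
        ("grant", f"GRANT CONNECT ON DATABASE {database} TO {role}"),
        ("seed-defaults", f"insert deployment-wide seed data into {database}"),
    ]
```

Returning a plan keeps the ordering visible: the database must exist before migrations run, and the application role is granted access only after the schema is in place.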
Per-Tenant Signing Keys
Every signing duty on the data planes maps to a per-tenant KMS alias of the form (tenant, service, purpose):
- Issuer credential signing: (acme, issuer, credential-signing).
- AS access token signing: (acme, as, access-token-signing).
- AS id token signing: (acme, as, id-token-signing).
- Verifier request-object signing: (acme, verifier, request-object-signing).
- Audit checkpoint signing: (acme, audit, audit-checkpoint).
- Webhook HMAC: (acme, <service>, webhook-signing).
- DID update keys: (acme, did, did-update).
The KMS resolves each alias to a specific key inside the configured provider backend. Two tenants with the same alias structure ((acme, issuer, credential-signing) and (beta, issuer, credential-signing)) resolve to two different keys; the KMS multi-tenant isolation is enforced by MultiTenantKmsIsolationTest in CI.
Rotation is per-alias and per-tenant. Rotating acme's credential signing key does not touch beta's.
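A toy model makes the alias contract concrete. The KeyAlias tuple follows the (tenant, service, purpose) form from the text; the InMemoryKeyDirectory class and its methods are hypothetical stand-ins for a provider backend and hand out opaque key ids only, never key material.

```python
import itertools
from typing import Dict, NamedTuple


class KeyAlias(NamedTuple):
    tenant: str
    service: str
    purpose: str


class InMemoryKeyDirectory:
    """Hypothetical stand-in for a KMS provider: alias -> current key id."""

    def __init__(self) -> None:
        self._ids = itertools.count(1)
        self._current: Dict[KeyAlias, str] = {}

    def provision(self, alias: KeyAlias) -> str:
        """Create a fresh key id for the alias and make it current."""
        key_id = f"key-{next(self._ids)}"
        self._current[alias] = key_id
        return key_id

    def resolve(self, alias: KeyAlias) -> str:
        """Return the current key id; raises KeyError for unknown aliases."""
        return self._current[alias]

    def rotate(self, alias: KeyAlias) -> str:
        """Rotation replaces the key behind one alias only."""
        return self.provision(alias)
```

Provisioning (acme, issuer, credential-signing) and (beta, issuer, credential-signing) yields two different key ids, and rotating the acme alias leaves the id behind the beta alias unchanged.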
The data plane never holds the raw key. The alias is the contract; the actual key material lives in the configured provider backend (software keystore, HSM, AWS KMS, Azure Key Vault, Digidentity CSC). Even if a data-plane container is compromised, the attacker cannot extract the key material; they gain only the ability to request signatures for the duration of the compromise.
See the KMS container page for the full alias contract and provider model.
Encryption At Rest
Several EDK subsystems persist intermediate state that carries sensitive material (the assembled attribute bag in an OID4VCI session, the principal context in an OAuth session, the integration secrets reference). These payloads are encrypted at rest with a per-tenant key derived from the deployment's master KEK.
The SessionEncryptionService interface supports three modes:
- PlaintextMode. For development only. Sessions are stored as plain JSON.
- PlatformEncryptedMode. Production default. Sessions are sealed with AEAD under a tenant-specific KEK derived from the deployment master via HKDF with tenant_id as salt. A compromise of one tenant's KEK does not yield another tenant's plaintext.
- ClientBoundMode. Additionally binds the encryption to the session's correlationId. Sessions are unreadable except when the matching correlation id is presented (typically by the same wallet that initiated the session).
The mode is per-subsystem, configurable per tenant through tenant_config_property. For the issuance pipeline session: oid4vci.session.encryption_mode. For the OAuth session: oauth2.session.encryption_mode. The default is PlatformEncryptedMode.
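The per-tenant derivation can be sketched with HKDF as defined in RFC 5869, using only the Python standard library. The hash function (SHA-256), the 32-byte output length, and the use of the subsystem name as the HKDF info label are assumptions made for the example; the text specifies only that tenant_id is the salt. The AEAD sealing step is left out because it needs a third-party library.

```python
import hashlib
import hmac


def hkdf_sha256(ikm: bytes, salt: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF (RFC 5869) with HMAC-SHA-256: extract, then expand."""
    if length > 255 * hashlib.sha256().digest_size:
        raise ValueError("requested output is too long for HKDF-SHA-256")
    # Extract: concentrate the input keying material into a pseudorandom key.
    prk = hmac.new(salt, ikm, hashlib.sha256).digest()
    # Expand: chain HMAC blocks until enough output has been produced.
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def tenant_kek(master_kek: bytes, tenant_id: str, subsystem: str) -> bytes:
    """Derive a tenant-specific KEK; the info label is a hypothetical choice."""
    return hkdf_sha256(master_kek, salt=tenant_id.encode(), info=subsystem.encode())
```

Calling tenant_kek with the same master key but different tenant ids produces unrelated keys, which is the property that keeps one tenant's KEK from decrypting another tenant's sessions.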
Authorization Scope
Even with row-level and database-level isolation, the admin REST has to enforce that a tenant administrator can only act on their own tenant. The pattern is consistent:
- The JWT is bound to a tenant (tenant_id claim).
- The session is descended into that tenant's scope.
- Repository queries filter on the resolved tenant.
- The admin endpoint command compares the resolved tenant against the target tenant identified in the URL ({tenantId} in the path).
- A mismatch is rejected with a 403.
For platform admins acting on a specific tenant's resources, the pattern is "impersonation token", not "magic URL". The platform admin's JWT carries a tenant_id claim equal to the application tenant; to act on acme, the platform admin obtains a short-lived impersonation JWT bound to acme from the application admin REST, and uses that JWT for the subsequent calls. The same authorization scope check applies; the platform admin is just allowed to obtain the impersonation token, where a normal admin is not.
The only path-encoded tenant id that addresses the operation target (rather than the acting tenant) is the /api/v1/tenants/{id}/... path itself, which is for managing the tenant entity. Even there, the path's {id} is checked against the acting tenant from the JWT: only platform admins (acting on the application tenant) can mutate other tenants' entity rows.
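The two checks can be written down as a minimal sketch. The function names, the Forbidden exception, and the literal id "application" for the application tenant are hypothetical. That a tenant administrator may act on its own tenant entity row is an assumption read from the wording above, not something the text states outright.

```python
APPLICATION_TENANT = "application"  # hypothetical id of the application tenant


class Forbidden(Exception):
    """Raised when the acting tenant may not touch the target; maps to HTTP 403."""
    status = 403


def check_admin_scope(claims: dict, path_tenant_id: str) -> None:
    """Tenant-scoped admin endpoints: the JWT tenant must equal {tenantId}."""
    if claims["tenant_id"] != path_tenant_id:
        raise Forbidden(f"token is not bound to tenant {path_tenant_id!r}")


def check_tenant_entity_scope(claims: dict, target_id: str) -> None:
    """/api/v1/tenants/{id}/...: only the application tenant may reach other tenants."""
    acting = claims["tenant_id"]
    if acting != APPLICATION_TENANT and acting != target_id:
        raise Forbidden(f"tenant {acting!r} may not manage tenant {target_id!r}")
```

Under this model a platform admin's ordinary JWT fails check_admin_scope for acme's resources, which is why the impersonation token bound to acme is needed for those calls.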
Tenant Suspension
SUSPENDED is the explicit "this tenant exists but cannot serve traffic" state. When the resolver detects a suspended tenant, the response depends on the surface:
- Public protocol surfaces return 503 Service Unavailable with a tenant_suspended error.
- Admin REST surfaces bound to the suspended tenant return 403 Forbidden.
- The platform admin can still see and modify the suspended tenant from the application tenant.
Suspension is reversible: an UpdateTenantStatusHttpEndpointCommand call from SUSPENDED back to ACTIVE brings the tenant back online, and the cross-replica invalidation event propagates the change to every replica.
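The per-surface behaviour for a suspended tenant reduces to a small decision function. The enum and function names are hypothetical; the status codes and the tenant_suspended error come from the list above. Soft-deleted tenants are left out because the text does not say which response they produce.

```python
from enum import Enum
from typing import Optional, Tuple


class TenantStatus(Enum):
    ACTIVE = "ACTIVE"
    SUSPENDED = "SUSPENDED"


class Surface(Enum):
    PUBLIC_PROTOCOL = "public-protocol"
    TENANT_ADMIN = "tenant-admin"      # admin REST bound to the tenant itself
    PLATFORM_ADMIN = "platform-admin"  # platform admin on the application tenant


def suspension_gate(status: TenantStatus,
                    surface: Surface) -> Optional[Tuple[int, Optional[str]]]:
    """Return (http_status, error_code) to reject, or None to let the request run."""
    if status is not TenantStatus.SUSPENDED:
        return None
    if surface is Surface.PUBLIC_PROTOCOL:
        return (503, "tenant_suspended")
    if surface is Surface.TENANT_ADMIN:
        return (403, None)  # the text names no error code for this case
    return None  # platform admin can still see and modify the tenant
```

Flipping the status back to ACTIVE makes the gate return None on every surface, which matches the reversible nature of suspension.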
Soft delete (deletedAt non-null) is the harder state. The tenant disappears from listings and from resolution. Data remains in storage but is unreachable through the standard APIs. Hard delete is not a standard EDK operation; it requires direct database surgery, and the EDK does not expose a REST endpoint for it.
When Things Go Wrong
A few diagnostic patterns:
- Cross-tenant read attempt. The repository returns an empty result set; the service command treats it as "not found"; the REST returns 404. No information leaks about whether the resource exists under a different tenant.
- Cross-tenant write attempt. The unique constraint at the database level prevents the write; the service command surfaces a 409; the audit event records the attempt.
- Resolver returns the wrong tenant. The downstream authorisation check on the admin command catches the mismatch and refuses with a 403; the protocol command at worst returns the wrong tenant's metadata, which the wallet refuses on signature validation (the JWKS resolves through KMS to a different signing key).
- Per-tenant database unreachable. The routing layer surfaces a 503 for that tenant; other tenants are unaffected. The deployment's monitoring alerts on the database health, not on per-tenant request failures.
- Per-tenant KMS provider unreachable. Signing-dependent paths for that tenant fail immediately; the deployment's monitoring picks it up through the KMS metrics; other tenants on other providers are unaffected.