Version: v0.25.0 (Latest)

Tenant Resolution

Every request that lands on an EDK data plane is resolved to a tenant before the DI graph executes anything tenant-scoped. The resolver chain runs as a Ktor plugin (TenantResolutionPlugin) installed before the DI plugin. Its output is a TenantInput and the corresponding tenant_id, both attached to the call. After resolution, the session descends into the tenant scope, so every SessionScope binding sees the right tenant.

The chain is layered so each surface (subdomain, custom domain, path slug, JWT) can resolve independently and the answers compose predictably when more than one signal is present.

Figure: EDK tenant resolution stack

The Four Layers

Resolution runs in priority order. The first resolver that produces a tenant wins.
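The first-match rule can be sketched in a few lines. This is a minimal illustration, not the EDK implementation: RequestSignals, TenantResolver and TenantResolverChain are hypothetical names introduced here, and the real plugin works on a Ktor call rather than a plain data class.

```kotlin
// Hypothetical request view: only the fields the resolvers in this sketch need.
data class RequestSignals(
    val host: String,
    val path: String,
    val jwtTenantId: String? = null, // tenant_id claim from an already-validated JWT
)

// A resolver returns a tenant id, or null when its signal is absent or does not match.
fun interface TenantResolver {
    fun resolve(request: RequestSignals): String?
}

// Runs the resolvers in priority order and stops at the first non-null answer.
class TenantResolverChain(private val resolvers: List<TenantResolver>) {
    fun resolve(request: RequestSignals): String? =
        resolvers.firstNotNullOfOrNull { it.resolve(request) }
}
```

Registering the resolvers in the order JWT, custom domain, platform subdomain, path slug gives the priority described in the layers that follow; later resolvers are not invoked once an earlier one answers.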

Layer 1: JWT

For any authenticated call (admin REST, OID4VCI /credential, OID4VP /direct_post with a wallet token, AS-issued tokens carrying the tenant claim), the bearer JWT's tenant_id claim is authoritative. The JWT validator extracts it, the resolver returns the corresponding tenant id immediately, and no other resolver runs. The JWT is the only signal that proves the caller is acting on behalf of the tenant; the other three layers are routing inputs that say where the request landed, not who the caller is.

The X-Tenant-Id header is informational only. The resolver chain does not consult it for tenant determination; logs and traces may include it for diagnostics but it does not influence routing or authorization.
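As a rough sketch of that rule, assuming the JWT has already been validated and its claims are available as a map: the function name and parameter shapes below are illustrative, not EDK API.

```kotlin
// Returns the tenant from validated JWT claims only.
// The headers parameter exists to make the point explicit: X-Tenant-Id is never read here.
@Suppress("UNUSED_PARAMETER")
fun tenantFromCredentials(
    validatedClaims: Map<String, Any?>?, // null when the call carries no valid JWT
    headers: Map<String, String>,
): String? =
    (validatedClaims?.get("tenant_id") as? String)?.takeIf { it.isNotBlank() }
```

A caller that sends X-Tenant-Id: globex with a token whose tenant_id is acme is resolved to acme; with no token at all, this layer yields nothing and the routing layers take over.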

Layer 2: Custom Domain

CustomDomainTenantResolver looks up the incoming Host header in the tenant_domain table and returns the tenant id of the matching verified CUSTOM_DOMAIN row. Unverified rows are skipped by design: a customer can register the row in advance while the DNS verification challenge completes, without the resolver returning the tenant on a host the customer does not yet control.
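The verified-only rule might look like the following sketch. The row shape and field names are assumptions made for illustration (the text names only the tenant_domain table and the CUSTOM_DOMAIN row kind), and an in-memory list stands in for the table.

```kotlin
// Hypothetical row shape for the tenant_domain table; field names are assumptions.
data class TenantDomainRow(
    val host: String,
    val tenantId: String,
    val kind: String, // e.g. "CUSTOM_DOMAIN"
    val verified: Boolean,
)

class CustomDomainLookup(private val rows: List<TenantDomainRow>) {
    // Returns the tenant only for a verified CUSTOM_DOMAIN row matching the host.
    fun resolve(hostHeader: String): String? {
        // Drop any port and normalise case; DNS host names are case-insensitive.
        // (IPv6 literals are not handled in this sketch.)
        val host = hostHeader.substringBefore(':').lowercase()
        return rows.firstOrNull {
            it.kind == "CUSTOM_DOMAIN" && it.verified && it.host.lowercase() == host
        }?.tenantId
    }
}
```

An unverified row simply never matches, so the request falls through to the next layer instead of resolving to a tenant on a host the customer has not yet proven control of.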

Custom domain takes priority over platform subdomain so a customer who, for whatever reason, points a custom domain at a host that also matches a platform subdomain still routes through the verified custom domain.

Layer 3: Platform Subdomain

PlatformSubdomainTenantResolver parses the host as <labels>.<platform-base> where <platform-base> is the configured tenant.resolution.platform_base_host. The label immediately to the left of the platform base is treated as the tenant slug.

Two shapes resolve:

  • <tenant>.<platform-base>. The simplest case. acme.saas.com resolves to the acme tenant.
  • <service>.<tenant>.<platform-base>. Service-specific labels (issuer, verifier, auth, did) can sit to the left of the tenant slug without changing resolution. issuer.acme.saas.com resolves to acme.
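The two shapes can be parsed with plain string handling. The function below is a sketch under stated assumptions: its name is hypothetical, and rejecting hosts with more than two labels to the left of the platform base is one reading of "two shapes resolve", not confirmed behaviour.

```kotlin
// Extracts the tenant slug from a host under the platform base,
// or returns null when the host is not a subdomain of the base.
fun tenantSlugFromHost(hostHeader: String, platformBaseHost: String): String? {
    val host = hostHeader.substringBefore(':').lowercase().trimEnd('.')
    val suffix = "." + platformBaseHost.lowercase()
    if (!host.endsWith(suffix)) return null
    // Labels to the left of the platform base, e.g. ["issuer", "acme"].
    val labels = host.removeSuffix(suffix).split('.')
    // Assumption: only <tenant> and <service>.<tenant> are accepted.
    if (labels.size > 2) return null
    // The label immediately left of the base is the slug.
    return labels.last().takeIf { it.isNotEmpty() }
}
```

The returned slug still has to be looked up in the routing data; parsing only says which slug the host names, not whether that tenant exists or is active.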

The resolver consults the cache before hitting the routing repository. Negative results (unknown slug, suspended tenant) are also cached, so a flood of unknown-slug subdomain requests does not keep hitting Postgres.

Layer 4: Path Slug

For the protocol surfaces that include the tenant slug in the URL path (the /{tenant}/oid4vci/..., /{tenant}/oid4vp/..., /{tenant}/.well-known/openid-configuration forms), the path-slug resolver peels the leading segment and treats it as the tenant slug. The HTTP adapter declares the TenantPathPolicy it supports (None, LeadingSlug(maxDepth), or WellKnownSuffix); the dispatcher consults the policy when matching the route.

TenantPathPolicy.LeadingSlug(maxDepth = 1) is the standard form for OID4VP and OID4VCI protocol adapters. TenantPathPolicy.WellKnownSuffix is the form for the spec-canonical .well-known/openid-credential-issuer/{tenant} URLs. TenantPathPolicy.None is the form for admin adapters where the {tenantId} in the path is the operation target rather than the acting tenant.
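To make the three policies concrete, here is a minimal sketch of how a dispatcher could extract a slug under each one. PathPolicySketch and slugFromPath are hypothetical names; the real TenantPathPolicy may carry more detail, and the meaning of maxDepth is not spelled out in the text, so the sketch carries the parameter without interpreting it.

```kotlin
// Illustrative stand-in for the policy shapes named in the text.
sealed interface PathPolicySketch {
    object None : PathPolicySketch
    data class LeadingSlug(val maxDepth: Int = 1) : PathPolicySketch
    object WellKnownSuffix : PathPolicySketch
}

// Returns the tenant slug the policy finds in the path, or null.
fun slugFromPath(path: String, policy: PathPolicySketch): String? {
    val segments = path.split('/').filter { it.isNotEmpty() }
    return when (policy) {
        // Admin-style routes: any id in the path is an operation target, not the acting tenant.
        PathPolicySketch.None -> null
        // /<slug>/...: the leading segment is the slug, unless the path starts with .well-known.
        is PathPolicySketch.LeadingSlug ->
            segments.firstOrNull()?.takeIf { segments.size > 1 && it != ".well-known" }
        // /.well-known/<document>/<slug>: the slug trails the document name.
        PathPolicySketch.WellKnownSuffix ->
            if (segments.size == 3 && segments[0] == ".well-known") segments[2] else null
    }
}
```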

Well-Known URL Forms

The metadata endpoints have to support multiple URL forms because clients in the field use both the spec form and the legacy slug-before form:

  • OID4VCI openid-credential-issuer metadata. Spec form: /.well-known/openid-credential-issuer/{tenant-slug}. Legacy form: /{tenant-slug}/.well-known/openid-credential-issuer. The spec form is routed through TenantPathPolicy.WellKnownSuffix and the legacy form through TenantPathPolicy.LeadingSlug; both point at the same tenant-aware adapter.
  • OAuth AS metadata (oauth-authorization-server). Spec form: /.well-known/oauth-authorization-server/{tenant-slug}. A legacy form is also available. The EDK handles the RFC 8414 insertion of the well-known segment between the host and the issuer path.
  • OIDC Discovery (openid-configuration). Spec form: /{tenant-slug}/.well-known/openid-configuration (slug-before is the OIDC spec form; no slug-after form is expected by clients).
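The difference between the slug-after and slug-before forms comes down to where the well-known segment is placed relative to the issuer path. The two helpers below are a sketch with hypothetical names; the issuer URL in the usage note is invented for illustration.

```kotlin
import java.net.URI

// Slug-after (RFC 8414 style): insert the well-known segment between host and issuer path.
fun insertWellKnown(issuer: String, document: String): String {
    val uri = URI(issuer)
    val path = (uri.rawPath ?: "").trimEnd('/')
    return "${uri.scheme}://${uri.rawAuthority}/.well-known/$document$path"
}

// Slug-before (OIDC Discovery style): append the well-known segment after the issuer path.
fun appendWellKnown(issuer: String, document: String): String =
    issuer.trimEnd('/') + "/.well-known/$document"
```

For an issuer of https://saas.example/acme, insertWellKnown with oauth-authorization-server yields https://saas.example/.well-known/oauth-authorization-server/acme, while appendWellKnown with openid-configuration yields https://saas.example/acme/.well-known/openid-configuration.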

The advertised URLs themselves (the credential_issuer in OID4VCI metadata, the issuer in OAuth metadata, the request_uri_base in an OID4VP authorization request) are taken from the tenant_public_endpoint binding, not from the request host. The resolver determines which tenant the request belongs to; the public-endpoint binding determines what URLs that tenant advertises. See Domains and Public Endpoints.
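The separation between "which tenant" and "which URLs" can be illustrated as follows. This is a sketch only: PublicEndpoint, the exception type, and the map standing in for the tenant_public_endpoint binding are assumptions, and the refusal on a missing binding mirrors the fail-closed default this page describes.

```kotlin
// Hypothetical view of a tenant_public_endpoint binding.
data class PublicEndpoint(val tenantId: String, val baseUrl: String)

class MissingPublicEndpointException(tenantId: String) :
    IllegalStateException("No tenant_public_endpoint binding for tenant $tenantId")

// Builds the advertised credential_issuer from the binding, never from the request host.
@Suppress("UNUSED_PARAMETER")
fun advertisedCredentialIssuer(
    tenantId: String,
    requestHost: String, // deliberately unused: the host only helped resolve the tenant
    bindings: Map<String, PublicEndpoint>,
): String {
    // Fail closed: no binding means nothing is advertised.
    val endpoint = bindings[tenantId] ?: throw MissingPublicEndpointException(tenantId)
    return endpoint.baseUrl.trimEnd('/')
}
```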

Resolution Inside the Tenant Hierarchy

For a child tenant, the resolver returns the child's id directly when the request matches the child's slug or domain. There is no automatic walk-up to the parent: a request to acme-nl.saas.com resolves to acme-nl, never to acme, regardless of the parent / child relationship.

When the resolver chain returns no tenant (the host matches no registered domain or subdomain, the path carries no slug, and the JWT carries no tenant claim), the EDK behaviour depends on the adapter:

  • Public protocol adapters configured with TenantPathPolicy.None and bound to system-wide endpoints (the .well-known/oauth-authorization-server root form for the application tenant, for example) treat the absence of a tenant as the application tenant.
  • Other public protocol adapters refuse the call with a 400.
  • Admin REST adapters refuse the call with a 401 (the JWT is missing or invalid).

Outside the system-wide endpoints named above, there is no implicit fallback to a default tenant. A request that fails to resolve is rejected explicitly.
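The three unresolved-tenant outcomes can be summarised as a small decision function. The enum and type names are hypothetical; only the outcomes themselves (application tenant, 400, 401) come from the text.

```kotlin
// Illustrative adapter categories matching the three cases described above.
enum class AdapterKind { SYSTEM_WIDE_PUBLIC, PUBLIC_PROTOCOL, ADMIN_REST }

sealed interface UnresolvedOutcome {
    object UseApplicationTenant : UnresolvedOutcome
    data class Reject(val status: Int) : UnresolvedOutcome
}

// What happens when the resolver chain returns no tenant.
fun onUnresolved(kind: AdapterKind): UnresolvedOutcome = when (kind) {
    AdapterKind.SYSTEM_WIDE_PUBLIC -> UnresolvedOutcome.UseApplicationTenant
    AdapterKind.PUBLIC_PROTOCOL -> UnresolvedOutcome.Reject(400)
    AdapterKind.ADMIN_REST -> UnresolvedOutcome.Reject(401)
}
```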

The Cache

InMemoryTenantResolverCache sits in front of the routing repository. Two purposes:

  • Performance. A subdomain or path-slug lookup is a single Postgres query against tenant_routing. Caching the result removes the query from the hot path entirely.
  • Negative caching. Unknown slugs and suspended tenants are cached as well, with the same TTL. Without negative caching, an attacker enumerating slugs by probing subdomains would generate one Postgres query per probe.

The cache TTL is configurable through tenant.resolution.cache_ttl_seconds. The default is conservative because the primary invalidation channel is the cross-replica event bus rather than the TTL.
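A TTL cache with negative caching can be sketched as below. This is not InMemoryTenantResolverCache itself: the class, the Lookup type, and the injected clock and loader are assumptions made so the behaviour is easy to see and test. The sketch also does not bound the map size, which a production cache facing slug enumeration would need to do.

```kotlin
import java.util.concurrent.ConcurrentHashMap

// Result of a routing lookup; NotFound and Suspended are the negative results.
sealed interface Lookup {
    data class Found(val tenantId: String) : Lookup
    object NotFound : Lookup
    object Suspended : Lookup
}

class TtlResolverCache(
    private val ttlMillis: Long,
    private val now: () -> Long = System::currentTimeMillis,
    private val load: (String) -> Lookup, // stands in for the tenant_routing query
) {
    private data class Entry(val value: Lookup, val expiresAt: Long)

    private val entries = ConcurrentHashMap<String, Entry>()

    fun get(slug: String): Lookup {
        val cached = entries[slug]
        if (cached != null && cached.expiresAt > now()) return cached.value
        // Miss or expired: query once and cache the answer, positive or negative,
        // with the same TTL.
        val fresh = load(slug)
        entries[slug] = Entry(fresh, now() + ttlMillis)
        return fresh
    }

    // Hook for event-driven invalidation; the TTL remains the fallback.
    fun invalidate(slug: String) {
        entries.remove(slug)
    }
}
```

With this shape, repeated probes for the same unknown slug cost one query per TTL window rather than one query per request.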

Cross-Replica Invalidation

With multiple replicas of a data-plane container behind a load balancer, every replica holds its own InMemoryTenantResolverCache. A tenant routing mutation on replica A (a new tenant, a status change, a slug rename, a domain verification) must reach replica B's cache without a restart.

The mechanism is the shared event subsystem:

  1. The admin command (UpdateTenantStatusHttpEndpointCommand, MarkDomainVerifiedHttpEndpointCommand, the slug rename command, and so on) emits a TenantInvalidationBroker event after the database write commits.
  2. A Postgres LISTEN/NOTIFY bridge fans the event out to every replica subscribed to the channel.
  3. Each replica's cache invalidates the affected tenant entry (or the affected slug, depending on the event type).
  4. The next resolution for that tenant or slug hits Postgres and repopulates the cache.
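Steps 3 and 4 can be sketched on the receiving side as follows. The event shapes and the cache class are hypothetical (the text says only that an event targets a tenant or a slug), and the LISTEN/NOTIFY transport that delivers the event is left out.

```kotlin
import java.util.concurrent.ConcurrentHashMap

// Hypothetical event shapes: an invalidation names either a tenant or a slug.
sealed interface RoutingInvalidation {
    data class ByTenant(val tenantId: String) : RoutingInvalidation
    data class BySlug(val slug: String) : RoutingInvalidation
}

// Per-replica cache keyed by slug, holding the tenant id the slug resolved to.
class ReplicaRoutingCache {
    private val bySlug = ConcurrentHashMap<String, String>()

    fun put(slug: String, tenantId: String) {
        bySlug[slug] = tenantId
    }

    fun get(slug: String): String? = bySlug[slug]

    // Step 3: drop the affected entries. Step 4 happens on the next lookup miss.
    fun onEvent(event: RoutingInvalidation) {
        when (event) {
            is RoutingInvalidation.BySlug -> bySlug.remove(event.slug)
            is RoutingInvalidation.ByTenant -> bySlug.values.removeIf { it == event.tenantId }
        }
    }
}
```

Each replica would call onEvent from its own subscription to the shared channel, so a mutation committed on one replica clears the matching entries everywhere.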

The TTL fallback is the safety net. If a notification is missed (a network blip, or a Postgres listener that takes a moment to reconnect after a disconnection), the TTL eventually expires the stale cache entry. A TTL that is too short makes the cache useless; one that is too long keeps a missed notification visible for longer. The default balances the two.

Putting Settings In One Place

TenantResolutionSettings carries the four knobs the resolver chain reads:

  • platform_base_host. The host suffix that subdomain resolution treats as the platform base. Subdomains of this host resolve to tenant slugs. Required when platform subdomain resolution is enabled.
  • platform_subdomain_enabled. Toggles platform-subdomain resolution on or off. Typically on; toggled off in deployments that use only custom domains.
  • trusted_proxy_hop_count. How many X-Forwarded-Host hops the resolver trusts. Important behind reverse proxies and CDNs.
  • cache_ttl_seconds. The fallback TTL on the cache.

The TenantResolutionSettingsBinder reads these from the standard property resolver chain at app scope rather than per tenant, because tenant resolution runs before any tenant is in scope. The settings are bound at server startup and feed into the TenantResolutionPlugin.
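As an illustration of binding and validating the four knobs at startup, consider the sketch below. It is not the real TenantResolutionSettings or its binder: the Kotlin property names are assumptions, and the tenant.resolution. prefix is confirmed by the text only for platform_base_host and cache_ttl_seconds, so its use for the other two keys is assumed.

```kotlin
// Mirrors the four documented knobs; Kotlin property names are assumptions.
data class ResolutionSettingsSketch(
    val platformBaseHost: String?,
    val platformSubdomainEnabled: Boolean,
    val trustedProxyHopCount: Int,
    val cacheTtlSeconds: Long,
) {
    init {
        // platform_base_host is required when subdomain resolution is enabled.
        require(!platformSubdomainEnabled || !platformBaseHost.isNullOrBlank()) {
            "platform_base_host is required when platform subdomain resolution is enabled"
        }
        require(trustedProxyHopCount >= 0) { "trusted_proxy_hop_count must not be negative" }
        require(cacheTtlSeconds > 0) { "cache_ttl_seconds must be positive" }
    }
}

// Reads the settings from a flat property map; missing required keys fail at startup.
fun settingsFrom(props: Map<String, String>): ResolutionSettingsSketch =
    ResolutionSettingsSketch(
        platformBaseHost = props["tenant.resolution.platform_base_host"],
        platformSubdomainEnabled =
            props.getValue("tenant.resolution.platform_subdomain_enabled").toBooleanStrict(),
        trustedProxyHopCount =
            props.getValue("tenant.resolution.trusted_proxy_hop_count").toInt(),
        cacheTtlSeconds = props.getValue("tenant.resolution.cache_ttl_seconds").toLong(),
    )
```

Validating in the constructor means a misconfigured deployment fails when the server starts rather than on the first request that needs subdomain resolution.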

Common Mistakes to Avoid

A few things people get wrong on first encounter:

  • Trusting X-Tenant-Id. The header is informational only. Treating it as authoritative lets any caller act as any tenant they name. The JWT claim is the authority.
  • Falling back to the request host for advertised URLs. When the resolver returns a tenant, the data plane consults tenant_public_endpoint for the URLs it advertises, not the host the request happened to come in on. The fail-closed default refuses to advertise anything when the binding is missing rather than falling back. Falling back to the request host produces metadata the wallet cannot actually reach when the deployment sits behind a CDN or a custom domain.
  • Mixing up resolver order. Custom domain runs before platform subdomain, which runs before path slug. A request that matches multiple layers resolves through the highest-priority layer. When a JWT carrying the tenant claim is present, the JWT layer runs first regardless.
  • Trying to resolve with Host set to the cluster's internal hostname. Internal calls from another data-plane container (the issuer calling the AS, for example) should carry the same host the wallet would see, not the cluster's own host. The peer auth layer enforces this.