Version: v0.25.0 (Latest)

Identity Matching & Reconciliation

Identity matching links external identifiers (holder keys, subject IDs, email addresses, claim tuples) to internal identity IDs using privacy-preserving HMAC hashing. Identity reconciliation is the policy engine that decides what to do when a user shows up: accept an existing link, run identity verification, require step-up, or reject.

Identity Matching

How Linking Works

External identifiers are never stored in plaintext. Instead, three domain-separated keys are used:

| Key | Alias | Algorithm | Purpose |
| --- | --- | --- | --- |
| Key A | reconciliation:holder | HMAC-SHA256 | Hash holder key identifiers (wallet public keys) |
| Key B | reconciliation:institution | HMAC-SHA256 | Hash institution/external identifiers (subject IDs, emails) |
| Key C | reconciliation:encryption | AES-256-GCM | Encrypt reversible payloads (canonical claims, institution IDs, auxiliary data) |

All keys are managed by the IDK KMS. In production, they should be backed by an HSM or cloud KMS provider. Key aliases are configured via:

identity:
  reconciliation:
    crypto:
      hmac-key-alias: reconciliation-hmac-key
      hmac-key-provider-id: software
      encryption-key-alias: reconciliation-encryption-key
      encryption-key-provider-id: software

The flow for every identifier: the plaintext value is HMAC-hashed with the appropriate domain-separated key, producing a multibase-encoded multihash that is stored in the IdentityMatch record. Lookups repeat the same step rather than reversing it: the system hashes the incoming identifier with the same key and compares the result against stored hashes. The plaintext identifier is never persisted.
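The hash-then-compare flow described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the IDK implementation: InMemoryMatchIndex and its methods are hypothetical names, the key is passed in as raw bytes instead of being resolved through a KMS alias, and the hash is hex-encoded rather than emitted as a multibase-encoded multihash.

```kotlin
import javax.crypto.Mac
import javax.crypto.spec.SecretKeySpec

// Hypothetical stand-in for a match store: only hashes are kept, never plaintext.
class InMemoryMatchIndex(private val hmacKey: ByteArray) {
    private val identityByHash = mutableMapOf<String, String>()

    // HMAC-SHA256 over the UTF-8 identifier, hex-encoded.
    // Assumption: the real service produces a multibase-encoded multihash instead.
    private fun hash(identifier: String): String {
        val mac = Mac.getInstance("HmacSHA256")
        mac.init(SecretKeySpec(hmacKey, "HmacSHA256"))
        return mac.doFinal(identifier.toByteArray(Charsets.UTF_8))
            .joinToString("") { "%02x".format(it) }
    }

    // Store the link under the hash of the identifier.
    fun link(identifier: String, internalIdentityId: String) {
        identityByHash[hash(identifier)] = internalIdentityId
    }

    // Hash the incoming identifier with the same key and compare against stored hashes.
    fun lookup(identifier: String): String? = identityByHash[hash(identifier)]

    fun storedHashes(): Set<String> = identityByHash.keys.toSet()
}
```

Because the same key and input always produce the same HMAC, a returning identifier resolves with a single map lookup, while the stored hash reveals nothing useful without the key.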

IdentityMatch

An IdentityMatch is the core link between the external world and your internal identity model. When a user authenticates through their wallet, the wallet's public key fingerprint is HMAC-hashed and stored as an IdentityMatch. The next time that wallet shows up, the system hashes the key again, finds the match, and knows immediately which internal identity this is, without calling any external service.

Matches are intentionally minimal: they store the hash, the type, and a reference to the internal identity. They don't store the user's name, email, or any other attributes. That's the job of the IdentityLinkBinding (below), which holds encrypted canonical attributes alongside the match.

data class IdentityMatch(
    val id: String,
    val identifierHash: String,       // HMAC hash of external identifier
    val identifierType: IdentifierType,
    val internalIdentityId: String,   // Reference to internal identity
    val tenantId: String,
    val metadata: Map<String, String>,
    val hashKeyVersion: String?,      // For key rotation
    val createdAt: Instant,
    val updatedAt: Instant?,
    val lastUsedAt: Instant?,
)

Identifier Types

A single identity can have multiple matches: one for the wallet's public key, another for their institutional subject ID, another for their email hash. The IdentifierType distinguishes them:

| Type | What gets hashed | When it's used |
| --- | --- | --- |
| KEY | Wallet public key fingerprint (JWK thumbprint) | Fast-path recognition of returning wallets |
| DID | Decentralized Identifier | Persistent identifier for rotatable DID methods |
| EMAIL | Email address | Matching by email across providers |
| SUBJECT_ID | External subject ID (OIDC sub claim) | Linking to institutional identity providers |
| CLAIM_TUPLE | Composite hash of multiple claims (name + DOB + email) | Matching without a cryptographic key, when the wallet doesn't provide holder binding |

The type is a value class, so you can define custom types beyond the built-in set.
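As an illustration of why a value class allows custom types, here is a minimal sketch. IdentifierTypeSketch is a stand-in, not the library's class: the property name `value`, the companion constants, and the STUDENT_NUMBER type are all assumptions made for this example.

```kotlin
// Stand-in for the library's IdentifierType. The property name `value`
// and the companion constants are assumptions for this sketch.
@JvmInline
value class IdentifierTypeSketch(val value: String) {
    companion object {
        val KEY = IdentifierTypeSketch("KEY")
        val EMAIL = IdentifierTypeSketch("EMAIL")
        val SUBJECT_ID = IdentifierTypeSketch("SUBJECT_ID")
    }
}

// A deployment-specific type (hypothetical), defined without modifying the built-in set.
val STUDENT_NUMBER = IdentifierTypeSketch("STUDENT_NUMBER")
```

Unlike an enum, the set of values is open: any string can be wrapped, and equality is by the wrapped value.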

IdentityLinkBinding

IdentityMatch is the index: the fast lookup from external hash to internal identity. The IdentityLinkBinding is the full dossier. It stores the encrypted canonical attributes (name, email, institutional ID) collected during verification, the assurance metadata from the verification that created it, and versioning information for staleness detection.

The binding is what makes the "fast path" work: when a known wallet returns, the system doesn't just know who they are; it can also decrypt their cached canonical attributes from the binding and project them into OIDC claims immediately, without contacting any external provider. This is why the binding stores encrypted attributes rather than relying on re-fetching them.

data class IdentityLinkBinding(
    val id: String,
    val tenantId: String,
    val matchId: String,                          // FK to IdentityMatch

    // HMAC hashes for lookup (dual-read for key rotation)
    val holderIdentifierHash: String,
    val holderHashKeyVersion: String,
    val institutionIdentifierHash: String?,
    val institutionHashKeyVersion: String?,

    // Encrypted reversible data
    val encryptedInstitutionId: EncryptedPayload?,                 // AES-256-GCM
    val persistedAttributesEnvelope: PersistedAttributesEnvelope,  // Encrypted canonical claims

    // Provenance
    val providerId: String,                       // Which reconciliation provider created this
    val institutionId: String?,

    // Assurance metadata
    val assuranceSummary: AssuranceSummary?,      // LoA + ACR/AMR from verification

    // Version tracking for staleness detection
    val canonicalSchemaVersion: String?,
    val materialProfileVersion: String?,
    val selectorRuleVersion: String?,
    val materialFingerprints: Set<String>?,

    val createdAt: Instant,
    val updatedAt: Instant?,
    val lastUsedAt: Instant?,
)

AssuranceSummary

Stored on each binding to track the level of assurance from the verification that established it:

data class AssuranceSummary(
    val walletAssuranceLevel: String?,   // e.g., "high", "substantial", "low"
    val oidcAcr: String?,                // Authentication Context Class Reference
    val oidcAmr: List<String>?,          // Authentication Methods References
    val executionId: String?,            // IDV execution that produced this
    val evidenceReferenceHash: String?,  // Hash referencing verification evidence
)

This is what powers step-up decisions: if a binding's walletAssuranceLevel is "low" but the current operation requires "substantial", the reconciliation engine triggers a step-up.
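The comparison behind such a step-up decision might look like the sketch below. The ordering low < substantial < high and the treatment of a missing or unrecognized stored level as insufficient are assumptions; needsStepUp is a hypothetical helper, not part of the documented API.

```kotlin
// Assumed ordering of assurance levels, lowest first.
val assuranceOrder = listOf("low", "substantial", "high")

// Returns true when the stored level is missing, unknown, or below the required level.
fun needsStepUp(storedLevel: String?, requiredLevel: String): Boolean {
    val required = assuranceOrder.indexOf(requiredLevel)
    require(required >= 0) { "Unknown required assurance level: $requiredLevel" }
    // A null or unrecognized stored level maps to -1 and therefore always needs step-up.
    val stored = storedLevel?.let { assuranceOrder.indexOf(it) } ?: -1
    return stored < required
}
```

Treating unknown levels as insufficient keeps the check conservative: a binding with no recorded assurance never passes a threshold by accident.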

Key Rotation

HMAC keys and encryption keys need to be rotated periodically, whether for compliance requirements, key compromise response, or just good hygiene. The challenge is that rotating an HMAC key changes every hash, which would break all existing lookups.

The ReconciliationCryptoService solves this with dual-read support. During a rotation window, the service computes hashes with both the current and previous key. Lookups try the current key first; if no match is found, they retry with the previous key. When a match is found via the old key, it gets re-hashed with the new key on next write. This means no bulk data migration is needed: bindings migrate lazily as they're accessed.

interface ReconciliationCryptoService {
    suspend fun hashHolderKey(holderKey: String): HashedIdentifier
    suspend fun hashExternalIdentifier(identifier: String): HashedIdentifier
    suspend fun encrypt(plaintext: String): EncryptedPayload
    suspend fun decrypt(payload: EncryptedPayload): String

    // Dual-read: hash with previous key version for rotation
    suspend fun hashHolderKeyWithPrevious(holderKey: String): HashedIdentifier?
    suspend fun hashExternalIdentifierWithPrevious(identifier: String): HashedIdentifier?
}

During rotation, lookups try the current key first, then fall back to the previous key. No data migration is required; bindings are re-hashed lazily on next access.
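The dual-read lookup can be sketched as follows. This is a simplified, non-suspending illustration: VersionedKey, StoredHash, DualReadLookup, and rotationHmacHex are hypothetical names, keys are raw bytes, and the record is migrated directly inside the lookup, whereas the service described above re-hashes on the next write.

```kotlin
import javax.crypto.Mac
import javax.crypto.spec.SecretKeySpec

// Hypothetical types for this sketch.
class VersionedKey(val version: String, val material: ByteArray)
class StoredHash(val identityId: String, var hash: String, var keyVersion: String)

// HMAC-SHA256, hex-encoded (the real service uses its own encoding).
fun rotationHmacHex(key: ByteArray, value: String): String {
    val mac = Mac.getInstance("HmacSHA256")
    mac.init(SecretKeySpec(key, "HmacSHA256"))
    return mac.doFinal(value.toByteArray(Charsets.UTF_8)).joinToString("") { "%02x".format(it) }
}

class DualReadLookup(private val current: VersionedKey, private val previous: VersionedKey?) {
    val records = mutableListOf<StoredHash>()

    fun find(identifier: String): StoredHash? {
        // 1. Try the current key first.
        val currentHash = rotationHmacHex(current.material, identifier)
        records.firstOrNull { it.hash == currentHash }?.let { return it }

        // 2. Fall back to the previous key, if a rotation window is open.
        val prev = previous ?: return null
        val previousHash = rotationHmacHex(prev.material, identifier)
        val hit = records.firstOrNull { it.hash == previousHash } ?: return null

        // 3. Lazy migration: re-hash the record under the current key.
        hit.hash = currentHash
        hit.keyVersion = current.version
        return hit
    }
}
```

Once every active record has been touched, the previous key can be retired; records that were never accessed during the window would no longer resolve, so the window length is a deployment decision.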

Commands

| Command | ID | Description |
| --- | --- | --- |
| LookupIdentityMatchCommand | identity.matching.lookup | Look up a match by identifier hash and type |
| CreateIdentityMatchCommand | identity.matching.create | Create a new match record |
| DeleteIdentityMatchCommand | identity.matching.delete | Delete a match |
| ListIdentityMatchesCommand | identity.matching.list | List all matches for an internal identity |

Identity Reconciliation

When a user presents credentials, the system needs to answer: "What do we do with this person?" If they have an existing binding with sufficient assurance, accept them. If they're new, verify their identity. If their binding has expired, ask them to re-verify. If policy says this credential type isn't acceptable, reject them.

These decisions are the domain of the reconciliation engine. Rather than encoding this logic in application code, the engine uses declarative selector rules that match on properties of the incoming request and produce a typed plan. This means the decision logic is configuration: tenants can have different rules, and rules can be updated without code changes.

Reconciliation Plans

The engine produces one of five plans, each representing a distinct course of action. The plans are a sealed interface, so the compiler enforces that all cases are handled.

sealed interface ReconciliationPlan {

    data class SkipReconciliation(...)
    // Skip reconciliation entirely. Use when wallet credentials
    // are sufficient without identity linking.

    data class UseExistingBinding(...)
    // Accept the existing identity link binding as-is.
    // Used when the holder is known and assurance is sufficient.

    data class RunIdv(
        val providerId: String,
        val materialProfileId: String,
        val minimumAssurance: String?,    // LoA threshold, e.g., "substantial"
        val bindingPolicy: BindingPolicy, // REUSE_OR_CREATE, CREATE_NEW, REUSE_ONLY
        ...
    )
    // Run a full identity verification workflow via an OIDC provider.
    // Creates or updates an identity link binding on success.

    data class StepUp(
        val providerId: String,
        val materialProfileId: String,
        ...
    )
    // Upgrade an existing binding to a higher assurance level
    // without full re-verification.

    data class FailClosed(
        val reason: String,
        ...
    )
    // Explicitly reject the reconciliation attempt.
}
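To show what the sealed interface buys the caller, here is a minimal sketch of exhaustive handling. PlanSketch and its members are simplified stand-ins with assumed fields (the real plan classes carry more, as the elisions above indicate), and describe is a hypothetical function.

```kotlin
// Simplified stand-ins for the five plan types; field sets are assumptions.
sealed interface PlanSketch {
    object Skip : PlanSketch
    object UseExisting : PlanSketch
    data class RunIdv(val providerId: String) : PlanSketch
    data class StepUp(val providerId: String) : PlanSketch
    data class FailClosed(val reason: String) : PlanSketch
}

// No `else` branch: if a sixth plan type were added, this `when` would stop compiling
// until the new case is handled.
fun describe(plan: PlanSketch): String = when (plan) {
    PlanSketch.Skip -> "continue without identity linking"
    PlanSketch.UseExisting -> "accept the stored binding"
    is PlanSketch.RunIdv -> "start IDV via ${plan.providerId}"
    is PlanSketch.StepUp -> "step up via ${plan.providerId}"
    is PlanSketch.FailClosed -> "reject: ${plan.reason}"
}
```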

Selector Rules

Selector rules are the reconciliation engine's decision table. Each rule specifies a set of conditions (which tenant, what kind of credential, whether the holder is known) and a plan to execute when those conditions match. Rules are evaluated in priority order, and the first match wins.

The power of this approach is composability. You can have a high-priority rule that fast-tracks known holders, a medium-priority rule that sends unknown holders through institutional OIDC verification, a rule for expired bindings that triggers email verification as a step-up, and a low-priority catch-all that rejects everything else. Each rule is independent and testable.

Rules are evaluated against the incoming context. The first matching rule (by priority) determines the plan:

data class ReconciliationSelectorRule(
    val id: String,
    val enabled: Boolean = true,
    val priority: Int = 0,                          // Higher = tried first

    // Match conditions (null = match all)
    val tenants: Set<String>?,
    val entryPointTypes: Set<String>?,              // WALLET_OID4VP, FEDERATED_OIDC, etc.
    val triggerTypes: Set<String>?,                 // ONBOARDING, STEP_UP, REVALIDATION
    val credentialTypes: Set<String>?,              // eu.europa.ec.eudi.pid.1, etc.
    val issuers: Set<String>?,                      // Regex patterns against issuer
    val knownHolderStates: Set<KnownHolderState>?,  // MATCHED_HOLDER_KEY, NOT_FOUND, etc.
    val attributePredicates: List<AttributePredicate>?,

    // Output
    val plan: ReconciliationPlanTemplate,
)

KnownHolderState

Before rule evaluation, the system checks whether the holder already has a binding:

| State | Meaning |
| --- | --- |
| MATCHED_HOLDER_KEY | Existing binding found via cryptographic holder key hash |
| MATCHED_CLAIM_TUPLE | Existing binding found via claim tuple hash |
| NOT_FOUND | No existing binding |
| EXPIRED_BINDING | Binding exists but has expired |

This is the key input for step-up decisions: a rule can match on MATCHED_HOLDER_KEY with a condition that the stored assurance is below a threshold, producing a StepUp plan.

Rule Evaluation

object ReconciliationSelector {
    fun evaluate(
        rules: List<ReconciliationSelectorRule>,
        input: ReconciliationSelectorInput,
        ruleVersion: String
    ): ReconciliationPlan?
}

The selector:

  1. Filters enabled rules
  2. Matches each rule against the input (tenant, credential types, issuers, holder state, attribute predicates)
  3. Sorts by priority (descending), then by ID (ascending)
  4. Returns the first match's plan, or null if no rules match
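The four steps above can be sketched with a reduced rule model. SketchRule, SketchInput, and selectDecision are hypothetical names; only the tenant and holder-state conditions are modeled, and the plan is reduced to a decision string.

```kotlin
// Reduced stand-ins: a null condition set means "match all", as in the real rule.
data class SketchRule(
    val id: String,
    val enabled: Boolean = true,
    val priority: Int = 0,
    val tenants: Set<String>? = null,
    val knownHolderStates: Set<String>? = null,
    val decision: String,
)

data class SketchInput(val tenant: String, val holderState: String)

fun selectDecision(rules: List<SketchRule>, input: SketchInput): String? =
    rules.asSequence()
        .filter { it.enabled }                                               // 1. enabled rules only
        .filter { it.tenants?.contains(input.tenant) ?: true }               // 2. match conditions
        .filter { it.knownHolderStates?.contains(input.holderState) ?: true }
        .sortedWith(compareByDescending<SketchRule> { it.priority }.thenBy { it.id }) // 3. sort
        .firstOrNull()                                                       // 4. first match or null
        ?.decision
```

Sorting by ID as a tie-breaker makes the outcome deterministic when two matching rules share a priority.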

Example Rules

The following example shows a typical rule set for a multi-method deployment. The first rule fast-tracks returning users. The second sends new holders through a full IDV use case that might involve OIDC, document scanning, biometric checks, or email verification. The rule doesn't prescribe which methods; it references an IDV use case that defines the verification graph. The third handles expired bindings with a lighter step-up. The fourth is a catch-all.

[
  {
    "id": "known-holder-accept",
    "priority": 100,
    "knownHolderStates": ["MATCHED_HOLDER_KEY"],
    "plan": { "decision": "USE_EXISTING_BINDING" }
  },
  {
    "id": "new-holder-idv",
    "priority": 50,
    "knownHolderStates": ["NOT_FOUND"],
    "entryPointTypes": ["WALLET_OID4VP"],
    "plan": {
      "decision": "RUN_IDV",
      "providerId": "onboarding-idv",
      "materialProfileId": "standard-onboarding",
      "minimumAssurance": "substantial",
      "bindingPolicy": "REUSE_OR_CREATE"
    }
  },
  {
    "id": "expired-step-up",
    "priority": 75,
    "knownHolderStates": ["EXPIRED_BINDING"],
    "plan": {
      "decision": "STEP_UP",
      "providerId": "email-reverification",
      "materialProfileId": "standard-onboarding"
    }
  },
  {
    "id": "fallback-deny",
    "priority": 0,
    "plan": {
      "decision": "FAIL_CLOSED",
      "failReason": "No matching reconciliation rule"
    }
  }
]

The providerId in the RunIdv plan doesn't have to be an OIDC provider; it references whatever IDV use case the deployment has configured. That use case might be a single OIDC login, an email OTP followed by a document scan, a biometric liveness check, or any graph of verification methods. The reconciliation engine is IDV-method-agnostic; it only decides whether to run IDV and with what assurance threshold.

Reconciliation Providers

When reconciliation triggers IDV via the RunIdv or StepUp plan, the actual verification can take many forms. In the simplest case, it's a redirect to an OIDC provider (institutional login). In more complex deployments, it's a multi-step IDV workflow involving document scanning, biometric liveness, email verification, or combinations of these. The ReconciliationProvider configuration describes the OIDC-based path; for richer verification flows, the provider references an IDV use case that defines the full verification graph.

The provider also defines attribute mappings, which determine how claims from the verification source are normalized into canonical form for storage in the encrypted binding:

data class ReconciliationProvider(
    val id: String,
    val name: String?,
    val oidcClientId: String,
    val identifierAttributeName: String = "sub",
    val enabled: Boolean = true,
    val attributeMappings: List<ReconciliationAttributeMapping>,
    val userInfoAttributeMappings: List<ReconciliationAttributeMapping>,
    val assuranceAcr: String?,        // ACR value for this provider
    val assuranceAmr: List<String>?,  // AMR values for this provider
)

Attribute mappings control how provider claims are normalized into canonical form:

data class ReconciliationAttributeMapping(
    val source: String,           // Claim from provider, e.g., "eduid"
    val target: String,           // Canonical name, e.g., "institutional_id"
    val identifierType: String?,  // e.g., "SUBJECT_ID", "EMAIL"
    val required: Boolean = false,
)
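How such mappings might be applied to a provider's claims is sketched below. MappingSketch and normalizeClaims are hypothetical, claims are simplified to string values, and failing on a missing required claim is an assumption about the intended behavior of the required flag.

```kotlin
// Reduced stand-in for ReconciliationAttributeMapping (identifierType omitted).
data class MappingSketch(val source: String, val target: String, val required: Boolean = false)

// Renames provider claims to canonical names; unmapped claims are dropped.
fun normalizeClaims(claims: Map<String, String>, mappings: List<MappingSketch>): Map<String, String> {
    val canonical = linkedMapOf<String, String>()
    for (mapping in mappings) {
        val value = claims[mapping.source]
        if (value == null) {
            // Assumption: a missing required claim aborts normalization.
            require(!mapping.required) { "Missing required claim: ${mapping.source}" }
            continue
        }
        canonical[mapping.target] = value
    }
    return canonical
}
```

For example, mapping "eduid" to "institutional_id" turns a provider response containing eduid=u123 into the canonical attribute institutional_id=u123, and any claim without a mapping never reaches the binding.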

Material Profiles

A material profile defines the recipe for constructing an identity link binding: which identifiers to HMAC-hash for future lookup and which attributes to encrypt for storage. This is separate from the provider configuration because the same verification result can be materialized differently depending on the deployment's requirements.

For example, one deployment might hash both the wallet key and the institutional subject ID (two lookup paths), while another might also add a claim-tuple hash based on name + date of birth (a third lookup path for matching users who switch wallets).

sealed interface ReconciliationMaterial {
    data class HolderKeyMaterial(val hmacDomain: String = "holder") : ReconciliationMaterial
    data class ProviderSubjectMaterial(
        val providerId: String,
        val hmacDomain: String = "institution",
    ) : ReconciliationMaterial
    data class AttributeTupleMaterial(
        val attributePaths: List<String>,
        val normalizationProfile: String,
        val saltRef: String,
        val hmacDomain: String,
        val minRequiredAttributes: Int,
    ) : ReconciliationMaterial
    data class CredentialAttributeTupleMaterial(
        val credentialQueryId: String?,
        val attributePaths: List<String>,
        val normalizationProfile: String,
        val saltRef: String,
        val hmacDomain: String,
        val minRequiredAttributes: Int,
    ) : ReconciliationMaterial
}

HolderKeyMaterial hashes the wallet's public key. ProviderSubjectMaterial hashes the OIDC provider's subject ID. AttributeTupleMaterial creates a composite hash from multiple attributes (e.g., email + date of birth + name) for matching without a cryptographic key.
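A composite claim-tuple hash could be computed along these lines. Everything here is an assumption for illustration: claimTupleHash is a hypothetical function, normalization is reduced to trim plus lowercase (the real normalizationProfile is not specified here), the salt reference and HMAC domain are left out, and the result is hex-encoded.

```kotlin
import javax.crypto.Mac
import javax.crypto.spec.SecretKeySpec

// Returns null when fewer than minRequiredAttributes values are present.
fun claimTupleHash(
    key: ByteArray,
    attributes: Map<String, String?>,
    attributePaths: List<String>,
    minRequiredAttributes: Int,
): String? {
    // Assumed normalization: trim and lowercase; missing values become empty strings
    // so that each attribute keeps a fixed position in the tuple.
    val normalized = attributePaths.map { path -> attributes[path]?.trim()?.lowercase().orEmpty() }
    if (normalized.count { it.isNotEmpty() } < minRequiredAttributes) return null

    val mac = Mac.getInstance("HmacSHA256")
    mac.init(SecretKeySpec(key, "HmacSHA256"))
    // A unit-separator character keeps ("ab", "c") distinct from ("a", "bc").
    val joined = normalized.joinToString("\u001F")
    return mac.doFinal(joined.toByteArray(Charsets.UTF_8)).joinToString("") { "%02x".format(it) }
}
```

Normalizing before hashing is what lets the same person match across wallets even when a provider returns the email with different casing or stray whitespace.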

Reconciliation Session

The reconciliation flow is managed through a session with its own lifecycle:

data class ReconciliationSession(
    val id: String,
    val tenantId: String,
    val status: ReconciliationSessionStatus,   // CREATED → REDIRECTED → CALLBACK_RECEIVED → COMPLETED
    val identifierHash: String,
    val identifierType: IdentifierType,
    val providerId: String,
    val authorizationUrl: String?,             // OIDC authorization URL
    val state: String?,                        // OIDC state
    val nonce: String?,                        // OIDC nonce
    val codeVerifier: String?,                 // PKCE
    val redirectUri: String?,
    val tokenEndpoint: String?,
    val encryptedIdentity: EncryptedPayload?,  // Resolved identity (AES-256-GCM)
    val createdAt: Instant,
    val expiresAt: Instant,                    // 10-minute TTL
)
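The lifecycle and TTL noted in the comments above can be sketched as follows. SessionStatusSketch models only the four states named in the status comment (the real enum may define others, for example for cancelled sessions), java.time is used as an assumed Instant type, and advance and isExpired are hypothetical helpers.

```kotlin
import java.time.Duration
import java.time.Instant

// Only the four states from the status comment; the real enum may have more.
enum class SessionStatusSketch { CREATED, REDIRECTED, CALLBACK_RECEIVED, COMPLETED }

// The 10-minute TTL mentioned on expiresAt.
val SESSION_TTL: Duration = Duration.ofMinutes(10)

// Next state in the linear flow, or null once the session is COMPLETED.
fun advance(status: SessionStatusSketch): SessionStatusSketch? =
    SessionStatusSketch.values().getOrNull(status.ordinal + 1)

// A session is expired once `now` reaches createdAt + TTL.
fun isExpired(createdAt: Instant, now: Instant): Boolean =
    !now.isBefore(createdAt.plus(SESSION_TTL))
```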

Commands

| Command | ID | Description |
| --- | --- | --- |
| CreateReconciliationSessionCommand | identity.reconciliation.create | Start a reconciliation flow, get OIDC authorization URL |
| CompleteReconciliationCommand | identity.reconciliation.complete | Exchange auth code, create/update identity match |
| GetReconciliationSessionCommand | identity.reconciliation.get | Fetch session state |
| CancelReconciliationSessionCommand | identity.reconciliation.cancel | Cancel a reconciliation session |

Persistence

The IDK provides in-memory store implementations for development and testing. For production, the auth bridge service uses PostgreSQL via SQLDelight with the following schema:

| Table | Contents | Protection |
| --- | --- | --- |
| identity_match | HMAC-hashed identifier → internal identity mappings | Identifier never stored in plaintext (HMAC via Key A/B) |
| identity_link_binding | Encrypted profile linking wallet holder to institutional identity | Institution ID encrypted (AES-256-GCM via Key C), canonical claims encrypted in envelope |
| reconciliation_session | OIDC reconciliation flow state | Resolved identity encrypted (AES-256-GCM), 10-minute TTL |
| reconciliation_provider | Provider configuration (client ID, attribute mappings, assurance) | Long-lived config data |
| auxiliary_data | Identity-linked supplemental data (enrollment, grades, etc.) | All fields encrypted at rest (AES-256-GCM via Key C) |

The PostgreSQL stores use @ContributesBinding to replace the IDK's in-memory defaults at AppScope.

Auxiliary Data

The auxiliary data store allows authorized systems to attach encrypted supplemental data to reconciled identities: enrollment records, grades, programme information, or any category-keyed data that should be linked to the identity without being part of the canonical claims.

data class AuxiliaryDataRecord(
    val id: String,
    val tenantId: String,
    val internalIdentityId: String,
    val category: String,                    // e.g., "enrollment", "grades"
    val encryptedPayload: EncryptedPayload,  // AES-256-GCM via Key C
    val schemaVersion: String = "1",
    val createdAt: Instant,
    val updatedAt: Instant,
    val expiresAt: Instant? = null,          // Optional TTL
)

Data is encrypted on write and decrypted on read, so callers never see raw ciphertext. The AuxiliaryDataService orchestrates encryption/decryption via ReconciliationCryptoService.
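An AES-256-GCM round trip of the kind this service performs can be sketched with the JDK's javax.crypto API. EncryptedPayloadSketch, encryptSketch, and decryptSketch are hypothetical names; the real EncryptedPayload layout, key handling through the KMS, and any associated data are not shown and may differ.

```kotlin
import java.security.SecureRandom
import javax.crypto.Cipher
import javax.crypto.spec.GCMParameterSpec
import javax.crypto.spec.SecretKeySpec

// Hypothetical payload shape: a random 12-byte IV plus ciphertext with the GCM tag appended.
class EncryptedPayloadSketch(val iv: ByteArray, val ciphertext: ByteArray)

fun encryptSketch(key: ByteArray, plaintext: String): EncryptedPayloadSketch {
    require(key.size == 32) { "AES-256 needs a 32-byte key" }
    // A fresh IV per encryption is essential for GCM.
    val iv = ByteArray(12).also { SecureRandom().nextBytes(it) }
    val cipher = Cipher.getInstance("AES/GCM/NoPadding")
    cipher.init(Cipher.ENCRYPT_MODE, SecretKeySpec(key, "AES"), GCMParameterSpec(128, iv))
    return EncryptedPayloadSketch(iv, cipher.doFinal(plaintext.toByteArray(Charsets.UTF_8)))
}

fun decryptSketch(key: ByteArray, payload: EncryptedPayloadSketch): String {
    val cipher = Cipher.getInstance("AES/GCM/NoPadding")
    cipher.init(Cipher.DECRYPT_MODE, SecretKeySpec(key, "AES"), GCMParameterSpec(128, payload.iv))
    // doFinal throws if the ciphertext or tag was modified.
    return String(cipher.doFinal(payload.ciphertext), Charsets.UTF_8)
}
```

GCM authenticates as well as encrypts, so a tampered record fails on read instead of decrypting to garbage.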

External Reconciliation API

The auth bridge exposes a REST API at /api/external/v1/reconciliation for authorized third-party systems (student information systems, grade registries, credential validators) to access reconciled identity data.

Authentication

All requests require an OAuth2 client_credentials bearer token with scope reconciliation:read. The client ID is matched to a projection configuration that controls which claims and auxiliary categories the client can see.

Per-Client Projection

Each client sees only the data it is authorized for:

external-api:
clients:
krs-module:
client-id: krs-system
scopes: reconciliation:read
projection:
claims: given_name,family_name,email
auxiliary:
names: enrollment,grades
enrollment: enrollment_status,programme_code,cohort
grades: gpa,credits
can-write: true

Endpoints

| Method | Path | Description |
| --- | --- | --- |
| POST | /lookup | Resolve identity by HMAC-hashed identifier and type |
| GET | /{id} | Full projected identity (canonical claims + auxiliary + assurance) |
| GET | /{id}/claims | Projected canonical claims only |
| GET | /{id}/auxiliary | All projected auxiliary categories |
| GET | /{id}/auxiliary/{category} | Single projected auxiliary category |
| PUT | /{id}/auxiliary/{category} | Store/update auxiliary data (requires canWrite) |
| DELETE | /{id}/auxiliary/{category} | Delete auxiliary category (requires canWrite) |
| DELETE | /{id} | GDPR erasure: deletes all matches, bindings, and auxiliary data. Irreversible. |

Data is decrypted on-the-fly per request and never cached in plaintext.

Token Enrichment

The TokenEnrichmentService reads canonical claims from the PersistedAttributesEnvelope and auxiliary data, merging them into a claim set that the STS (Security Token Service) uses during token minting. This means ID tokens and access tokens issued by the authorization server can carry reconciled identity claims without the relying party needing to call the external API.