Identity Matching & Reconciliation
Identity matching links external identifiers (holder keys, subject IDs, email addresses, claim tuples) to internal identity IDs using privacy-preserving HMAC hashing. Identity reconciliation is the policy engine that decides what to do when a user shows up: accept an existing link, run identity verification, require step-up, or reject.
Identity Matching
How Linking Works
External identifiers are never stored in plaintext. Instead, three domain-separated keys are used:
| Key | Alias | Algorithm | Purpose |
|---|---|---|---|
| Key A | reconciliation:holder | HMAC-SHA256 | Hash holder key identifiers (wallet public keys) |
| Key B | reconciliation:institution | HMAC-SHA256 | Hash institution/external identifiers (subject IDs, emails) |
| Key C | reconciliation:encryption | AES-256-GCM | Encrypt reversible payloads (canonical claims, institution IDs, auxiliary data) |
All keys are managed by the IDK KMS. In production, they should be backed by an HSM or cloud KMS provider. Key aliases are configured via:
identity:
  reconciliation:
    crypto:
      hmac-key-alias: reconciliation-hmac-key
      hmac-key-provider-id: software
      encryption-key-alias: reconciliation-encryption-key
      encryption-key-provider-id: software
The flow is the same for every identifier: the plaintext value is HMAC-hashed with the appropriate domain-separated key, producing a multibase-encoded multihash that is stored in the IdentityMatch record. Lookups repeat the same computation rather than reversing it: the system hashes the incoming identifier with the same key and compares the result against stored hashes. The plaintext identifier is never persisted.
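The hash-then-compare flow can be sketched with the JDK's built-in HMAC support. This is a minimal illustration, not the IDK implementation: `hmacSha256Hex` and `HashIndex` are hypothetical names, the key is passed as raw bytes rather than resolved through the KMS, and the output is hex-encoded instead of the multibase-encoded multihash the real service stores.

```kotlin
import javax.crypto.Mac
import javax.crypto.spec.SecretKeySpec

// Hypothetical helper: HMAC-SHA256 over a UTF-8 identifier, hex-encoded.
// The real service stores a multibase-encoded multihash; hex is used here for brevity.
fun hmacSha256Hex(key: ByteArray, identifier: String): String {
    val mac = Mac.getInstance("HmacSHA256")
    mac.init(SecretKeySpec(key, "HmacSHA256"))
    return mac.doFinal(identifier.toByteArray(Charsets.UTF_8))
        .joinToString("") { "%02x".format(it) }
}

// Hypothetical in-memory index: identifier hash -> internal identity ID.
// Only the hash is kept; the plaintext identifier is discarded after hashing.
class HashIndex(private val key: ByteArray) {
    private val byHash = mutableMapOf<String, String>()

    fun link(identifier: String, internalIdentityId: String) {
        byHash[hmacSha256Hex(key, identifier)] = internalIdentityId
    }

    // Lookup hashes the incoming identifier with the same key and compares hashes.
    fun lookup(identifier: String): String? = byHash[hmacSha256Hex(key, identifier)]
}
```

Because holder keys and institution identifiers use different keys, the same plaintext hashed under Key A and Key B yields unrelated values, so a hash from one domain cannot be matched in the other.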
IdentityMatch
An IdentityMatch is the core link between the external world and your internal identity model. When a user authenticates through their wallet, the wallet's public key fingerprint gets HMAC-hashed and stored as an IdentityMatch. The next time that wallet shows up, the system hashes the key again, finds the match, and knows immediately which internal identity this is, without calling any external service.
Matches are intentionally minimal: they store the hash, the type, and a reference to the internal identity. They don't store the user's name, email, or any other attributes. That's the job of the IdentityLinkBinding (below), which holds encrypted canonical attributes alongside the match.
data class IdentityMatch(
val id: String,
val identifierHash: String, // HMAC hash of external identifier
val identifierType: IdentifierType,
val internalIdentityId: String, // Reference to internal identity
val tenantId: String,
val metadata: Map<String, String>,
val hashKeyVersion: String?, // For key rotation
val createdAt: Instant,
val updatedAt: Instant?,
val lastUsedAt: Instant?,
)
Identifier Types
A single identity can have multiple matches: one for the wallet's public key, another for the institutional subject ID, another for the email address. The IdentifierType distinguishes them:
| Type | What gets hashed | When it's used |
|---|---|---|
| KEY | Wallet public key fingerprint (JWK thumbprint) | Fast-path recognition of returning wallets |
| DID | Decentralized Identifier | Persistent identifier for rotatable DID methods |
| EMAIL | Email address | Matching by email across providers |
| SUBJECT_ID | External subject ID (OIDC sub claim) | Linking to institutional identity providers |
| CLAIM_TUPLE | Composite hash of multiple claims (name + DOB + email) | Matching without a cryptographic key, when wallet doesn't provide holder binding |
The type is a value class, so you can define custom types beyond the built-in set.
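As an illustration of what a value-class type allows, the sketch below wraps a string and exposes the built-in set as constants. `IdType` is a hypothetical stand-in; the actual declaration of IdentifierType is not shown in this document and may differ.

```kotlin
// Hypothetical stand-in for IdentifierType: a value class wrapping a string.
@JvmInline
value class IdType(val value: String) {
    companion object {
        val KEY = IdType("KEY")
        val DID = IdType("DID")
        val EMAIL = IdType("EMAIL")
        val SUBJECT_ID = IdType("SUBJECT_ID")
        val CLAIM_TUPLE = IdType("CLAIM_TUPLE")
    }
}

// A deployment-specific type defined outside the built-in set (example name only).
val STUDENT_NUMBER = IdType("STUDENT_NUMBER")
```

Two values compare equal when their wrapped strings are equal, so a custom type round-trips through storage as a plain string.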
IdentityLinkBinding
While IdentityMatch is the index (the fast lookup from external hash to internal identity), the IdentityLinkBinding is the full dossier. It stores the encrypted canonical attributes (name, email, institutional ID) collected during verification, the assurance metadata from the verification that created it, and versioning information for staleness detection.
The binding is what makes the "fast path" work: when a known wallet returns, the system doesn't just know who they are; it can also decrypt their cached canonical attributes from the binding and project them into OIDC claims immediately, without contacting any external provider. This is why the binding stores encrypted attributes rather than relying on re-fetching them.
data class IdentityLinkBinding(
val id: String,
val tenantId: String,
val matchId: String, // FK to IdentityMatch
// HMAC hashes for lookup (dual-read for key rotation)
val holderIdentifierHash: String,
val holderHashKeyVersion: String,
val institutionIdentifierHash: String?,
val institutionHashKeyVersion: String?,
// Encrypted reversible data
val encryptedInstitutionId: EncryptedPayload?, // AES-256-GCM
val persistedAttributesEnvelope: PersistedAttributesEnvelope, // Encrypted canonical claims
// Provenance
val providerId: String, // Which reconciliation provider created this
val institutionId: String?,
// Assurance metadata
val assuranceSummary: AssuranceSummary?, // LoA + ACR/AMR from verification
// Version tracking for staleness detection
val canonicalSchemaVersion: String?,
val materialProfileVersion: String?,
val selectorRuleVersion: String?,
val materialFingerprints: Set<String>?,
val createdAt: Instant,
val updatedAt: Instant?,
val lastUsedAt: Instant?,
)
AssuranceSummary
Stored on each binding to track the level of assurance from the verification that established it:
data class AssuranceSummary(
val walletAssuranceLevel: String?, // e.g., "high", "substantial", "low"
val oidcAcr: String?, // Authentication Context Class Reference
val oidcAmr: List<String>?, // Authentication Methods References
val executionId: String?, // IDV execution that produced this
val evidenceReferenceHash: String?, // Hash referencing verification evidence
)
This is what powers step-up decisions: if a binding's walletAssuranceLevel is "low" but the current operation requires "substantial", the reconciliation engine triggers a step-up.
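The comparison behind that decision can be sketched as follows. The ordering of levels and the `needsStepUp` helper are assumptions made for illustration; the engine's actual comparison logic is not shown in this document.

```kotlin
// Assumed ordering of assurance levels, lowest to highest.
val assuranceOrder = listOf("low", "substantial", "high")

// Returns true when the stored level is missing, unknown, or below the required level.
fun needsStepUp(storedLevel: String?, requiredLevel: String): Boolean {
    val required = assuranceOrder.indexOf(requiredLevel)
    require(required >= 0) { "Unknown required assurance level: $requiredLevel" }
    // A missing or unrecognized stored level is treated as lower than every known level.
    val stored = storedLevel?.let { assuranceOrder.indexOf(it) } ?: -1
    return stored < required
}
```

Treating a missing or unrecognized stored level as insufficient is a deliberately conservative choice in this sketch: it errs toward asking for step-up rather than accepting an unknown assurance.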
Key Rotation
HMAC keys and encryption keys need to be rotated periodically, whether for compliance requirements, in response to a key compromise, or as general hygiene. The challenge is that rotating an HMAC key changes every hash, which would break all existing lookups.
The ReconciliationCryptoService solves this with dual-read support. During a rotation window, the service can compute hashes with both the current and the previous key. Lookups try the current key first; if no match is found, they retry with the previous key. When a match is found via the old key, it is re-hashed with the new key on the next write. No bulk data migration is needed; bindings migrate lazily as they are accessed.
interface ReconciliationCryptoService {
suspend fun hashHolderKey(holderKey: String): HashedIdentifier
suspend fun hashExternalIdentifier(identifier: String): HashedIdentifier
suspend fun encrypt(plaintext: String): EncryptedPayload
suspend fun decrypt(payload: EncryptedPayload): String
// Dual-read: hash with previous key version for rotation
suspend fun hashHolderKeyWithPrevious(holderKey: String): HashedIdentifier?
suspend fun hashExternalIdentifierWithPrevious(identifier: String): HashedIdentifier?
}
During rotation, lookups try the current key first, then fall back to the previous key. No data migration is required; bindings are re-hashed lazily on next access.
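The dual-read lookup described above can be sketched against an in-memory map. Everything here is hypothetical (`StoredMatch`, `DualReadIndex`, `hmacHex`), the functions are not suspending, and the sketch migrates the record at lookup time for brevity, whereas the text describes re-hashing on the next write.

```kotlin
import javax.crypto.Mac
import javax.crypto.spec.SecretKeySpec

// Hypothetical stand-in for a stored match: hash, key version, internal identity.
data class StoredMatch(val hash: String, val keyVersion: String, val identityId: String)

fun hmacHex(key: ByteArray, value: String): String {
    val mac = Mac.getInstance("HmacSHA256")
    mac.init(SecretKeySpec(key, "HmacSHA256"))
    return mac.doFinal(value.toByteArray(Charsets.UTF_8)).joinToString("") { "%02x".format(it) }
}

class DualReadIndex(
    private val currentKey: ByteArray,
    private val currentVersion: String,
    private val previousKey: ByteArray?, // null outside a rotation window
) {
    private val byHash = mutableMapOf<String, StoredMatch>()

    fun put(match: StoredMatch) { byHash[match.hash] = match }

    fun lookup(identifier: String): StoredMatch? {
        // 1. Try the current key first.
        val currentHash = hmacHex(currentKey, identifier)
        byHash[currentHash]?.let { return it }

        // 2. Fall back to the previous key, if a rotation is in progress.
        val prevKey = previousKey ?: return null
        val old = byHash.remove(hmacHex(prevKey, identifier)) ?: return null

        // 3. Lazily migrate the record to the current key version.
        val migrated = old.copy(hash = currentHash, keyVersion = currentVersion)
        byHash[currentHash] = migrated
        return migrated
    }
}
```

Once the rotation window closes and `previousKey` is dropped, records that were never accessed can no longer be found, so the window length has to cover the expected return interval of active users.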
Commands
| Command | ID | Description |
|---|---|---|
| LookupIdentityMatchCommand | identity.matching.lookup | Look up a match by identifier hash and type |
| CreateIdentityMatchCommand | identity.matching.create | Create a new match record |
| DeleteIdentityMatchCommand | identity.matching.delete | Delete a match |
| ListIdentityMatchesCommand | identity.matching.list | List all matches for an internal identity |
Identity Reconciliation
When a user presents credentials, the system needs to answer: "What do we do with this person?" If they have an existing binding with sufficient assurance, accept them. If they're new, verify their identity. If their binding has expired, ask them to re-verify. If policy says this credential type isn't acceptable, reject them.
These decisions are the domain of the reconciliation engine. Rather than encoding this logic in application code, the engine uses declarative selector rules that match on properties of the incoming request and produce a typed plan. The decision logic is therefore configuration: tenants can have different rules, and rules can be updated without code changes.
Reconciliation Plans
The engine produces one of five plans, each representing a distinct course of action. The plans are a sealed interface, so the compiler enforces that all cases are handled.
sealed interface ReconciliationPlan {
    // Skip reconciliation entirely. Use when wallet credentials
    // are sufficient without identity linking.
    data class SkipReconciliation(...) : ReconciliationPlan
    // Accept the existing identity link binding as-is.
    // Used when the holder is known and assurance is sufficient.
    data class UseExistingBinding(...) : ReconciliationPlan
    // Run a full identity verification workflow via the configured
    // provider (for example, an OIDC provider).
    // Creates or updates an identity link binding on success.
    data class RunIdv(
        val providerId: String,
        val materialProfileId: String,
        val minimumAssurance: String?, // LoA threshold, e.g., "substantial"
        val bindingPolicy: BindingPolicy, // REUSE_OR_CREATE, CREATE_NEW, REUSE_ONLY
        ...
    ) : ReconciliationPlan
    // Upgrade an existing binding to a higher assurance level
    // without full re-verification.
    data class StepUp(
        val providerId: String,
        val materialProfileId: String,
        ...
    ) : ReconciliationPlan
    // Explicitly reject the reconciliation attempt.
    data class FailClosed(
        val reason: String,
        ...
    ) : ReconciliationPlan
}
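To show what the compiler-enforced exhaustiveness looks like in practice, here is a sketch with a trimmed, hypothetical mirror of the plan hierarchy. `Plan` and its fields are illustrative only; the elided fields of the real classes are not reproduced.

```kotlin
// Simplified, hypothetical mirror of the plan hierarchy; fields are trimmed for illustration.
sealed interface Plan {
    object Skip : Plan
    data class UseExisting(val bindingId: String) : Plan
    data class RunIdv(val providerId: String, val minimumAssurance: String?) : Plan
    data class StepUp(val providerId: String) : Plan
    data class FailClosed(val reason: String) : Plan
}

// No `else` branch: adding a sixth plan makes this `when` expression fail to compile
// until the new case is handled.
fun describe(plan: Plan): String = when (plan) {
    Plan.Skip -> "skip reconciliation"
    is Plan.UseExisting -> "accept binding ${plan.bindingId}"
    is Plan.RunIdv -> "run IDV via ${plan.providerId} (min assurance: ${plan.minimumAssurance ?: "none"})"
    is Plan.StepUp -> "step up via ${plan.providerId}"
    is Plan.FailClosed -> "reject: ${plan.reason}"
}
```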
Selector Rules
Selector rules are the reconciliation engine's decision table. Each rule specifies a set of conditions (which tenant, what kind of credential, whether the holder is known) and a plan to execute when those conditions match. Rules are evaluated in priority order; the first match wins.
The power of this approach is composability. You can have a high-priority rule that fast-tracks known holders, a medium-priority rule that sends unknown holders through institutional OIDC verification, a rule for expired bindings that triggers email verification as a step-up, and a low-priority catch-all that rejects everything else. Each rule is independent and testable.
Each rule is evaluated against the incoming context, and the first matching rule (by priority) determines the plan. A rule has the following shape:
data class ReconciliationSelectorRule(
val id: String,
val enabled: Boolean = true,
val priority: Int = 0, // Higher = tried first
// Match conditions (null = match all)
val tenants: Set<String>?,
val entryPointTypes: Set<String>?, // WALLET_OID4VP, FEDERATED_OIDC, etc.
val triggerTypes: Set<String>?, // ONBOARDING, STEP_UP, REVALIDATION
val credentialTypes: Set<String>?, // eu.europa.ec.eudi.pid.1, etc.
val issuers: Set<String>?, // Regex patterns against issuer
val knownHolderStates: Set<KnownHolderState>?, // MATCHED_HOLDER_KEY, NOT_FOUND, etc.
val attributePredicates: List<AttributePredicate>?,
// Output
val plan: ReconciliationPlanTemplate,
)
KnownHolderState
Before rule evaluation, the system checks whether the holder already has a binding:
| State | Meaning |
|---|---|
| MATCHED_HOLDER_KEY | Existing binding found via cryptographic holder key hash |
| MATCHED_CLAIM_TUPLE | Existing binding found via claim tuple hash |
| NOT_FOUND | No existing binding |
| EXPIRED_BINDING | Binding exists but has expired |
This is the key input for step-up decisions: a rule can match on MATCHED_HOLDER_KEY with a condition that the stored assurance is below a threshold, producing a StepUp plan.
Rule Evaluation
object ReconciliationSelector {
fun evaluate(
rules: List<ReconciliationSelectorRule>,
input: ReconciliationSelectorInput,
ruleVersion: String
): ReconciliationPlan?
}
The selector:
- Filters enabled rules
- Matches each rule against the input (tenant, credential types, issuers, holder state, attribute predicates)
- Sorts by priority (descending), then by ID (ascending)
- Returns the first match's plan, or null if no rules match
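The steps above can be sketched as a short pipeline. `Rule`, `Input`, and `selectDecision` are hypothetical, trimmed-down stand-ins: only two match conditions are modelled, and the plan is reduced to a decision string.

```kotlin
// Trimmed, hypothetical rule: only two of the match conditions are modelled.
data class Rule(
    val id: String,
    val priority: Int = 0,
    val enabled: Boolean = true,
    val tenants: Set<String>? = null,           // null = match all
    val knownHolderStates: Set<String>? = null, // null = match all
    val decision: String,
)

data class Input(val tenant: String, val holderState: String)

// A null condition matches everything; a non-null set must contain the input value.
fun Rule.matches(input: Input): Boolean =
    (tenants == null || input.tenant in tenants) &&
        (knownHolderStates == null || input.holderState in knownHolderStates)

fun selectDecision(rules: List<Rule>, input: Input): String? =
    rules.asSequence()
        .filter { it.enabled }
        .filter { it.matches(input) }
        // Highest priority first; ties are broken by ascending ID for determinism.
        .sortedWith(compareByDescending<Rule> { it.priority }.thenBy { it.id })
        .firstOrNull()
        ?.decision
```

The ID tie-break means two rules with equal priority always resolve the same way, regardless of the order in which they were loaded.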
Example Rules
The following example shows a typical rule set for a multi-method deployment. The first rule fast-tracks returning users. The second sends new holders through a full IDV use case that might involve OIDC, document scanning, biometric checks, or email verification. The rule doesn't prescribe which methods to use; it references an IDV use case that defines the verification graph. The third handles expired bindings with a lighter step-up. The fourth is a catch-all.
[
{
"id": "known-holder-accept",
"priority": 100,
"knownHolderStates": ["MATCHED_HOLDER_KEY"],
"plan": { "decision": "USE_EXISTING_BINDING" }
},
{
"id": "new-holder-idv",
"priority": 50,
"knownHolderStates": ["NOT_FOUND"],
"entryPointTypes": ["WALLET_OID4VP"],
"plan": {
"decision": "RUN_IDV",
"providerId": "onboarding-idv",
"materialProfileId": "standard-onboarding",
"minimumAssurance": "substantial",
"bindingPolicy": "REUSE_OR_CREATE"
}
},
{
"id": "expired-step-up",
"priority": 75,
"knownHolderStates": ["EXPIRED_BINDING"],
"plan": {
"decision": "STEP_UP",
"providerId": "email-reverification",
"materialProfileId": "standard-onboarding"
}
},
{
"id": "fallback-deny",
"priority": 0,
"plan": {
"decision": "FAIL_CLOSED",
"failReason": "No matching reconciliation rule"
}
}
]
The providerId in the RunIdv plan doesn't have to be an OIDC provider; it references whatever IDV use case the deployment has configured. That use case might be a single OIDC login, an email OTP followed by a document scan, a biometric liveness check, or any graph of verification methods. The reconciliation engine is IDV-method-agnostic; it only decides whether to run IDV and with what assurance threshold.
Reconciliation Providers
When reconciliation triggers IDV via the RunIdv or StepUp plan, the actual verification can take many forms. In the simplest case, it's a redirect to an OIDC provider (institutional login). In more complex deployments, it's a multi-step IDV workflow involving document scanning, biometric liveness, email verification, or combinations of these. The ReconciliationProvider configuration describes the OIDC-based path; for richer verification flows, the provider references an IDV use case that defines the full verification graph.
The provider also defines attribute mappings, which specify how claims from the verification source are normalized into canonical form for storage in the encrypted binding:
data class ReconciliationProvider(
val id: String,
val name: String?,
val oidcClientId: String,
val identifierAttributeName: String = "sub",
val enabled: Boolean = true,
val attributeMappings: List<ReconciliationAttributeMapping>,
val userInfoAttributeMappings: List<ReconciliationAttributeMapping>,
val assuranceAcr: String?, // ACR value for this provider
val assuranceAmr: List<String>?, // AMR values for this provider
)
Attribute mappings control how provider claims are normalized into canonical form:
data class ReconciliationAttributeMapping(
val source: String, // Claim from provider, e.g., "eduid"
val target: String, // Canonical name, e.g., "institutional_id"
val identifierType: String?, // e.g., "SUBJECT_ID", "EMAIL"
val required: Boolean = false,
)
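A sketch of how such mappings might be applied to a provider's claim set follows. `AttributeMapping` copies the fields shown above, while `normalizeClaims` and its error handling are assumptions made for illustration.

```kotlin
// Mirrors the fields of ReconciliationAttributeMapping shown above.
data class AttributeMapping(
    val source: String,
    val target: String,
    val identifierType: String? = null,
    val required: Boolean = false,
)

// Hypothetical normalizer: renames provider claims to canonical names and
// fails when a required source claim is absent. Unmapped claims are dropped.
fun normalizeClaims(
    providerClaims: Map<String, String>,
    mappings: List<AttributeMapping>,
): Map<String, String> {
    val canonical = mutableMapOf<String, String>()
    for (mapping in mappings) {
        val value = providerClaims[mapping.source]
        if (value == null) {
            require(!mapping.required) { "Missing required claim: ${mapping.source}" }
            continue
        }
        canonical[mapping.target] = value
    }
    return canonical
}
```

Dropping unmapped claims keeps the encrypted binding limited to attributes the deployment has explicitly opted into.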
Material Profiles
A material profile defines the recipe for constructing an identity link binding: which identifiers to HMAC-hash for future lookup and which attributes to encrypt for storage. This is separate from the provider configuration because the same verification result can be materialized differently depending on the deployment's requirements.
For example, one deployment might hash both the wallet key and the institutional subject ID (two lookup paths), while another might also add a claim-tuple hash based on name + date of birth (a third lookup path for matching users who switch wallets).
sealed interface ReconciliationMaterial {
data class HolderKeyMaterial(val hmacDomain: String = "holder")
data class ProviderSubjectMaterial(val providerId: String, val hmacDomain: String = "institution")
data class AttributeTupleMaterial(
val attributePaths: List<String>,
val normalizationProfile: String,
val saltRef: String,
val hmacDomain: String,
val minRequiredAttributes: Int,
)
data class CredentialAttributeTupleMaterial(
val credentialQueryId: String?,
val attributePaths: List<String>,
val normalizationProfile: String,
val saltRef: String,
val hmacDomain: String,
val minRequiredAttributes: Int,
)
}
HolderKeyMaterial hashes the wallet's public key. ProviderSubjectMaterial hashes the OIDC provider's subject ID. AttributeTupleMaterial creates a composite hash from multiple attributes (e.g., email + date of birth + name) for matching without a cryptographic key. CredentialAttributeTupleMaterial takes the same parameters plus an optional credentialQueryId.
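A composite claim-tuple hash could look roughly like the sketch below. The normalization (trim and lowercase), the unit-separator join, and the `claimTupleHash` name are assumptions; the actual normalization profiles are not described in this document, and the salt referenced by saltRef is not modelled.

```kotlin
import javax.crypto.Mac
import javax.crypto.spec.SecretKeySpec

// Hypothetical composite hash. Normalization (trim + lowercase) and the
// unit-separator join are assumptions, not the library's normalization profile.
fun claimTupleHash(
    key: ByteArray,
    attributePaths: List<String>,
    attributes: Map<String, String>,
    minRequiredAttributes: Int,
): String? {
    // Walk the configured paths in order so the result does not depend on map ordering.
    val present = attributePaths.mapNotNull { path ->
        attributes[path]?.trim()?.lowercase()?.takeIf { it.isNotEmpty() }?.let { "$path=$it" }
    }
    // Too few attributes would make the tuple too weak to identify anyone.
    if (present.size < minRequiredAttributes) return null

    val mac = Mac.getInstance("HmacSHA256")
    mac.init(SecretKeySpec(key, "HmacSHA256"))
    return mac.doFinal(present.joinToString("\u001f").toByteArray(Charsets.UTF_8))
        .joinToString("") { "%02x".format(it) }
}
```

Note that in this sketch a tuple with an optional attribute missing hashes differently from the full tuple, so a deployment would need to decide whether partial tuples are stored as separate lookup paths.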
Reconciliation Session
The reconciliation flow is managed through a session with its own lifecycle:
data class ReconciliationSession(
val id: String,
val tenantId: String,
val status: ReconciliationSessionStatus, // CREATED → REDIRECTED → CALLBACK_RECEIVED → COMPLETED
val identifierHash: String,
val identifierType: IdentifierType,
val providerId: String,
val authorizationUrl: String?, // OIDC authorization URL
val state: String?, // OIDC state
val nonce: String?, // OIDC nonce
val codeVerifier: String?, // PKCE
val redirectUri: String?,
val tokenEndpoint: String?,
val encryptedIdentity: EncryptedPayload?, // Resolved identity (AES-256-GCM)
val createdAt: Instant,
val expiresAt: Instant, // 10-minute TTL
)
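The lifecycle and TTL noted in the comments above can be sketched as follows. This assumes a strictly linear progression through the four statuses shown and uses `java.time`; the real status enum may contain further states (for example for cancelled or failed sessions), and the library's Instant type may differ.

```kotlin
import java.time.Duration
import java.time.Instant

// Only the four statuses named in the comment above; others may exist.
enum class SessionStatus { CREATED, REDIRECTED, CALLBACK_RECEIVED, COMPLETED }

// Assumed linear lifecycle: each status may only advance to the next one.
fun canTransition(from: SessionStatus, to: SessionStatus): Boolean =
    to.ordinal == from.ordinal + 1

// 10-minute TTL, as noted on expiresAt.
val sessionTtl: Duration = Duration.ofMinutes(10)

fun expiresAt(createdAt: Instant): Instant = createdAt.plus(sessionTtl)

// A session is expired once `now` reaches or passes its expiry instant.
fun isExpired(expiresAt: Instant, now: Instant): Boolean = !now.isBefore(expiresAt)
```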
Commands
| Command | ID | Description |
|---|---|---|
| CreateReconciliationSessionCommand | identity.reconciliation.create | Start a reconciliation flow, get OIDC authorization URL |
| CompleteReconciliationCommand | identity.reconciliation.complete | Exchange auth code, create/update identity match |
| GetReconciliationSessionCommand | identity.reconciliation.get | Fetch session state |
| CancelReconciliationSessionCommand | identity.reconciliation.cancel | Cancel a reconciliation session |
Persistence
The IDK provides in-memory store implementations for development and testing. For production, the auth bridge service uses PostgreSQL via SQLDelight with the following schema:
| Table | Contents | Protection |
|---|---|---|
| identity_match | HMAC-hashed identifier → internal identity mappings | Identifier never stored in plaintext (HMAC via Key A/B) |
| identity_link_binding | Encrypted profile linking wallet holder to institutional identity | Institution ID encrypted (AES-256-GCM via Key C), canonical claims encrypted in envelope |
| reconciliation_session | OIDC reconciliation flow state | Resolved identity encrypted (AES-256-GCM), 10-minute TTL |
| reconciliation_provider | Provider configuration (client ID, attribute mappings, assurance) | Long-lived config data |
| auxiliary_data | Identity-linked supplemental data (enrollment, grades, etc.) | All fields encrypted at rest (AES-256-GCM via Key C) |
The PostgreSQL stores use @ContributesBinding to replace the IDK's in-memory defaults at AppScope.
Auxiliary Data
The auxiliary data store allows authorized systems to attach encrypted supplemental data to reconciled identities: enrollment records, grades, programme information, or any category-keyed data that should be linked to the identity without being part of the canonical claims.
data class AuxiliaryDataRecord(
val id: String,
val tenantId: String,
val internalIdentityId: String,
val category: String, // e.g., "enrollment", "grades"
val encryptedPayload: EncryptedPayload, // AES-256-GCM via Key C
val schemaVersion: String = "1",
val createdAt: Instant,
val updatedAt: Instant,
val expiresAt: Instant? = null, // Optional TTL
)
Data is encrypted on write and decrypted on read; callers never see raw ciphertext. The AuxiliaryDataService orchestrates encryption/decryption via ReconciliationCryptoService.
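For readers unfamiliar with AES-256-GCM, the encrypt-on-write, decrypt-on-read round trip can be sketched with the JDK's `javax.crypto` API. `SealedPayload` and `AesGcmBox` are hypothetical; the real EncryptedPayload structure and KMS-backed key handling are not shown in this document.

```kotlin
import java.security.SecureRandom
import javax.crypto.Cipher
import javax.crypto.spec.GCMParameterSpec
import javax.crypto.spec.SecretKeySpec

// Hypothetical payload shape; the real EncryptedPayload is not shown in this document.
class SealedPayload(val iv: ByteArray, val ciphertext: ByteArray)

class AesGcmBox(keyBytes: ByteArray) {
    private val key = SecretKeySpec(keyBytes, "AES") // 32 bytes for AES-256
    private val random = SecureRandom()

    fun encrypt(plaintext: String): SealedPayload {
        // A fresh 96-bit nonce per message; reusing a nonce under the same key breaks GCM.
        val iv = ByteArray(12).also { random.nextBytes(it) }
        val cipher = Cipher.getInstance("AES/GCM/NoPadding")
        cipher.init(Cipher.ENCRYPT_MODE, key, GCMParameterSpec(128, iv))
        return SealedPayload(iv, cipher.doFinal(plaintext.toByteArray(Charsets.UTF_8)))
    }

    fun decrypt(payload: SealedPayload): String {
        val cipher = Cipher.getInstance("AES/GCM/NoPadding")
        cipher.init(Cipher.DECRYPT_MODE, key, GCMParameterSpec(128, payload.iv))
        // doFinal verifies the 128-bit authentication tag and throws if the data was tampered with.
        return String(cipher.doFinal(payload.ciphertext), Charsets.UTF_8)
    }
}
```

Because GCM is authenticated, a modified ciphertext fails decryption outright instead of yielding corrupted plaintext.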
External Reconciliation API
The auth bridge exposes a REST API at /api/external/v1/reconciliation for authorized third-party systems (student information systems, grade registries, credential validators) to access reconciled identity data.
Authentication
All requests require an OAuth2 client_credentials bearer token with scope reconciliation:read. The client ID is matched to a projection configuration that controls which claims and auxiliary categories the client can see.
Per-Client Projection
Each client sees only the data it is authorized for:
external-api:
clients:
krs-module:
client-id: krs-system
scopes: reconciliation:read
projection:
claims: given_name,family_name,email
auxiliary:
names: enrollment,grades
enrollment: enrollment_status,programme_code,cohort
grades: gpa,credits
can-write: true
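The effect of a projection can be sketched as a pair of filters over the decrypted data. `Projection`, `projectClaims`, and `projectAuxiliary` are hypothetical names, and all values are modelled as strings for simplicity.

```kotlin
// Hypothetical projection config, mirroring the YAML keys above.
data class Projection(
    val claims: Set<String>,
    val auxiliary: Map<String, Set<String>>, // category -> visible fields
)

// Keep only the canonical claims the client is allowed to see.
fun projectClaims(all: Map<String, String>, projection: Projection): Map<String, String> =
    all.filterKeys { it in projection.claims }

// Keep only allowed categories, and within each category only allowed fields.
fun projectAuxiliary(
    all: Map<String, Map<String, String>>,
    projection: Projection,
): Map<String, Map<String, String>> =
    all.filterKeys { it in projection.auxiliary }
        .mapValues { (category, fields) ->
            fields.filterKeys { it in projection.auxiliary.getValue(category) }
        }
```

Filtering is allow-list based in this sketch: anything not named in the projection is dropped, so newly added claims or categories stay hidden until a client is explicitly granted access.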
Endpoints
| Method | Path | Description |
|---|---|---|
| POST | /lookup | Resolve identity by HMAC-hashed identifier and type |
| GET | /{id} | Full projected identity (canonical claims + auxiliary + assurance) |
| GET | /{id}/claims | Projected canonical claims only |
| GET | /{id}/auxiliary | All projected auxiliary categories |
| GET | /{id}/auxiliary/{category} | Single projected auxiliary category |
| PUT | /{id}/auxiliary/{category} | Store/update auxiliary data (requires canWrite) |
| DELETE | /{id}/auxiliary/{category} | Delete auxiliary category (requires canWrite) |
| DELETE | /{id} | GDPR erasure: deletes all matches, bindings, and auxiliary data. Irreversible. |
Data is decrypted on-the-fly per request and never cached in plaintext.
Token Enrichment
The TokenEnrichmentService reads canonical claims from the PersistedAttributesEnvelope and auxiliary data, merging them into a claim set that the STS (Security Token Service) uses during token minting. This means ID tokens and access tokens issued by the authorization server can carry reconciled identity claims without the relying party needing to call the external API.