(C-109) Out of Site, Out of Mind? How Tokenized Linkage Activates Patient Observations across Multiple Real-World Data Sources - the Komodo Research Dataset Example
Background: Federated data networks have become a competitive, modern solution to sample size challenges in real-world evidence generation. Although they preserve privacy, their distributed nature limits the ability to follow patients over time beyond individual data sources and, in the case of claims databases, across multiple insurers. Data tokenization is an emerging technology that enables advanced linkage to address these issues securely.
Objectives: Using a new-generation claims database as an example, this study examined the impact of tokenized linkage on sample size and longitudinality in epidemiologic evaluations.
Methods: This retrospective cohort study assessed the Komodo Research Dataset, an insurance claims database with pseudonymized linkage across sources and health plans enabled by tokenized personal identifiers. Eligible members contributed claims from payer-complete sources, had information available for ≥1 payer, and had ≥1 day of enrollment in both medical and drug plans between January 2016 and October 2024. An enrollment span was defined as the continuous period during which a member had simultaneous medical and drug coverage, with gaps of ≤45 days allowed. Distributions of enrollment spans with and without tokenized linkage upon payer change, as well as span extensions at both the person and span levels, were analyzed.
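The span construction described in the Methods can be illustrated with a minimal Python sketch. It is not the study's implementation: the record layout, the token and payer labels, and the function names are hypothetical, and a "gap" is assumed to mean the number of uncovered days between two coverage periods. Each input period is assumed to already represent simultaneous medical and drug coverage.

```python
from collections import defaultdict
from datetime import date

MAX_GAP_DAYS = 45  # gap allowance stated in the Methods


def merge_spans(periods, max_gap_days=MAX_GAP_DAYS):
    """Merge (start, end) periods separated by at most max_gap_days uncovered days."""
    merged = []
    for start, end in sorted(periods):
        if merged:
            last_start, last_end = merged[-1]
            # Assumption: gap = uncovered days strictly between the two periods.
            gap = (start - last_end).days - 1
            if gap <= max_gap_days:
                merged[-1] = (last_start, max(last_end, end))
                continue
        merged.append((start, end))
    return merged


def enrollment_spans(records, link_across_payers):
    """Build spans per member (linked) or per member and payer (payer-segmented)."""
    groups = defaultdict(list)
    for member, payer, start, end in records:
        key = member if link_across_payers else (member, payer)
        groups[key].append((start, end))
    spans = []
    for periods in groups.values():
        spans.extend(merge_spans(periods))
    return spans


def span_days(span):
    """Length of a span in days, counting both endpoints."""
    start, end = span
    return (end - start).days + 1


# Hypothetical records: (patient token, payer, coverage start, coverage end).
records = [
    ("tok_A", "payer_1", date(2018, 1, 1), date(2018, 12, 31)),
    ("tok_A", "payer_2", date(2019, 1, 15), date(2019, 12, 31)),
    ("tok_B", "payer_1", date(2020, 1, 1), date(2020, 6, 30)),
]

segmented = enrollment_spans(records, link_across_payers=False)
linked = enrollment_spans(records, link_across_payers=True)
print(len(segmented), len(linked))
print(max(span_days(s) for s in linked))
```

In this toy example, the member who changes payer contributes two separate spans without linkage and a single, longer span with linkage, which is the mechanism by which linkage lowers span counts and raises mean span length.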
Results: Among 197,556,385 individuals identified (as represented by unique patient keys), the mean age was 34.7 years and 52.2% were female on the first day of their most recent payer-segmented enrollment span. After cross-payer linkage, the number of accrued enrollment spans fell by 10.6% through consolidation, from 316,099,620 to 282,445,948, and the mean length of these spans increased by 34.9%, from 789 to 1,064 days. On average, enrollment span length increased by 554 days per patient from the shortest single-payer span to the longest mixed-payer span, although 47.2% of individuals had no change. Sensitivity analyses suggested a 23.9% decrease in span count and a 68.0% increase in mean span length when all gaps were bridged. In subgroup analyses of oncology, cardiology, and endocrinology patients, mean span length increased by 34.2% to 34.5%, starting from much longer spans of 901 to 1,007 days.
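The headline percentages can be reproduced from the reported counts and means. The short Python check below is an illustrative recomputation only; the helper name `pct_change` is introduced here for convenience and is not part of the study.

```python
def pct_change(before, after):
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100


# Figures as reported in the Results.
span_count_change = pct_change(316_099_620, 282_445_948)
mean_length_change = pct_change(789, 1064)

# A 68.0% increase in mean span length (all gaps bridged) expressed as a multiple.
bridged_multiple = 1 + 68.0 / 100

print(round(span_count_change, 1))   # -10.6
print(round(mean_length_change, 1))  # 34.9
print(round(bridged_multiple, 1))    # 1.7
```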
Conclusions: This study quantified the effect of tokenized linkage on a claims database. Encrypted by design, a tokenized dataset was shown to securely connect periods of patient observation from multiple data streams. With hundreds of millions of observable lives and time in the database extended up to 1.7-fold, the Komodo Research Dataset proved robust and offers demonstrable advantages of scale and continuity among real-world data sources.