This is an overview page with metadata for this scientific paper. The full article is available from the publisher.

Use of claims-based machine learning to assign pancreatic cancer stage for enhanced outcomes research in Medicare.

2026 · 0 citations · Journal of Clinical Oncology

Abstract

Abstract 676

Background: Administrative claims are a rich source of data on real-world oncology costs, patient journeys, and treatment use, but the diagnosis codes on claims are not detailed enough to support research on differences in outcomes and cost by stage. Using machine learning to assign cancer stage in claims could expand opportunities for population research. This work sought to build and validate a predictive machine-learning algorithm that determines patients’ stage at pancreatic cancer diagnosis using claims data alone.

Methods: Incident pancreatic cancer cases diagnosed in 2016-2017 were identified using SEER-Medicare data. Patients were excluded if they had <1 month of Medicare Parts A/B/D enrollment in 2016-2017, <12 months of A/B/D enrollment prior to diagnosis, cancer-related treatment within one year of index, or prior cancer diagnoses. Claims were flagged for evidence, frequency, and timing of cancer-related surgeries, anti-cancer therapies, radiation therapy, hospice, and death. These flags, together with demographics, frailty-related diagnoses, and nursing home residence, were included as predictors of pancreatic cancer stage under two schemes: American Joint Committee on Cancer (AJCC) stage, and local (L), regional (R), and distant (D) stage (LRD). Analysis was performed in R Statistical Software (v4.1.2; R Core Team 2021) using predictive multinomial logistic regression (nnet package; Venables and Ripley 2002). Models were trained on 70% of the sample and tested on the remaining 30%.

Results: The initial model accuracy was 65.9% [95% CI: 63.1%-68.6%] when predicting SEER-derived AJCC stages 1A, 1B, 2A, 2B, 3, and 4 (n = 1,173). Collapsing the stages into two groups (1/2 vs. 3/4) increased accuracy to 88.5% [95% CI: 86.5%-90.3%]. With LRD staging, model accuracy reached 78.2% [95% CI: 75.7%-80.5%], peaking at 86.7% [95% CI: 85.6%-88.6%] when collapsed to a binary model (L/R vs. D). Both binary models performed as well as or better than previously published two-tier cancer staging models.

Conclusions: Machine-learning algorithms are viable tools for assigning patients’ pancreatic cancer AJCC stages at diagnosis in claims data, enabling broader population research on costs and clinical outcomes by stage. Model precision relies on clear clinical patterns in claims that reflect subtle, stage-dependent differences in treatment pathways. These differences can be challenging to discern for cancers like pancreatic cancer, for which current diagnostic patterns leave fewer early-stage patients in the data to train models.
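The accuracy intervals reported in the Results can be sanity-checked with a short calculation. The sketch below is an illustration only, not the authors' code: the abstract does not say which interval method was used, so the Wilson score interval is an assumed choice, and treating n = 1,173 as the number of held-out test cases is likewise an assumption. Python is used here for convenience, whereas the original analysis was run in R. The names `wilson_interval`, `n_test`, and `n_correct` are hypothetical.

```python
import math


def wilson_interval(correct: int, total: int, z: float = 1.959964) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (default z gives ~95%)."""
    if total <= 0 or not 0 <= correct <= total:
        raise ValueError("need total > 0 and 0 <= correct <= total")
    p = correct / total
    z2 = z * z
    denom = 1 + z2 / total
    center = (p + z2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z2 / (4 * total * total)) / denom
    return center - half, center + half


# Assumption: n = 1,173 is the size of the 30% test set (not stated in the abstract).
n_test = 1173
# Back-calculated from the reported 65.9% accuracy; the true count is not published.
n_correct = round(0.659 * n_test)

low, high = wilson_interval(n_correct, n_test)
print(f"accuracy = {n_correct / n_test:.1%}, 95% CI = [{low:.1%}, {high:.1%}]")
```

Under these assumptions the interval lands close to the reported 63.1%-68.6% for the six-stage AJCC model, which is consistent with n = 1,173 being the test-set size, though it does not prove it.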

Topics

Pancreatic and Hepatic Oncology Research
Artificial Intelligence in Healthcare and Education
Medical Coding and Health Information
