Category: Plasma Cell precursor and Other Disorders

(PA-366) CALM: A Deep Learning Time Series Model for Predicting Progression from Smoldering Multiple Myeloma Using Clinical Data

Thursday, September 18, 2025

Anish K. Simhal, PhD (he/him/his)

Postdoctoral Fellow
Memorial Sloan Kettering Cancer Center

Introduction:

Current methods for identifying patients with smoldering multiple myeloma (SMM) at high risk (HR) of progression rely primarily on blood- and bone marrow-based assessments of tumor burden and do not fully leverage the dynamic, longitudinal data routinely generated as SMM patients are monitored in clinical practice. We hypothesized that standard clinical data contain hidden features that could identify patients at high risk of progression. To test this hypothesis, we developed CALM (Cancer AI Longitudinal Modeling), a ‘digital twin’ deep learning model designed to predict the risk of progression from SMM to multiple myeloma (MM) using serial laboratory measurements rather than values available at diagnosis alone.

Methods:

We retrospectively analyzed clinical data from 438 patients diagnosed with SMM between 2002 and 2019, with a median follow-up of 4.24 years and a median of 11 laboratory assessments per patient (median interval between measurements: 92 days). At last follow-up, 185 patients (42.2%) had progressed to MM, while 253 (57.8%) were censored. CALM uses time series measurements of key biomarkers: serum M-protein (M-spike), involved and uninvolved free light chains (FLC), and derived features such as the FLC ratio and log-transformed FLC ratio, along with additional routine laboratory values. The model is a Long Short-Term Memory (LSTM) neural network, a recurrent architecture designed to recognize patterns and trends in sequential data, with sequences padded to accommodate the differing number of assessments per patient. The LSTM processes each patient’s longitudinal data as a sequence, learning to recognize temporal changes that may signal impending progression. The model was trained with a Cox proportional hazards loss function and evaluated using 5-fold cross-validation.
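The preprocessing and loss described above can be illustrated with a minimal sketch. It is not the CALM implementation: the field names (`m_spike`, `involved_flc`, `uninvolved_flc`), the definition of the FLC ratio as involved over uninvolved, the zero padding value, and the Breslow-style form of the Cox loss are all assumptions made for illustration, and the LSTM itself is omitted. The sketch derives the ratio features for one visit, right-pads variable-length visit sequences with a mask, and computes the negative log partial likelihood that a Cox-type loss minimizes.

```python
import math

def derive_features(visit):
    """Build one feature row from a visit (hypothetical field names).

    Assumes the FLC ratio is involved / uninvolved, both strictly positive.
    """
    ratio = visit["involved_flc"] / visit["uninvolved_flc"]
    return [visit["m_spike"], visit["involved_flc"],
            visit["uninvolved_flc"], ratio, math.log(ratio)]

def pad_sequences(sequences, pad_value=0.0):
    """Right-pad per-patient visit sequences to a common length.

    Returns the padded sequences and a 0/1 mask marking real visits,
    so that a downstream model can ignore the padded steps.
    """
    max_len = max(len(seq) for seq in sequences)
    n_features = len(sequences[0][0])
    padded, mask = [], []
    for seq in sequences:
        n_pad = max_len - len(seq)
        padded.append(seq + [[pad_value] * n_features for _ in range(n_pad)])
        mask.append([1] * len(seq) + [0] * n_pad)
    return padded, mask

def cox_neg_log_partial_likelihood(risk_scores, times, events):
    """Negative log partial likelihood (Breslow form), averaged over events.

    risk_scores: model outputs (log relative hazards), one per patient
    times:       follow-up time per patient
    events:      1 if progression was observed, 0 if censored
    """
    loss, n_events = 0.0, 0
    for i, (t_i, e_i) in enumerate(zip(times, events)):
        if not e_i:
            continue  # censored patients contribute only through risk sets
        # Risk set: every patient still under observation at time t_i.
        log_risk_set = math.log(
            sum(math.exp(r) for r, t in zip(risk_scores, times) if t >= t_i)
        )
        loss -= risk_scores[i] - log_risk_set
        n_events += 1
    return loss / max(n_events, 1)
```

In this form the loss falls when patients who progress earlier receive higher risk scores than those still at risk, which is the ordering a survival model is meant to learn.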

Results:

Risk scores computed by CALM from the data associated with the last sequence achieved a mean concordance index (C-index) of 0.833 ± 0.018, demonstrating strong discrimination for predicting progression to MM. This represents an 11% improvement in predictive performance over baseline risk assessment using the Mayo 2/20/20 criteria alone in the same cohort. To interpret model predictions, we applied Captum’s integrated gradients algorithm to quantify the contribution of each feature at each time point to the predicted risk for an individual patient. Higher values of absolute M-spike and the log-transformed FLC ratio were strongly associated with increased risk of progression, with the model showing a clear preference for the log FLC ratio over the raw ratio as a predictive variable.
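For readers unfamiliar with the C-index, the following is a minimal sketch of Harrell's concordance index for right-censored data. It is an illustration under stated assumptions rather than the evaluation code behind the reported value: the function name and the handling of ties (tied risk scores count as half-concordant, tied times are not compared) are choices made here, not details given in the abstract.

```python
def concordance_index(risk_scores, times, events):
    """Harrell's C-index: the fraction of comparable patient pairs
    in which the patient who progressed earlier has the higher risk score.

    A pair (i, j) is comparable when patient i has an observed event and
    patient j was followed for longer than i (with or without an event).
    """
    concordant, comparable = 0.0, 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[j] > times[i]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs; C-index is undefined")
    return concordant / comparable
```

A value of 0.5 corresponds to uninformative scores and 1.0 to a perfect ranking of progression times, which gives a scale for reading the reported 0.833.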

Conclusions:

CALM leverages routinely collected serial laboratory data to dynamically update individualized risk predictions for SMM patients. This ‘digital twin’ approach not only provides improved risk stratification over static models but also offers clinicians interpretable feedback on which laboratory trends are driving risk, potentially informing more personalized monitoring and therapeutic decision-making in clinical practice.