A Radiomics Machine Learning Model for Post-Radiotherapy Overall Survival Prediction of Non-Small Cell Lung Cancer (NSCLC)

Purpose: To predict the post-radiotherapy overall survival group of NSCLC patients from clinical information and radiomics analysis of the simulation CT.
Materials/Methods: A total of 258 non-adenocarcinoma patients who received radical radiotherapy or chemoradiation were studied; 45, 50, and 163 patients were assigned to the short (0-6 months), mid (6-12 months), and long (12+ months) survival groups, respectively. For each patient, 76 radiomics features were extracted from the gross tumor volume (GTV) delineated on the simulation CT and combined with clinical information (age, overall stage, and GTV volume) into a patient-specific feature vector, which served as input to a two-step machine learning model for survival group prediction. The model first identifies patients predicted to have long survival with a supervised binary classifier; for the remaining patients, a second classifier predicts short or mid survival. Two machine learning classifiers, the explainable boosting machine (EBM) and the balanced random forest (BRF), were compared. For model training, patients were divided into training and test sets at an 8:2 ratio, and 100-fold random sampling was applied to the training set with a 7:1 training/validation ratio. Model performance was evaluated by sensitivity, accuracy, and ROC analysis.
Results: The EBM-based model achieved an overall ROC AUC of 0.58±0.04, with low sensitivities for the short (0.02±0.04) and mid (0.11±0.08) groups, attributed to the imbalanced distribution of samples across groups. The BRF-based model improved the short/mid group sensitivities to 0.32±0.11/0.29±0.16, respectively, but improved the ROC AUC only marginally (0.60±0.04). Overall accuracy remained limited for both the EBM (0.46±0.04) and BRF (0.57±0.04) approaches. The top-10 features ranked by feature weight overlapped noticeably between the two models.
Conclusion: The proposed two-step machine learning model performs better with the BRF classifier than with the EBM classifier in post-radiotherapy survival group prediction for NSCLC. Future work, preferably incorporating deep learning, is needed to further improve the prediction results.
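
The two-step scheme and the per-group sensitivity reported in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation: the threshold rules standing in for the two classifiers, the toy feature vectors, and the assignment of the exact 6- and 12-month boundaries are all assumptions made for illustration. In practice, trained EBM or BRF classifiers would take the place of the two stand-in functions.

```python
from collections import Counter
from typing import Callable, Dict, List, Sequence

Classifier = Callable[[Sequence[float]], bool]


def survival_group(months: float) -> str:
    """Map survival time to the abstract's groups (boundary handling assumed)."""
    if months < 6:
        return "short"
    if months < 12:
        return "mid"
    return "long"


def two_step_predict(x: Sequence[float], is_long: Classifier, is_short: Classifier) -> str:
    """Step 1 picks out long survivors; step 2 splits the rest into short/mid."""
    if is_long(x):
        return "long"
    return "short" if is_short(x) else "mid"


def per_group_sensitivity(y_true: List[str], y_pred: List[str]) -> Dict[str, float]:
    """Sensitivity (recall) per group: correct predictions / actual group size."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {group: hits[group] / totals[group] for group in totals}


def accuracy(y_true: List[str], y_pred: List[str]) -> float:
    """Fraction of patients whose predicted group matches the true group."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical stand-in classifiers: simple thresholds on a toy GTV volume value.
def toy_is_long(x: Sequence[float]) -> bool:
    return x[1] < 50.0


def toy_is_short(x: Sequence[float]) -> bool:
    return x[1] >= 100.0


# Toy data (invented): ([age, GTV volume], survival in months).
toy_patients = [
    ([70.0, 120.0], 3.0),
    ([65.0, 80.0], 8.0),
    ([60.0, 20.0], 30.0),
    ([58.0, 15.0], 14.0),
    ([72.0, 90.0], 5.0),
]

y_true = [survival_group(months) for _, months in toy_patients]
y_pred = [two_step_predict(x, toy_is_long, toy_is_short) for x, _ in toy_patients]
sensitivity = per_group_sensitivity(y_true, y_pred)
overall_accuracy = accuracy(y_true, y_pred)
```

Reporting sensitivity per group alongside overall accuracy matters here because the long group holds 163 of the 258 patients, so a model can reach moderate accuracy while rarely identifying short or mid survivors.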





Zhang, Rihui (2023). A Radiomics Machine Learning Model for Post-Radiotherapy Overall Survival Prediction of Non-Small Cell Lung Cancer (NSCLC). Master's thesis, Duke University. Retrieved from https://hdl.handle.net/10161/29091.


Duke's student scholarship is made available to the public using a Creative Commons Attribution / Non-commercial / No derivative (CC-BY-NC-ND) license.