Radiogenomics for Radiation Treatment Assessment of Advanced Lung Cancers
Background: Radiomics converts medical images into high-dimensional quantitative features and analyzes them to support clinical decision making. Genomics studies the genomes of individual organisms and characterizes the differences between genomes. Radiogenomics is an emerging method that combines radiomics and genomics in clinical studies and investigates the relationship between genetic characteristics and radiomic features. It has potential as a future tool for assessing medical treatment. In this study, we used machine learning methods to build two models for treatment assessment: 1) a model whose output is p53 mutation and whose inputs are radiomic features; 2) a model whose output is patient overall survival and whose inputs are radiomic features and p53 mutation. The modelling process was divided into feature selection and classification. Machine learning is an area of artificial intelligence in which an algorithm learns from a dataset called “training data” and generates a prediction model from that learning process. The prediction model can then be used to make predictions and decisions on other datasets.
Purpose: 1) To investigate the correlation between p53 mutation and radiomic features in lung cancer, and to detect p53 mutation from radiomic features using different machine learning methods; 2) to investigate the correlation between genomic (p53 mutation), radiomic, and radiogenomic features and overall patient survival in lung cancer using machine learning methods.
Material and Methods: The study included 24 patients with advanced lung cancer who had received radiotherapy and chemotherapy. CT was the imaging modality used for the radiomics study, which involved three parts: pre-treatment (pre-Tx) radiomics, post-treatment (post-Tx) radiomics, and delta radiomics. The pre-Tx radiomic features were calculated from the treatment planning CT images, the post-Tx radiomic features were calculated from the follow-up CT images acquired after radiotherapy, and the delta radiomic features were calculated as the change in radiomic features from pre-Tx to post-Tx. Of the 24 patients, 19 had both pre-Tx and post-Tx CT images. A total of 61 representative radiomic features were extracted from the CT images, including intensity features, Grey Level Co-occurrence Matrix features, Grey Level Run Length Matrix features, Grey Level Size Zone Matrix features, Neighborhood Grey Level Difference Matrix features, and morphological features. Feature selection was applied to reduce feature redundancy. Spearman correlation analysis and Lasso regression were used for feature selection in p53 mutation detection. Cox regression and Lasso regression were used for feature selection in patient survival prediction. Several common machine learning classification methods were then used to model p53 mutation detection and patient survival prediction: linear discriminant analysis, quadratic discriminant analysis, Naïve Bayes, Linear Support Vector Machine (LSVM), Kernel Support Vector Machine (KSVM), Bootstrap Aggregating (Bagging), Logistic Regression, and Lasso generalized linear regression. Radiomic models were used to detect p53 mutation in the tumor. Radiogenomic models, based on radiomic features combined with p53 mutation, were used to predict patient overall survival. To avoid bias, leave-one-out cross validation was used for both feature selection and classification.
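The delta radiomics step described above can be illustrated with a minimal sketch. The abstract does not state whether the change was taken as an absolute or a relative difference, so both variants below are assumptions, and the function name `delta_features`, the feature names, and the numeric values are hypothetical placeholders rather than data from the study.

```python
def delta_features(pre, post, relative=False):
    """Per-feature change from pre-Tx to post-Tx for one patient.

    pre, post: dicts mapping feature name -> value.
    relative=False returns post - pre; relative=True returns (post - pre) / pre.
    Both definitions are assumptions; the abstract does not specify one.
    """
    delta = {}
    # Only features present at both time points can have a delta.
    for name in sorted(set(pre) & set(post)):
        change = post[name] - pre[name]
        if relative:
            if pre[name] == 0:
                continue  # relative change is undefined for a zero baseline
            change /= pre[name]
        delta[name] = change
    return delta


# Hypothetical feature values for a single patient (not study data).
pre_tx = {"mean_intensity": 40.0, "glcm_contrast": 2.0, "volume_cm3": 50.0}
post_tx = {"mean_intensity": 30.0, "glcm_contrast": 2.5, "volume_cm3": 20.0}

absolute = delta_features(pre_tx, post_tx)
relative = delta_features(pre_tx, post_tx, relative=True)
```

Only patients with both image sets can contribute delta features, which is consistent with the delta analyses being reported for 19 of the 24 patients.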
Receiver Operating Characteristic (ROC) curves were used to evaluate the models, and Area Under the Curve (AUC) values were compared across the classification methods.
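The evaluation scheme can be sketched as follows: feature selection is repeated inside every leave-one-out fold, so the held-out patient never influences which features are kept, and the held-out scores are pooled into a single AUC. This is a hedged illustration, not the thesis implementation. A simple class-mean-gap filter stands in for the Lasso, Spearman, and Cox selection steps, a nearest-centroid scorer stands in for the classifiers listed above, and the data are synthetic. All function names are hypothetical.

```python
import random
from statistics import mean, pstdev


def auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def column(X, j):
    return [row[j] for row in X]


def select_features(X, y, k):
    """Keep the k features with the largest class-mean gap relative to their spread.
    Assumption: an illustrative filter, not the selection used in the study."""
    def gap(j):
        col = column(X, j)
        pos = [v for v, label in zip(col, y) if label == 1]
        neg = [v for v, label in zip(col, y) if label == 0]
        return abs(mean(pos) - mean(neg)) / (pstdev(col) or 1.0)
    return sorted(range(len(X[0])), key=gap, reverse=True)[:k]


def centroid_score(train_X, train_y, x, cols):
    """Nearest-centroid score; higher means closer to the positive class."""
    score = 0.0
    for j in cols:
        col = column(train_X, j)
        mu, sd = mean(col), pstdev(col) or 1.0
        z = [(v - mu) / sd for v in col]  # scaling uses training patients only
        c_pos = mean([v for v, label in zip(z, train_y) if label == 1])
        c_neg = mean([v for v, label in zip(z, train_y) if label == 0])
        zx = (x[j] - mu) / sd
        score += (zx - c_neg) ** 2 - (zx - c_pos) ** 2
    return score


def loocv_scores(X, y, k=2):
    """Hold out each patient once; select features and fit on the rest."""
    scores = []
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        cols = select_features(train_X, train_y, k)  # selection inside the fold
        scores.append(centroid_score(train_X, train_y, X[i], cols))
    return scores


# Synthetic cohort: 24 "patients", 6 features, only feature 0 carries signal.
rng = random.Random(0)
y = [i % 2 for i in range(24)]
X = [[10.0 * label + rng.gauss(0, 1)] + [rng.gauss(0, 1) for _ in range(5)]
     for label in y]

scores = loocv_scores(X, y, k=2)
cv_auc = auc(y, scores)
```

Running selection on the full cohort before cross validation would let the held-out patient leak into the model, which tends to inflate the AUC, especially with cohorts as small as 19 to 24 patients.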
Results: For p53 mutation detection, the highest AUCs of pre-Tx radiomics (24 patients), pre-Tx radiomics (19 patients), and post-Tx radiomics (19 patients) were 0.6993, 0.5606, and 0.6591, respectively. For patient survival prediction, the highest AUCs of pre-Tx radiomics (24 patients), pre-Tx radiomics (19 patients), post-Tx radiomics (19 patients), and delta radiomics (19 patients) were 0.7045, 0.7125, 0.6063, and 0.8000, respectively, and the highest AUCs of pre-Tx radiogenomics (24 patients), pre-Tx radiogenomics (19 patients), post-Tx radiogenomics (19 patients), and delta radiogenomics (19 patients) were 0.7500, 0.7375, 0.5857, and 0.9143, respectively.
Conclusion: Based on this limited dataset, it might be feasible to detect p53 mutation using both pre-Tx and post-Tx radiomics. Lasso and LSVM showed the best classification performance.
For predicting overall patient survival, different features were selected. This may be related to the limited data available for the study, or to the different characteristics of pre-Tx, post-Tx, and delta radiomics. Intensity and texture features were selected most frequently for pre-Tx and delta radiomics, and morphological features were selected most frequently for post-Tx radiomics. We also found that the combination of delta radiomics and p53 mutation predicted patient survival better than pre-Tx radiomics, post-Tx radiomics, delta radiomics, or p53 mutation alone. The reason might be related to differences in tumor response to radiation due to p53 mutation. KSVM and Bagging showed the highest performance among the classification methods.
Keywords: Radiogenomics, radiomics, delta radiomics, genomics, p53, lung cancer, radiotherapy.
This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License.
Rights for Collection: Masters Theses