Browsing by Author "Yang, Zhenyu"
Item Embargo A Radiomics-Embedded Vision Transformer for Breast Cancer Ultrasound Image Classification Efficiency Improvement (2024) Zhu, Haiming. Purpose: To develop a radiomics-embedded vision transformer (RE-ViT) model by incorporating radiomics features into its architecture, seeking to improve the model's efficiency in medical image recognition and thereby enhance breast ultrasound diagnostic accuracy. Materials and Methods: Following the classic ViT design, the input image was first resampled into multiple 16×16 grid image patches. For each patch, 56-dimensional habitat radiomics features, including intensity-based, Gray Level Co-Occurrence Matrix (GLCOM)-based, and Gray Level Run-Length Matrix (GLRLM)-based features, were extracted. These features were designed to comprehensively encode local-regional intensity and texture information. The extracted features underwent a linear projection to a higher-dimensional space, integrating them with ViT's standard image embedding process. This integration involved an element-wise addition of the radiomics embedding with ViT's projection-based and positional embeddings. The resultant combined embeddings were then processed through a Transformer encoder and a Multilayer Perceptron (MLP) head block, adhering to the original ViT architecture. The proposed RE-ViT model was studied using the public BUSI breast ultrasound dataset of 399 patients with benign, malignant, and normal tissue classifications. The comparison study included: (1) RE-ViT versus classic ViT trained from scratch, (2) pre-trained RE-ViT versus pre-trained ViT (based on ImageNet-21k), and (3) RE-ViT versus a VGG-16 CNN model. Model performance was evaluated on accuracy, ROC AUC, sensitivity, and specificity with 10-fold Monte Carlo cross-validation.
Results: The RE-ViT model significantly outperformed the classic ViT model, demonstrating superior overall performance with accuracy = 0.718±0.043, ROC AUC = 0.848±0.033, sensitivity = 0.718±0.059, and specificity = 0.859±0.048. In contrast, the classic ViT model achieved accuracy = 0.473±0.050, ROC AUC = 0.644±0.062, sensitivity = 0.473±0.101, and specificity = 0.737±0.065. Pre-trained versions of RE-ViT also showed enhanced performance (accuracy = 0.864±0.031, ROC AUC = 0.950±0.021, sensitivity = 0.864±0.074, specificity = 0.932±0.036) compared to pre-trained ViT (accuracy = 0.675±0.111, ROC AUC = 0.872±0.086, sensitivity = 0.675±0.129, specificity = 0.838±0.096). Additionally, RE-ViT surpassed VGG-16 CNN results (accuracy = 0.553±0.079, ROC AUC = 0.748±0.080, sensitivity = 0.553±0.112, specificity = 0.777±0.089). Conclusion: The proposed radiomics-embedded ViT was successfully developed for ultrasound-based breast tissue classification. Current results underscore the potential of our approach to advance other transformer-based medical image diagnosis tasks.
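The embedding fusion described above (per-patch radiomics features linearly projected and added element-wise to ViT's patch and positional embeddings) can be sketched in NumPy. This is an illustrative sketch only: the 56 radiomics features per 16×16 patch come from the abstract, but the patch count, embedding dimension, and random weights are stand-in assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes: a 224x224 image split into 16x16 patches -> 196 patches;
# 56 radiomics features per patch (from the abstract); embed dim 768 (ViT-Base, assumed).
n_patches, patch_dim, radiomics_dim, embed_dim = 196, 16 * 16, 56, 768

patches = rng.normal(size=(n_patches, patch_dim))        # flattened image patches
radiomics = rng.normal(size=(n_patches, radiomics_dim))  # 56-D habitat radiomics per patch

# Learned projections (random stand-ins here, trained in the real model)
W_img = rng.normal(size=(patch_dim, embed_dim)) * 0.02   # standard ViT patch projection
W_rad = rng.normal(size=(radiomics_dim, embed_dim)) * 0.02  # radiomics linear projection
pos = rng.normal(size=(n_patches, embed_dim)) * 0.02     # positional embedding

# RE-ViT fusion: element-wise addition of the three embeddings;
# the result feeds the standard Transformer encoder unchanged.
tokens = patches @ W_img + radiomics @ W_rad + pos
print(tokens.shape)  # (196, 768)
```

Because the fusion is a plain addition in embedding space, the downstream Transformer encoder and MLP head need no architectural changes.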
Item Open Access Development of a Voxel-Based Radiomics Calculation Platform for Medical Image Analysis (2020) Yang, Zhenyu. Purpose: To develop a novel voxel-based radiomics extraction technique, and to investigate the potential association between spatially-encoded radiomics features of the lungs and pulmonary function.
Methods: We developed a voxel-based radiomics feature extraction platform to generate radiomics filtered images. Specifically, for each voxel in the image, 62 radiomics features were calculated in a rotationally-invariant 3D neighbourhood to capture spatially-encoded information. In general, such an approach results in an image tensor object, i.e., each voxel in the original image is represented by a 62-dimensional radiomics feature vector. Two digital phantoms were then designed to validate the technique's ability to quantify regional image information. To test the technique as a potential pulmonary biomarker, we generated radiomics filtered images for 25 lung CT images, which were subsequently evaluated against corresponding Galligas PET images, as the ground truth for pulmonary function, using voxel-wise Spearman correlation (r). A Canonical Correlation Analysis (CCA)-based feature fusion method was also implemented to enhance this correlation. Finally, the Spearman distributions were compared with those of 37 individual CT ventilation image (CTVI) algorithms to assess the overall performance relative to conventional CT-based techniques.
Results: Several radiomics filtered images were identified to be correlated with Galligas PET lung imaging. The most robust association was found to be the Run Length Encoding feature, Run-Length Non-uniformity (0.21
Conclusions: This preliminary study indicates that spatially-encoded lung texture and lung density are potentially associated with pulmonary function as measured via Galligas PET ventilation images. Collectively, low-density, heterogeneous, coarse lung texture was often associated with lower Galligas radiotracer amounts.
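The voxel-wise extraction idea above (slide a 3D neighborhood over every voxel and store a feature vector per voxel, yielding an image tensor object) can be sketched as follows. The three toy intensity statistics here are hypothetical stand-ins for the 62 radiomics features; real GLCM/GLRLM features would be computed per neighborhood in the same way.

```python
import numpy as np

def voxel_feature_maps(img, radius=1):
    """For each voxel, compute simple intensity statistics (mean, std, range)
    in a cubic (2*radius+1)^3 neighborhood. Each output channel can be viewed
    as a 'radiomics filtered image' of the input volume."""
    pad = np.pad(img, radius, mode="edge")
    z, y, x = img.shape
    feats = np.zeros(img.shape + (3,))
    for k in range(z):
        for j in range(y):
            for i in range(x):
                nb = pad[k:k + 2*radius + 1,
                         j:j + 2*radius + 1,
                         i:i + 2*radius + 1]
                feats[k, j, i] = (nb.mean(), nb.std(), nb.max() - nb.min())
    return feats  # tensor object: each voxel -> feature vector

rng = np.random.default_rng(1)
ct = rng.normal(size=(8, 8, 8))    # toy CT volume
maps = voxel_feature_maps(ct)
print(maps.shape)  # (8, 8, 8, 3)
```

Each feature map could then be compared voxel-wise against a co-registered functional image (e.g., with a rank correlation), as the study does against Galligas PET.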
Item Embargo Explainable Artificial Intelligence Techniques in Medical Imaging Analysis (2023) Yang, Zhenyu. Artificial intelligence (AI), including classic machine learning (ML) and deep learning (DL), has recently made an impact on advanced medical image analysis. Classic ML learns the data representation by manual image feature engineering, namely radiomics, based on experts' domain knowledge. DL learns image features through hierarchical data modeling directly from the input data. Both classic ML and DL models have emerged as promising AI tools for medical image analysis. Despite promising academic research in which algorithms are beginning to outperform humans, clinical radiography analysis still has limited AI involvement. One issue of current AI development (for both classic ML and DL) is the lack of model explainability, i.e., the extent to which the internal mechanics of an AI model can be explained in human terms from a clinical perspective. The unexplainable issues include, but are not limited to, model confidence ('Can we trust the results with some clues?'), data utilization ('Do we need this as a part of the model?'), and model generalization ('How do I know if it works?'). Without such model explainability, AI models remain a black box in implementation, which leads to a lack of accountability and confidence in clinical applications. We hypothesize that the current medical domain knowledge, both in theory and in practice, can be incorporated into AI designs to provide explainability. Therefore, the objective of this dissertation is to explore potential techniques to enhance AI model explainability. Specifically, three novel AI models were developed: • The first model aimed to explore a radiomic filtering model to quantify and visualize radiomic features associated with pulmonary ventilation from lung computed tomography (CT).
In this model, lung volume was segmented on 46 CT images, and a 3D sliding window kernel was implemented across the lung volume to capture the spatial-encoded image information. Fifty-three radiomic features were extracted within the kernel, resulting in a 4th-order tensor object. As such, each voxel coordinate of the original lung was represented as a 53-dimensional feature vector, such that radiomic features could be viewed as feature maps within the lungs. To test the technique as a potential pulmonary ventilation biomarker, the radiomic feature maps were compared to paired functional images (Galligas positron emission tomography [PET] or DTPA single-photon emission computed tomography [SPECT]) based on Spearman correlation (ρ) analysis. From the results, the radiomic feature maps Gray Level Run Length Matrix (GLRLM)-based Run-Length Non-Uniformity and Gray Level Co-occurrence Matrix (GLCOM)-based Sum Average were found to be highly correlated with functional imaging. The achieved ρ (median [range]) for the two features was 0.46 [0.05, 0.67] and 0.45 [0.21, 0.65] across 46 patients and 2 functional imaging modalities, respectively. The results provide evidence that local regions of sparsely encoded heterogeneous lung parenchyma on CT are associated with diminished radiotracer uptake and measured lung ventilation defects on PET/SPECT imaging. Collectively, these findings demonstrate the potential of radiomic filtering to provide a visual explanation of lung CT radiomic features associated with lung ventilation. The developed technique may serve as a complementary tool to the current lung quantification techniques and provide hypothesis-generating data for future studies. • The second model aimed to explore a neural ordinary differential equation (ODE)-based segmentation model to observe deep neural network (DNN) behavior in multi-parametric magnetic resonance imaging (MRI)-based glioma segmentation.
In this model, by hypothesizing that deep feature extraction can be modeled as a spatiotemporally continuous process, we implemented a novel DL model, neural ODE, in which deep feature extraction was governed by an ODE parameterized by a neural network. The dynamics of 1) MR images after interactions with the DNN and 2) segmentation formation can thus be visualized after solving the ODE. An accumulative contribution curve (ACC) was designed to quantitatively evaluate each MR image’s utilization by the DNN toward the final segmentation results. The proposed neural ODE model was demonstrated using 369 glioma patients with a 4-modality multi-parametric MRI protocol: T1, contrast-enhanced T1 (T1-Ce), T2, and fluid-attenuated inversion recovery (FLAIR). Three neural ODE models were trained to segment enhancing tumor (ET), tumor core (TC), and whole tumor (WT), respectively. The key MR modalities with significant utilization by DNNs were identified based on ACC analysis. Segmentation results by DNNs using only the key MR modalities were compared to the ones using all 4 MR modalities in terms of Dice coefficient, accuracy, sensitivity, and specificity. From the results, all neural ODE models successfully illustrated image dynamics as expected. ACC analysis identified T1-Ce as the only key modality in ET and TC segmentations, while both FLAIR and T2 were key modalities in WT segmentation. Compared to the U-Net results using all 4 MR modalities, the Dice coefficients using only the key modalities were nearly unchanged: ET (0.784→0.775), TC (0.760→0.758), and WT (0.841→0.837). Collectively, the neural ODE model offers a new tool for optimizing the DL model inputs with enhanced explainability in data utilization. The presented methodology can be generalized to other medical image-related DL applications.
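The core neural ODE idea above (deep feature extraction governed by an ODE parameterized by a neural network, whose intermediate states can be visualized) can be sketched with a forward-Euler solver. Everything here is a minimal stand-in: the dynamics function is a single tanh layer with random weights, and the sizes are arbitrary, not the dissertation's architecture.

```python
import numpy as np

rng = np.random.default_rng(2)

dim = 16
W = rng.normal(size=(dim, dim)) * 0.1
b = rng.normal(size=dim) * 0.1

def f(z):
    """Neural network parameterizing the feature dynamics dz/dt = f(z)."""
    return np.tanh(z @ W.T + b)

def solve_ode(z0, t1=1.0, steps=10):
    """Forward-Euler integration; the stored trajectory is what lets us
    'visualize the dynamics' of the features between t=0 and t=t1."""
    z, dt = z0, t1 / steps
    trajectory = [z0]
    for _ in range(steps):
        z = z + dt * f(z)  # Euler step
        trajectory.append(z)
    return trajectory

z0 = rng.normal(size=(4, dim))  # e.g., 4 spatial locations, 16 feature channels
traj = solve_ode(z0)
print(len(traj), traj[-1].shape)  # 11 (4, 16)
```

In the actual model an adaptive ODE solver would typically replace Euler integration, and the trajectory of image-like states is what gets rendered for inspection; the ACC analysis then scores each input modality's contribution along that trajectory.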
• The third model aimed to explore a multi-feature-combined (MFC) model to quantify the role of radiomic features, DL image features, and their combination in predicting local failure from pre-treatment CT images of early-stage non-small cell lung cancer (NSCLC) patients after either lung surgery or stereotactic body radiation therapy (SBRT). The MFC model comprised three key steps. (1) Extraction of 92 handcrafted radiomic features from the gross tumor volume (GTV) segmented on pre-treatment CT images. (2) Extraction of 512 deep features from a pre-trained DL U-Net encoder structure. Specifically, the 512 latent activation values from the last fully connected layers were studied. (3) The extracted 92 handcrafted radiomic features and 512 deep features, along with 4 patient demographic variables (i.e., gender, age, tumor volume, and Charlson comorbidity index), were concatenated as a multi-dimensional input to three classifiers: logistic regression (LR), support vector machine (SVM), and random forest (RF) to predict the local failure. Two NSCLC patient cohorts from our institution were investigated: (1) the surgery cohort included 83 patients who underwent segmentectomy or wedge resection (with 7 local failures), and (2) the SBRT cohort included 84 patients who received lung SBRT (with 9 local failures). The MFC model was developed and evaluated independently for both patient cohorts. For each cohort, the MFC model was also compared against (1) the R model: LR/SVM/RF prediction models using only radiomic features, (2) the PI model: LR/SVM/RF prediction models using only patient demographic information, and (3) the DL model: a DL design that directly predicts the local failure based on the U-Net encoder. All models were tested based on two validation methods: leave-one-out cross-validation (LOOCV) and 100-fold Monte Carlo cross-validation (MCCV) with a 70%-30% train-test ratio.
ROC with AUC analysis was adopted as the main evaluator to measure the prediction performance. Student’s t-test was performed to identify statistically significant differences when applicable. In LOOCV, the AUC range of the proposed MFC model (for three classifiers) was 0.811-0.956 for the surgery patient cohort and 0.913-0.981 for the SBRT cohort, which was higher than the other studied models: the AUC range was 0.356-0.480 (surgery) and 0.295-0.347 (SBRT) for the PI models, 0.388-0.655 (surgery) and 0.648-0.747 (SBRT) for the R models, and 0.816 (surgery) and 0.842 (SBRT) for the DL models. Similar results can be observed in the 100-fold MCCV: the MFC model again showed the highest AUC results (surgery: 0.831-0.841, SBRT: 0.860-0.947), which were significantly higher than the PI models (surgery: 0.464-0.564, SBRT: 0.457-0.519), R models (surgery: 0.546-0.653, SBRT: 0.559-0.667), and DL models (surgery: 0.690, SBRT: 0.773). Collectively, the developed MFC model improves the ability to predict the occurrence of local failure for both surgery and SBRT patient cohorts with enhanced explainability in the role of different feature sources. It may hold the potential to assist clinicians in optimizing treatment procedures in the future. In summary, the three developed models provide substantial contributions to enhance the explainability of current classic ML and DL models. The concepts and techniques developed in this dissertation, as well as understandings and inspirations from the key results, provide valuable knowledge for the future development of AI techniques toward wide clinical trust and acceptance.
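The MFC input construction above (concatenating 92 radiomic features, 512 deep features, and 4 demographic variables into one design matrix) can be sketched as follows. The feature values and labels are random placeholders, and the tiny gradient-descent logistic regression is only a stand-in for the LR/SVM/RF classifiers in the study.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 84  # e.g., the SBRT cohort size from the abstract

radiomic = rng.normal(size=(n, 92))     # handcrafted radiomic features (GTV on CT)
deep = rng.normal(size=(n, 512))        # U-Net encoder latent activations
demo = rng.normal(size=(n, 4))          # gender, age, tumor volume, Charlson index
y = (rng.random(n) < 0.1).astype(float) # sparse local-failure labels (illustrative)

# MFC input: simple concatenation of all three feature sources
X = np.concatenate([radiomic, deep, demo], axis=1)
print(X.shape)  # (84, 608)

# Minimal logistic regression trained by gradient descent on the log loss
w, lr = np.zeros(X.shape[1]), 0.01
for _ in range(200):
    p = 1.0 / (1.0 + np.exp(-(X @ w)))
    w -= lr * X.T @ (p - y) / n

probs = 1.0 / (1.0 + np.exp(-(X @ w)))  # predicted local-failure probabilities
```

In practice the demographic columns would be on very different scales from the image features, so feature standardization before concatenation would matter for LR/SVM (less so for RF).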
Item Embargo Quantifying Radiomic Texture Characterization Performance on Image Resampling and Discretization (2024) Sang, Weiwei. Purpose: To develop a novel radiomic quantification framework to quantify the impact of image resampling and discretization on radiomic texture characterization performance.
Methods: The study employed 251 CT scans of a Credence Cartridge phantom (consisting of 10 texture materials) with different image acquisition parameters. Each material was segmented using a pre-defined cylindrical mask. Different image pre-processing workflows, including 5 resampling methods (no resampling, and trilinear and nearest-neighbor resampling to both 1 mm³ and 5 mm³) and 8 discretization methods (fixed bin sizes of 25, 50, 75, and 100 and fixed bin counts of 8, 16, 32, and 64), were randomly applied. 75 radiomic texture features (including 24 GLCM-based, 16 GLRLM-based, 16 GLSZM-based, 14 GLDM-based, and 5 NGTDM-based) were extracted from each material to characterize its textural attributes. Three machine learning models, including logistic regression (LR), random forest (RF), and support vector machine (SVM), were developed to identify the 10 materials based on the extracted features, and grid search was adopted to optimize the model hyperparameters. The model performance was evaluated on 10-class macro-AUC with 5-fold cross-validation.
Results: The three models successfully classified the 10 materials with macro-AUC = 0.9941±0.0081, 0.9979±0.0040, and 0.9957±0.0067 for LR, RF, and SVM, respectively. Across the 8 discretization methods, an increasing trend in performance can be observed when the original CT was discretized to a larger gray level range: performance improved by 0.0038 with bin sizes decreasing from 100 to 25, and by 0.0074 with bin counts increasing from 8 to 64. Among the 5 resampling methods, resampling CT to an isotropic voxel spacing showed improved prediction performance (0.9942±0.0075/0.9944±0.0073 for trilinear/nearest-neighbor resampling to 1 mm³ and 5 mm³, respectively) over no interpolation (0.9862±0.0228), with minimal performance discrepancies observed between the two interpolation algorithms. In addition, no statistically significant differences were observed across the five folds.
Conclusion: The proposed framework successfully quantified the dependence of radiomics texture characterization on image resampling and discretization.
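The two discretization families compared above can be sketched directly: fixed bin size groups gray levels into bins of constant width, while fixed bin count forces a fixed number of gray levels regardless of the intensity range. This is a generic sketch of the standard definitions, not the study's exact pipeline; the toy image and parameter values are illustrative.

```python
import numpy as np

def discretize_fixed_bin_size(img, bin_size):
    """Fixed bin size: bins of constant intensity width; the number of
    resulting gray levels depends on the image's intensity range."""
    return np.floor((img - img.min()) / bin_size).astype(int) + 1

def discretize_fixed_bin_count(img, n_bins):
    """Fixed bin count: the intensity range is split into exactly n_bins
    bins, so the gray-level range is the same for every image."""
    edges = np.linspace(img.min(), img.max(), n_bins + 1)
    return np.digitize(img, edges[1:-1]) + 1  # levels 1..n_bins

rng = np.random.default_rng(4)
ct = rng.normal(0, 200, size=(16, 16)).round()  # toy HU-like values

a = discretize_fixed_bin_size(ct, 25)   # e.g., bin size 25, as in the study
b = discretize_fixed_bin_count(ct, 32)  # e.g., 32 gray levels, as in the study
print(b.min(), b.max())  # 1 32
```

Texture matrices (GLCM, GLRLM, etc.) are then built on these discretized gray levels, which is why the choice of method and parameter directly shifts the extracted feature values.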