Quantifying Image Patch Similarity Using Handcrafted and Deep Radiomic Features

Date

2025

Abstract

The clinical adoption of auto-segmentation tools has been hindered by the poor generalizability and interpretability of AI models, highlighting the need for automated contour quality assurance (QA) systems that verify the accuracy of auto-generated contours. A novel QA workflow based on content-based image retrieval (CBIR) has been proposed: image patches visually similar to query patches sampled from a new scan are retrieved from a curated reference database, allowing the auto-contour segments in the query patches to be evaluated against the manual contours of the retrieved references.

To support this approach, we investigated the feasibility of quantifying visual similarity between 3D image patches using both handcrafted and deep radiomic features. Similarity prediction was formulated as a regression task, with the spatial distance between patch centers serving as a surrogate similarity label. A total of 90,000 patch pairs were sampled from 100 CT scans in the TotalSegmentator dataset. For each pair, 112 handcrafted radiomic features were extracted following the Image Biomarker Standardisation Initiative guidelines. These included 18 intensity-based features and 94 texture descriptors derived from GLCM, GLRLM, GLSZM, GLDZM, NGTDM, and NGLDM matrices. Deep radiomic features were extracted using a custom 3D convolutional neural network embedded in a Siamese architecture, trained end-to-end to predict spatial distances.
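
As a concrete illustration of this pipeline, the sketch below samples a pair of cubic patches from a CT volume, uses the Euclidean distance between their centers as the surrogate similarity label, and extracts intensity and texture features with PyRadiomics. This is a minimal sketch, not the thesis implementation: the patch size, bin width, and whole-patch mask are illustrative assumptions, and PyRadiomics does not cover every matrix family listed above (e.g. GLDZM and NGLDM), so the feature set only approximates the 112-feature panel.

    import numpy as np
    import SimpleITK as sitk
    from radiomics import featureextractor

    PATCH = 32  # assumed cubic patch size in voxels; not specified in the abstract

    def sample_patch(volume, center, size=PATCH):
        """Crop a cubic patch centered at (z, y, x) from a 3D numpy array."""
        half = size // 2
        z, y, x = center
        return volume[z - half:z + half, y - half:y + half, x - half:x + half]

    def handcrafted_features(patch_array):
        """Extract intensity and texture features over the whole patch with PyRadiomics."""
        image = sitk.GetImageFromArray(patch_array.astype(np.float32))
        mask = sitk.GetImageFromArray(np.ones_like(patch_array, dtype=np.uint8))
        extractor = featureextractor.RadiomicsFeatureExtractor(binWidth=25)  # bin width is an assumption
        extractor.disableAllFeatures()
        for cls in ("firstorder", "glcm", "glrlm", "glszm", "ngtdm", "gldm"):
            extractor.enableFeatureClassByName(cls)
        result = extractor.execute(image, mask)
        # keep numeric feature values, dropping PyRadiomics diagnostic entries
        return {k: float(v) for k, v in result.items() if not k.startswith("diagnostics")}

    volume = np.random.rand(128, 128, 128).astype(np.float32)   # stand-in for a CT scan
    c1, c2 = (64, 64, 64), (64, 70, 60)
    label = float(np.linalg.norm(np.subtract(c1, c2)))           # surrogate label: distance in voxels
    f1 = handcrafted_features(sample_patch(volume, c1))
    f2 = handcrafted_features(sample_patch(volume, c2))

The deep branch can be sketched in the same spirit: a minimal Siamese 3D CNN in which both patches pass through shared convolutional weights and the absolute difference of their embeddings is regressed onto the center-to-center distance. Layer sizes and the embedding dimension are assumptions, since the abstract does not describe the custom architecture.

    import torch
    import torch.nn as nn

    class PatchEncoder(nn.Module):
        """Shared 3D CNN branch; channel sizes are illustrative."""
        def __init__(self, embed_dim=128):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv3d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool3d(2),
                nn.Conv3d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool3d(2),
                nn.Conv3d(32, 64, 3, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool3d(1),
            )
            self.fc = nn.Linear(64, embed_dim)

        def forward(self, x):
            return self.fc(self.features(x).flatten(1))

    class SiameseDistanceRegressor(nn.Module):
        """Predicts the spatial distance between two patches from their shared embeddings."""
        def __init__(self, embed_dim=128):
            super().__init__()
            self.encoder = PatchEncoder(embed_dim)
            self.head = nn.Sequential(nn.Linear(embed_dim, 64), nn.ReLU(), nn.Linear(64, 1))

        def forward(self, patch_a, patch_b):
            diff = torch.abs(self.encoder(patch_a) - self.encoder(patch_b))
            return self.head(diff).squeeze(-1)

    model = SiameseDistanceRegressor()
    a = torch.randn(4, 1, 32, 32, 32)   # batch of query patches
    b = torch.randn(4, 1, 32, 32, 32)   # batch of reference patches
    pred = model(a, b)                  # predicted distances in voxels
    loss = nn.functional.mse_loss(pred, torch.rand(4) * 20)  # regression loss on the surrogate label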

We first evaluated individual handcrafted features using univariate statistical analyses, including Spearman's rank correlation and divergence metrics, to assess their ability to distinguish visually similar from dissimilar patches. Multivariate models, namely random forest and XGBoost regressors, were then trained on the full feature set. To identify the most informative features, we further applied selection techniques including Spearman correlation, permutation importance, Lasso regression, and principal component analysis. Among the regression models, the XGBoost regressor trained on handcrafted features achieved the best predictive performance, with a mean squared error (MSE) of 3.23 voxels and an R² value of 0.800. In comparison, the Siamese network operating on deep features achieved an MSE of 5.14 voxels and an R² of 0.693 after applying an outlier rejection strategy to improve prediction consistency under small spatial perturbations.
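
The multivariate step can likewise be sketched with standard scikit-learn and XGBoost calls, assuming the handcrafted features for each patch pair have been combined into a matrix X (random placeholders below) with center distances y as the regression target; the hyperparameters and the pairwise feature combination are assumptions not stated in the abstract.

    import numpy as np
    from scipy.stats import spearmanr
    from sklearn.model_selection import train_test_split
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.linear_model import Lasso
    from sklearn.decomposition import PCA
    from sklearn.inspection import permutation_importance
    from xgboost import XGBRegressor

    # X: (n_pairs, n_features) pairwise handcrafted features; y: center distances in voxels
    X = np.random.rand(1000, 112)
    y = np.random.rand(1000) * 20
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

    # univariate screening: Spearman rank correlation of each feature with the distance label
    rho = np.array([spearmanr(X_tr[:, j], y_tr)[0] for j in range(X_tr.shape[1])])

    # multivariate regressors on the full feature set
    rf = RandomForestRegressor(n_estimators=300, random_state=0).fit(X_tr, y_tr)
    xgb = XGBRegressor(n_estimators=500, max_depth=6, learning_rate=0.05).fit(X_tr, y_tr)
    print("XGBoost R^2 on held-out pairs:", xgb.score(X_te, y_te))

    # feature selection: permutation importance, L1-penalized regression, and PCA
    imp = permutation_importance(xgb, X_te, y_te, n_repeats=10, random_state=0)
    top_features = np.argsort(imp.importances_mean)[::-1][:10]
    lasso = Lasso(alpha=0.01).fit(X_tr, y_tr)      # sparse selection via the L1 penalty
    pca = PCA(n_components=20).fit(X_tr)           # unsupervised dimensionality reduction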

Our findings demonstrate that handcrafted radiomic features, when combined with machine learning models, offer an effective approach for modeling visual similarity in medical images. Although deep radiomic features currently yield lower performance, they remain promising due to their capacity for task-specific representation learning and the potential for further architectural refinement. Overall, this work supports the feasibility of developing a radiomics-driven CBIR framework for interpretable and localized contour QA in radiotherapy. Future work will focus on incorporating additional feature types and anatomical priors, refining similarity metrics beyond spatial distance, and validating the system on datasets with simulated or expert-annotated contouring errors to assess its clinical utility.

Subjects

Physics

Citation

Qin, Chenlu (2025). Quantifying Image Patch Similarity Using Handcrafted and Deep Radiomic Features. Master's thesis, Duke University. Retrieved from https://hdl.handle.net/10161/32956.

Except where otherwise noted, student scholarship that was shared on DukeSpace after 2009 is made available to the public under a Creative Commons Attribution / Non-commercial / No derivatives (CC-BY-NC-ND) license. All rights in student work shared on DukeSpace before 2009 remain with the author and/or their designee, whose permission may be required for reuse.