Development of Deep Learning Models for Deformable Image Registration (DIR) in the Head and Neck Region


Deformable image registration (DIR) is the process of registering two or more images to a reference image by minimizing local differences across the entire image. DIR is conventionally performed using iterative optimization-based methods, which are time-consuming and require manual parameter tuning. Recent studies have shown that deep learning methods, most notably convolutional neural networks (CNNs), can be employed to address the DIR problem. In this study, we propose two deep learning frameworks that perform DIR in an unsupervised manner for CT-to-CT deformable registration of the head and neck region. Because head and neck cancer patients may undergo severe weight loss over the course of their radiation therapy treatment, DIR in this region is an important task. The first proposed deep learning framework contains two scales, both based on free-form deformation and both trained by minimizing an intensity-based dissimilarity metric while encouraging smoothness of the deformation vector field (DVF). The two scales were first trained separately in a sequential manner and then combined in a two-scale joint training framework for further optimization. We then developed a transfer learning technique that improves the DIR accuracy of the proposed networks by fine-tuning a pre-trained group-based model into a patient-specific model, optimizing its performance for individual patients. We showed that with as few as two prior CT scans of a patient, the performance of the pre-trained model described above can be improved, yielding more accurate DIR results for that patient. The second proposed deep learning framework, which also consists of two scales, is a hybrid DIR method that combines B-spline deformation modeling with deep learning. In its first scale, the deformations of the control points are learned by deep learning, and the initial DVF is estimated using B-spline interpolation to ensure that the initial estimate is smooth. The second-scale model of the second framework is the same as that of the first framework. The networks were trained and evaluated on the public TCIA HNSCC-3DCT dataset for the head and neck region. We showed that the DIR results of the proposed networks are comparable to those of conventional DIR methods while being several orders of magnitude faster (taking about 2 to 3 seconds), making them well suited to clinical use.
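
As an illustration of the unsupervised training objective described in the abstract, the following is a minimal sketch of a loss that combines an intensity-based dissimilarity term with a DVF smoothness penalty. The abstract does not name the specific metric or regularizer used in the thesis, so the choices here are assumptions: mean squared error stands in for the dissimilarity metric, a first-order finite-difference penalty stands in for the smoothness term, and the function names and the `smooth_weight` parameter are hypothetical.

```python
import numpy as np


def mse_dissimilarity(fixed, warped):
    """Mean squared intensity difference (assumed metric, not confirmed by the thesis)."""
    return float(np.mean((fixed - warped) ** 2))


def dvf_smoothness(dvf):
    """Mean squared forward difference of the DVF, averaged over the spatial axes.

    dvf has shape (3, D, H, W): one displacement component per spatial axis.
    A spatially constant DVF (pure translation) gives a penalty of zero.
    """
    penalty = 0.0
    for axis in (1, 2, 3):
        diff = np.diff(dvf, axis=axis)  # neighbor-to-neighbor change along this axis
        penalty += float(np.mean(diff ** 2))
    return penalty / 3.0


def unsupervised_loss(fixed, warped, dvf, smooth_weight=0.1):
    """Dissimilarity plus weighted smoothness; smooth_weight is a hypothetical value."""
    return mse_dissimilarity(fixed, warped) + smooth_weight * dvf_smoothness(dvf)
```

In a training loop, `warped` would be the moving CT resampled with the DVF predicted by the network, and this scalar would be minimized with respect to the network weights; no ground-truth deformation is needed, which is what makes the approach unsupervised.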





Amini, Ala (2020). Development of Deep Learning Models for Deformable Image Registration (DIR) in the Head and Neck Region. Master's thesis, Duke University. Retrieved from


Duke's student scholarship is made available to the public under a Creative Commons Attribution / Non-commercial / No derivative (CC-BY-NC-ND) license.