Browsing by Subject "cs.CV"


Open Access
A Deep-Learning Algorithm for Thyroid Malignancy Prediction From Whole Slide Cytopathology Images
Dov, David; Kovalsky, Shahar Z; Assaad, Serge; Cohen, Jonathan; Range, Danielle Elliott; Pendse, Avani A; Henao, Ricardo; Carin, Lawrence
We consider thyroid-malignancy prediction from ultra-high-resolution whole-slide cytopathology images. We propose a deep-learning-based algorithm that is inspired by the way a cytopathologist diagnoses the slides. The algorithm identifies diagnostically relevant image regions and assigns them local malignancy scores, which in turn are incorporated into a global malignancy prediction. We discuss the relation of our deep-learning-based approach to multiple-instance learning (MIL) and describe how it deviates from classical MIL methods by using a supervised procedure to extract relevant regions from the whole slide. The analysis of our algorithm further reveals a close relation to hypothesis testing, which, along with unique characteristics of thyroid cytopathology, allows us to devise an improved training strategy. We further propose an ordinal regression framework for the simultaneous prediction of thyroid malignancy and an ordered diagnostic score acting as a regularizer, which further improves the predictions of the network. Experimental results demonstrate that the proposed algorithm outperforms several competing methods, achieving performance comparable to human experts.
Open Access
Computer vision tools for the non-invasive assessment of autism-related behavioral markers
Hashemi, Jordan; Spina, Thiago Vallin; Tepper, Mariano; Esler, Amy; Morellas, Vassilios; Papanikolopoulos, Nikolaos; Sapiro, Guillermo
The early detection of developmental disorders is key to child outcome, allowing interventions to be initiated that promote development and improve prognosis. Research on autism spectrum disorder (ASD) suggests behavioral markers can be observed late in the first year of life. Many of these studies involved extensive frame-by-frame video observation and analysis of a child's natural behavior. Although non-intrusive, these methods are extremely time-intensive and require a high level of observer training; thus, they are impractical for clinical and large population research purposes. Diagnostic measures for ASD are available for infants but are only accurate when used by specialists experienced in early diagnosis. This work is a first milestone in a long-term multidisciplinary project that aims at helping clinicians and general practitioners accomplish this early detection/measurement task automatically. We focus on providing computer vision tools to measure and identify ASD behavioral markers based on components of the Autism Observation Scale for Infants (AOSI). In particular, we develop algorithms to measure three critical AOSI activities that assess visual attention. We augment these AOSI activities with an additional test that analyzes asymmetrical patterns in unsupported gait. The first set of algorithms involves assessing head motion by tracking facial features, while the gait analysis relies on joint foreground segmentation and 2D body pose estimation in video. We show results that provide insightful knowledge to augment the clinician's behavioral observations obtained from real in-clinic assessments.
Open Access
DCFNet: Deep Neural Network with Decomposed Convolutional Filters
(35th International Conference on Machine Learning, ICML 2018, 2018-01-01) Qiu, Q; Cheng, X; Calderbank, R; Sapiro, G
© 35th International Conference on Machine Learning, ICML 2018. All Rights Reserved. Filters in a Convolutional Neural Network (CNN) contain model parameters learned from enormous amounts of data. In this paper, we propose to decompose the convolutional filters in a CNN as a truncated expansion with pre-fixed bases, yielding the Decomposed Convolutional Filters network (DCFNet), in which the expansion coefficients are still learned from data. Such a structure not only reduces the number of trainable parameters and the computation, but also imposes filter regularity through truncation of the bases. Through extensive experiments, we consistently observe that DCFNet maintains accuracy for image classification tasks with a significant reduction of model parameters, particularly with Fourier-Bessel (FB) bases, and even with random bases. Theoretically, we analyze the representation stability of DCFNet with respect to input variations, and prove representation stability under generic assumptions on the expansion coefficients. The analysis is consistent with the empirical observations.
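The decomposition described in this abstract can be illustrated with a minimal sketch. It assumes each L x L filter is a weighted sum of K fixed basis filters, so a layer with M input and M' output channels stores K·M·M' coefficients instead of L²·M·M' weights. The function names, the sizes, and the use of seeded random bases (rather than Fourier-Bessel bases) are illustrative assumptions, not the authors' implementation.

```python
import random

def reconstruct_filter(coeffs, bases):
    """Combine K fixed L x L basis filters using K learned coefficients."""
    size = len(bases[0])
    return [[sum(a * b[i][j] for a, b in zip(coeffs, bases)) for j in range(size)]
            for i in range(size)]

def conv_layer_params(in_ch, out_ch, filter_size):
    """Trainable weights in a standard convolutional layer (bias ignored)."""
    return in_ch * out_ch * filter_size * filter_size

def decomposed_layer_params(in_ch, out_ch, num_bases):
    """Trainable weights when each filter keeps only num_bases coefficients."""
    return in_ch * out_ch * num_bases

# Hypothetical sizes: 5 x 5 filters expanded over 3 fixed random bases.
rng = random.Random(0)
L, K = 5, 3
bases = [[[rng.gauss(0, 1) for _ in range(L)] for _ in range(L)] for _ in range(K)]
coeffs = [0.5, -1.0, 2.0]  # stand-in for coefficients that would be learned
filt = reconstruct_filter(coeffs, bases)

standard = conv_layer_params(64, 64, L)
decomposed = decomposed_layer_params(64, 64, K)
print(standard, decomposed)
```

With these assumed sizes the coefficient count is K/L² = 3/25 of the standard weight count; the bases themselves are fixed and shared, so they add no trainable parameters.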
Open Access
Geometric Cross-Modal Comparison of Heterogeneous Sensor Data
(Proceedings of the 39th IEEE Aerospace Conference, 2018-03) Tralie, CJ; Smith, A; Borggren, N; Hineman, J; Bendich, P; Zulch, P; Harer, J
In this work, we address the problem of cross-modal comparison of aerial data streams. A variety of simulated automobile trajectories are sensed using two different modalities: full-motion video, and radio-frequency (RF) signals received by detectors at various locations. The information represented by the two modalities is compared using self-similarity matrices (SSMs) corresponding to time-ordered point clouds in feature spaces of each of these data sources; we note that these feature spaces can be of entirely different scale and dimensionality. Several metrics for comparing SSMs are explored, including a cutting-edge time-warping technique that can simultaneously handle local time warping and partial matches, while also controlling for the change in geometry between feature spaces of the two modalities. We note that this technique is quite general, and does not depend on the choice of modalities. In this particular setting, we demonstrate that the cross-modal distance between SSMs corresponding to the same trajectory type is smaller than the cross-modal distance between SSMs corresponding to distinct trajectory types, and we formalize this observation via precision-recall metrics in experiments. Finally, we comment on promising implications of these ideas for future integration into multiple-hypothesis tracking systems.
Open Access
Imaging dynamics beneath turbid media via parallelized single-photon detection
(CoRR, 2021-07-03) Xu, Shiqi; Yang, Xi; Liu, Wenhui; Jonsson, Joakim; Qian, Ruobing; Konda, Pavan Chandra; Zhou, Kevin C; Kreiss, Lucas; Dai, Qionghai; Wang, Haoqian; Berrocal, Edouard; Horstmeyer, Roarke
Noninvasive optical imaging through dynamic scattering media has numerous important biomedical applications but still remains a challenging task. While standard diffuse imaging methods measure optical absorption or fluorescent emission, it is also well-established that the temporal correlation of scattered coherent light diffuses through tissue much like optical intensity. Few works to date, however, have aimed to experimentally measure and process such temporal correlation data to demonstrate deep-tissue video reconstruction of decorrelation dynamics. In this work, we utilize a single-photon avalanche diode (SPAD) array camera to simultaneously monitor the temporal dynamics of speckle fluctuations at the single-photon level from 12 different phantom tissue surface locations delivered via a customized fiber bundle array. We then apply a deep neural network to convert the acquired single-photon measurements into video of scattering dynamics beneath rapidly decorrelating tissue phantoms. We demonstrate the ability to reconstruct images of transient (0.1-0.4 s) dynamic events occurring up to 8 mm beneath a decorrelating tissue phantom with millimeter-scale resolution, and highlight how our model can flexibly extend to monitor flow speed within buried phantom vessels.
Open Access
Linearly Converging Quasi Branch and Bound Algorithms for Global Rigid Registration
Dym, N; Kovalsky, S
In recent years, several branch-and-bound (BnB) algorithms have been proposed to globally optimize rigid registration problems. In this paper, we suggest a general framework to improve upon the BnB approach, which we name Quasi-BnB. Quasi-BnB replaces the linear lower bounds used in BnB algorithms with quadratic quasi-lower bounds, which are based on the quadratic behavior of the energy in the vicinity of the global minimum. While quasi-lower bounds are not truly lower bounds, the Quasi-BnB algorithm is globally optimal. In fact, we prove that it exhibits linear convergence: it achieves $\epsilon$-accuracy in $O(\log(1/\epsilon))$ time, while the time complexity of other rigid registration BnB algorithms is polynomial in $1/\epsilon$. Our experiments verify that Quasi-BnB is significantly more efficient than state-of-the-art BnB algorithms, especially for problems where high accuracy is desired.
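The gap between logarithmic and polynomial dependence on $1/\epsilon$ can be made concrete with a generic worked example. It assumes an abstract iterative method whose error shrinks by a fixed factor $\rho$ per step; the constants $C$, $\rho$, and $p$ are illustrative assumptions and are not taken from the paper.

```latex
% Assumption: error after k steps satisfies e_k \le C \rho^k with 0 < \rho < 1.
\[
  C\rho^{k} \le \epsilon
  \iff
  k \ge \frac{\log(C/\epsilon)}{\log(1/\rho)} = O\!\left(\log(1/\epsilon)\right).
\]
% Example with C = 1, \rho = 1/2, \epsilon = 10^{-6}:
\[
  k \ge \log_2\!\left(10^{6}\right) \approx 19.93
  \quad\Longrightarrow\quad k = 20 \text{ steps},
\]
% whereas a cost polynomial in 1/\epsilon, here taken as (1/\epsilon)^p with p = 1, gives
\[
  (1/\epsilon)^{1} = 10^{6} \text{ steps}.
\]
```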
Open Access
Malignancy Prediction and Lesion Identification from Clinical Dermatological Images
(CoRR, 2021) Xia, Meng; Kheterpal, Meenal K; Wong, Samantha C; Park, Christine; Ratliff, William; Carin, Lawrence; Henao, Ricardo
We consider machine-learning-based malignancy prediction and lesion identification from clinical dermatological images, which can be acquired interchangeably via smartphone or dermoscopy capture. Additionally, we do not assume that images contain single lesions; thus, the framework supports both focal and wide-field images. Specifically, we propose a two-stage approach that first identifies all lesions present in the image regardless of sub-type or likelihood of malignancy, then estimates their likelihood of malignancy and, through aggregation, also generates an image-level likelihood of malignancy that can be used for high-level screening processes. Further, we consider augmenting the proposed approach with clinical covariates (from electronic health records) and publicly available data (the ISIC dataset). Comprehensive experiments validated on an independent test dataset demonstrate that i) the proposed approach outperforms alternative model architectures; ii) the model based on images outperforms a pure clinical model by a large margin, and the combination of images and clinical data does not significantly improve over the image-only model; and iii) the proposed framework offers comparable performance in terms of malignancy classification relative to three board-certified dermatologists with different levels of experience.
Open Access
Mesoscopic photogrammetry with an unstabilized phone camera
(CVPR 2021, 2020-12-10) Zhou, Kevin C; Cooke, Colin; Park, Jaehee; Qian, Ruobing; Horstmeyer, Roarke; Izatt, Joseph A; Farsiu, Sina
We present a feature-free photogrammetric technique that enables quantitative 3D mesoscopic (mm-scale height variation) imaging with tens-of-micron accuracy from sequences of images acquired by a smartphone at close range (several cm) under freehand motion without additional hardware. Our end-to-end, pixel-intensity-based approach jointly registers and stitches all the images by estimating a coaligned height map, which acts as a pixel-wise radial deformation field that orthorectifies each camera image to allow homographic registration. The height maps themselves are reparameterized as the output of an untrained encoder-decoder convolutional neural network (CNN) with the raw camera images as the input, which effectively removes many reconstruction artifacts. Our method also jointly estimates both the camera's dynamic 6D pose and its distortion using a nonparametric model, the latter of which is especially important in mesoscopic applications when using cameras not designed for imaging at short working distances, such as smartphone cameras. We also propose strategies for reducing computation time and memory, applicable to other multi-frame registration problems. Finally, we demonstrate our method using sequences of multi-megapixel images captured by an unstabilized smartphone on a variety of samples (e.g., painting brushstrokes, circuit board, seeds).
Open Access
Physics-enhanced machine learning for virtual fluorescence microscopy
(CoRR, 2020-04-08) Cooke, Colin L; Kong, Fanjie; Chaware, Amey; Zhou, Kevin C; Kim, Kanghyun; Xu, Rong; Ando, D Michael; Yang, Samuel J; Konda, Pavan Chandra; Horstmeyer, Roarke
This paper introduces a new method of data-driven microscope design for virtual fluorescence microscopy. Our results show that by including a model of illumination within the first layers of a deep convolutional neural network, it is possible to learn task-specific LED patterns that substantially improve the ability to infer fluorescence image information from unstained transmission microscopy images. We validated our method on two different experimental setups, with different magnifications and different sample types, to show a consistent improvement in performance as compared to conventional illumination methods. Additionally, to understand the importance of learned illumination for the inference task, we varied the dynamic range of the fluorescent image targets (from one to seven bits), and showed that the margin of improvement for learned patterns increased with the information content of the target. This work demonstrates the power of programmable optical elements in enabling better machine learning algorithm performance and in providing physical insight into the next generation of machine-controlled imaging systems.
Open Access
(Quasi)Periodicity Quantification in Video Data, Using Topology
(2017-12-11) Tralie, CJ; Perea, JA
This work introduces a novel framework for quantifying the presence and strength of recurrent dynamics in video data. Specifically, we provide continuous measures of periodicity (perfect repetition) and quasiperiodicity (superposition of periodic modes with non-commensurate periods) in a way that does not require segmentation, training, object tracking or 1-dimensional surrogate signals. Our methodology operates directly on video data. The approach combines ideas from nonlinear time series analysis (delay embeddings) and computational topology (persistent homology) by translating the problem of finding recurrent dynamics in video data into the problem of determining the circularity or toroidality of an associated geometric space. Through extensive testing, we show the robustness of our scores with respect to several noise models/levels; we show that our periodicity score is superior to other methods when compared to human-generated periodicity rankings; and furthermore, we show that our quasiperiodicity score clearly indicates the presence of biphonation in videos of vibrating vocal folds.
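The delay-embedding step mentioned in this abstract can be sketched on a toy example. The sketch assumes a "video" given as a list of flattened frames and stacks consecutive frames into one vector per time step; for an exactly periodic input, points one period apart coincide, so the embedded trajectory closes into a loop. The function name, the toy data, and the window length are illustrative assumptions, and the persistent-homology scoring from the abstract is not computed here.

```python
import math

def sliding_window_embedding(frames, window, step=1):
    """Stack `window` consecutive frames (flat lists of pixel values) into one vector."""
    return [
        [value for frame in frames[start:start + window] for value in frame]
        for start in range(0, len(frames) - window + 1, step)
    ]

# Hypothetical toy "video": 2 pixels per frame, exactly periodic with period 8 frames.
period = 8
frames = [[math.cos(2 * math.pi * t / period), math.sin(4 * math.pi * t / period)]
          for t in range(32)]

cloud = sliding_window_embedding(frames, window=4)

# Points one full period apart coincide (up to rounding); half a period apart they do not.
gap_full = math.dist(cloud[0], cloud[period])
gap_half = math.dist(cloud[0], cloud[period // 2])
print(gap_full < 1e-9, gap_half > 1.0)
```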
Open Access
RotDCF: Decomposition of Convolutional Filters for Rotation-Equivariant Deep Networks
(CoRR, 2018) Cheng, X; Qiu, Q; Calderbank, R; Sapiro, G
Explicit encoding of group actions in deep features makes it possible for convolutional neural networks (CNNs) to handle global deformations of images, which is critical to success in many vision tasks. This paper proposes to decompose the convolutional filters over joint steerable bases across the space and the group geometry simultaneously, namely a rotation-equivariant CNN with decomposed convolutional filters (RotDCF). This decomposition facilitates computing the joint convolution, which is proved to be necessary for the group equivariance. It significantly reduces the model size and computational complexity while preserving performance, and truncation of the bases expansion serves implicitly to regularize the filters. On datasets involving in-plane and out-of-plane object rotations, RotDCF deep features demonstrate greater robustness and interpretability than regular CNNs. The stability of the equivariant representation to input variations is also proved theoretically under generic assumptions on the filters in the decomposed form. The RotDCF framework can be extended to groups other than rotations, providing a general approach which achieves both group equivariance and representation stability at a reduced model size.
Open Access
Self-Similarity Based Time Warping
(2017-12-11) Tralie, CJ
In this work, we explore the problem of aligning two time-ordered point clouds which are spatially transformed and re-parameterized versions of each other. This has a diverse array of applications, such as cross-modal time series synchronization (e.g., MOCAP to video) and alignment of discretized curves in images. Most other works that address this problem attempt to jointly uncover a spatial alignment and correspondences between the two point clouds, or to derive local invariants to spatial transformations, such as curvature, before computing correspondences. By contrast, we sidestep spatial alignment completely by using self-similarity matrices (SSMs) as a proxy to the time-ordered point clouds, since self-similarity matrices are blind to isometries and respect global geometry. Our algorithm, dubbed "Isometry Blind Dynamic Time Warping" (IBDTW), is simple and general, and we show that its associated dissimilarity measure lower bounds the L1 Gromov-Hausdorff distance between the two point sets when restricted to warping paths. We also present a local, partial alignment extension of IBDTW based on the Smith-Waterman algorithm. This eliminates the need for tedious manual cropping of time series, which is ordinarily necessary for global alignment algorithms to function properly.
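The claim that self-similarity matrices are blind to isometries can be checked on a small example. The sketch below assumes a planar point cloud and the Euclidean metric; it builds the SSM of a toy curve and of a rotated and translated copy, then compares them entry by entry. The function names and the toy curve are illustrative assumptions, and this is not an implementation of IBDTW itself.

```python
import math

def self_similarity_matrix(points):
    """Return all pairwise Euclidean distances of a time-ordered point cloud."""
    return [[math.dist(p, q) for q in points] for p in points]

def rotate_and_translate(points, angle, shift):
    """Apply a planar rotation followed by a translation (an isometry of the plane)."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y + shift[0], s * x + c * y + shift[1]) for x, y in points]

# Hypothetical toy curve: 20 samples along a planar spiral.
curve = [(t * math.cos(t), t * math.sin(t)) for t in (0.3 * i for i in range(20))]
moved = rotate_and_translate(curve, angle=1.1, shift=(5.0, -2.0))

ssm_a = self_similarity_matrix(curve)
ssm_b = self_similarity_matrix(moved)

# Largest entrywise difference between the two SSMs; zero up to floating-point rounding.
max_diff = max(abs(a - b) for row_a, row_b in zip(ssm_a, ssm_b)
               for a, b in zip(row_a, row_b))
print(max_diff < 1e-9)
```

Because the two SSMs agree, any alignment computed from them alone would be unaffected by the rotation and translation, which is the property the abstract relies on to avoid spatial alignment.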
Open Access
Stop memorizing: A data-dependent regularization framework for intrinsic pattern learning
Zhu, Wei; Qiu, Qiang; Wang, Bao; Lu, Jianfeng; Sapiro, Guillermo; Daubechies, Ingrid
Deep neural networks (DNNs) typically have enough capacity to fit random data by brute force even when conventional data-dependent regularizations focusing on the geometry of the features are imposed. We find that the reason for this is the inconsistency between the enforced geometry and the standard softmax cross entropy loss. To resolve this, we propose a new framework for data-dependent DNN regularization, the Geometrically-Regularized-Self-Validating neural Networks (GRSVNet). During training, the geometry enforced on one batch of features is simultaneously validated on a separate batch using a validation loss consistent with the geometry. We study a particular case of GRSVNet, the Orthogonal-Low-rank Embedding (OLE)-GRSVNet, which is capable of producing highly discriminative features residing in orthogonal low-rank subspaces. Numerical experiments show that OLE-GRSVNet outperforms DNNs with conventional regularization when trained on real data. More importantly, unlike conventional DNNs, OLE-GRSVNet refuses to memorize random data or random labels, suggesting it only learns intrinsic patterns by reducing the memorizing capacity of the baseline DNN.
Open Access
Thyroid Cancer Malignancy Prediction From Whole Slide Cytopathology Images
Dov, David; Kovalsky, Shahar; Cohen, Jonathan; Range, Danielle; Henao, Ricardo; Carin, Lawrence
We consider preoperative prediction of thyroid cancer based on ultra-high-resolution whole-slide cytopathology images. Inspired by how human experts perform diagnosis, our approach first identifies and classifies diagnostic image regions containing informative thyroid cells, which only comprise a tiny fraction of the entire image. These local estimates are then aggregated into a single prediction of thyroid malignancy. Several unique characteristics of thyroid cytopathology guide our deep-learning-based approach. While our method is closely related to multiple-instance learning, it deviates from these methods by using a supervised procedure to extract diagnostically relevant regions. Moreover, we propose to simultaneously predict thyroid malignancy, as well as a diagnostic score assigned by a human expert, which further allows us to devise an improved training strategy. Experimental results show that the proposed algorithm achieves performance comparable to human experts, and demonstrate the potential of using the algorithm for screening and as an assistive tool for the improved diagnosis of indeterminate cases.
Open Access
Towards an Intelligent Microscope: adaptively learned illumination for optimal sample classification
(CoRR, 2019) Chaware, A; Cooke, CL; Kim, K; Horstmeyer, R
Recent machine learning techniques have dramatically changed how we process digital images. However, the way in which we capture images is still largely driven by human intuition and experience. This restriction is in part due to the many available degrees of freedom that alter the image acquisition process (lens focus, exposure, filtering, etc.). Here we focus on one such degree of freedom, illumination within a microscope, which can drastically alter the information captured by the image sensor. We present a reinforcement learning system that adaptively explores optimal patterns to illuminate specimens for immediate classification. The agent uses a recurrent latent space to encode a large set of variably-illuminated samples and illumination patterns. We train our agent using a reward that balances classification confidence with image acquisition cost. By synthesizing knowledge over multiple snapshots, the agent can classify on the basis of all previous images with higher accuracy than from naively illuminated images, thus demonstrating a smarter way to physically capture task-specific information.
