Browsing by Subject "Signal processing"
Item Open Access Adaptive Data Representation and Analysis (2018) Xu, Jieren
This dissertation introduces and analyzes algorithms that aim to adaptively handle complex datasets arising in real-world applications. It contains two major parts. The first part describes an adaptive model of 1-dimensional signals that lies in the field of adaptive time-frequency analysis. It explains a current state-of-the-art method in this field, the Synchrosqueezed transform, and then presents two proposed algorithms that use non-parametric regression to reveal the underlying oscillatory patterns of the targeted 1-dimensional signal, as well as to estimate the instantaneous information, e.g., instantaneous frequency, phase, or amplitude functions, via a statistical pattern-driven model.
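To make the estimation task concrete, the following is a minimal sketch (not the thesis's algorithm) of extracting an instantaneous-frequency curve from a spectrogram ridge and smoothing it with a simple non-parametric regressor; the chirp and all parameters are illustrative.

```python
import numpy as np
from scipy.signal import stft

# Illustrative chirp whose instantaneous frequency is 50 + 12.5*t Hz.
fs = 1000
t = np.arange(0, 4, 1 / fs)
x = np.cos(2 * np.pi * (50 * t + 6.25 * t**2))

f, frames, Z = stft(x, fs=fs, nperseg=256, noverlap=224)

# Crude ridge estimate: frequency bin of maximum energy per frame.
ridge = f[np.abs(Z).argmax(axis=0)]

# Moving-average smoothing stands in for the statistical
# pattern-driven regression developed in the dissertation.
if_est = np.convolve(ridge, np.ones(9) / 9, mode="same")
```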
The second part proposes a population-based imaging technique for human brain bundle/connectivity recovery. It applies local streamlines as newly adopted learning/testing features to segment the brain white matter and thus reconstruct the whole-brain information. It also develops a module, called streamline diffusion filtering, to improve the streamline sampling procedure.
Even though these two parts are not directly related, they both rely on an alignment step that registers the latent variables to a common coordinate system and thus facilitates the final inference. Numerical results are shown to validate all the proposed algorithms.
Item Open Access Applied Millimeter Wave Radar Vibrometry (2023) Centers, Jessica
In this dissertation, novel uses of millimeter-wave (mmW) radars are developed and analyzed. While automotive mmW radars have become ubiquitous in advanced driver assistance systems (ADAS), their ability to sense motions at sub-millimeter scale allows them to also find application in systems that require accurate measurements of surface vibrations. While laser Doppler vibrometers (LDVs) are routinely used to measure such vibrations, the lower size, weight, power, and cost (SWaP-C) of mmW radars makes vibrometry viable for a variety of new applications. In this work, we consider two such applications: everything-to-vehicle (X2V) wireless communications and non-acoustic human speech analysis.
Within this dissertation, a wireless communication system that uses the radar as a vibrometer is introduced. This system, termed vibrational radar backscatter communications (VRBC), receives messages by observing phase modulations on the radar signal that are caused by vibrations on the surface of a transponder over time. It is shown that this form of wireless communication provides the ability to simultaneously detect, isolate, and decode messages from multiple sources thanks to the spatial resolution of the radar. Additionally, VRBC requires no RF emission on the end of the transponder. Since automotive radars and conventional X2V solutions are often at odds over spectrum allocation, this characteristic of VRBC is especially valuable.
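The decoding principle admits a compact illustration: the slow-time samples of the transponder's range bin carry a phase proportional to surface displacement, so unwrapping that phase recovers the vibration waveform, and symbol decisions follow. The on/off keying scheme, carrier wavelength, and symbol length below are illustrative assumptions, not the parameters of the actual system.

```python
import numpy as np

fs_slow = 1000.0             # slow-time sampling (chirp) rate, Hz (assumed)
lam = 3.9e-3                 # ~77 GHz wavelength, m (assumed)
f_vib, sym_len = 100.0, 200  # transponder resonance, symbol length (assumed)
bits = np.array([1, 0, 1, 1, 0])

# Surface vibrates with ~5 um amplitude only during '1' symbols.
n = np.arange(len(bits) * sym_len)
disp = 5e-6 * np.repeat(bits, sym_len) * np.sin(2 * np.pi * f_vib * n / fs_slow)

# Slow-time return from the transponder's range bin: phase ∝ displacement.
z = np.exp(1j * (4 * np.pi * disp / lam + 0.7))

vib = np.unwrap(np.angle(z))          # recovered vibration (+ constant)
vib -= vib.mean()
energy = (vib.reshape(len(bits), sym_len) ** 2).sum(axis=1)
decoded = (energy > 0.5 * energy.max()).astype(int)   # -> [1 0 1 1 0]
```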
Using an off-the-shelf, resonant transponder, a real VRBC data collection is presented and used to demonstrate the signal processing techniques necessary to decode a VRBC message. This data collection achieves a data rate of just under 100 bps at a range of approximately 5 meters. Rates of this scale can provide warning messages or concise situational awareness information in applications such as X2V, but naturally higher rates are desirable. For that reason, this dissertation includes discussion of how to improve the VRBC system design via transponder design, choice of messaging scheme, and any afforded flexibility in radar parameter choice.
Through the use of an analytical upper bound on VRBC rate and simulation results, we see that rates closer to 1 kbps should be achievable for a transponder approximately the size of a license plate at ranges under 200 meters. The added benefits of requiring no RF spectrum or network scheduling protocols uniquely position VRBC as a desirable solution in spaces like X2V over commonly considered, higher-rate solutions such as dedicated short-range communications (DSRC).
Upon implementing a VRBC system, a handful of complications were encountered, and this document devotes a full chapter to solving them. This includes properly modeling intersymbol interference caused by resonant surfaces and utilizing sequence detection methods rather than single-symbol maximum likelihood methods to improve detection in these cases. Additionally, an analysis of what an ideal clutter filter should look like and how it can begin to be achieved is presented. Lastly, a method for mitigating platform vibrational noise at both the radar and the transponder is presented. Using these methods, message detection errors are better avoided, though system design choices fundamentally limit the achievable rates.
Towards non-acoustic human speech analysis, it is shown in this dissertation that the vibrations of a person's throat during speech generation can be accurately captured using a mmW radar. These measurements prove to be similar to those achieved by the more expensive vibrometry alternative of an LDV, with less than 10 dB of SNR degradation at the first two speech harmonics in the signal's spectrogram. Furthermore, we find that mmW radar vibrometry data resembles a low-pass filtered version of its corresponding acoustic data. We show that this type of data achieves 53% performance in a speaker identification system as opposed to 11% in a speech recognition system. This performance suggests potential for mmW radar vibrometry in context-blind speaker identification systems, provided the speaker identification performance can be further improved without making the context of the speech more recognizable.
In this dissertation, mmW radar vibrational returns are modelled and signal processing chains are provided that allow these vibrations to be estimated and used in applications. In many cases, the work outlined could be used in other areas of mmW radar vibrometry even though it was originally motivated by potentially unrelated applications. It is the hope of this dissertation that the provided models, signal processing methods, visualizations, analytical bounds, and results not only justify mmW radar in human speech analysis and backscatter communications, but also contribute to the community's understanding of how certain vibrational movements can best be observed, processed, and made useful more broadly.
Item Open Access Bayesian and Information-Theoretic Learning of High Dimensional Data (2012) Chen, Minhua
The concept of sparseness is harnessed to learn a low dimensional representation of high dimensional data. This sparseness assumption is exploited in multiple ways. In the Bayesian Elastic Net, a small number of correlated features are identified for the response variable. In the sparse Factor Analysis for biomarker trajectories, the high dimensional gene expression data is reduced to a small number of latent factors, each with a prototypical dynamic trajectory. In the Bayesian Graphical LASSO, the inverse covariance matrix of the data distribution is assumed to be sparse, inducing a sparsely connected Gaussian graph. In the nonparametric Mixture of Factor Analyzers, the covariance matrices in the Gaussian Mixture Model are forced to be low-rank, which is closely related to the concept of block sparsity.
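As one concrete instance of the sparse-inverse-covariance idea, the sketch below uses scikit-learn's GraphicalLasso, a standard (non-Bayesian) implementation rather than the dissertation's model, to recover a sparsely connected Gaussian graph from samples.

```python
import numpy as np
from sklearn.covariance import GraphicalLasso
from sklearn.datasets import make_sparse_spd_matrix

# Ground-truth sparse precision (inverse covariance) matrix.
prec = make_sparse_spd_matrix(10, alpha=0.9, random_state=0)
rng = np.random.RandomState(0)
X = rng.multivariate_normal(np.zeros(10), np.linalg.inv(prec), size=500)

model = GraphicalLasso(alpha=0.05).fit(X)
edges = np.abs(model.precision_) > 1e-3  # off-diagonal nonzeros = graph edges
print(int(edges.sum()) - 10, "off-diagonal nonzeros in the estimated graph")
```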
Finally in the information-theoretic projection design, a linear projection matrix is explicitly sought for information-preserving dimensionality reduction. All the methods mentioned above prove to be effective in learning both simulated and real high dimensional datasets.
Item Open Access Classification and Characterization of Heart Sounds to Identify Heart Abnormalities (2019) LaPorte, Emma
The main function of the human heart is to act as a pump, facilitating the delivery of oxygenated blood to the many cells within the body. Heart failure (HF) is the medical condition in which the heart cannot adequately pump blood to the body, often resulting from other conditions such as coronary artery disease, previous heart attacks, high blood pressure, diabetes, or abnormal heart valves. HF afflicts approximately 6.5 million adults in the US alone [1] and often manifests as fatigue, shortness of breath, increased heart rate, confusion, and more, resulting in a lower quality of life for those afflicted. At the earliest stage of HF, an adequate treatment plan can be relatively manageable, including healthy lifestyle changes such as eating better and exercising more. However, the symptoms (and the heart) worsen over time if left untreated, requiring more extreme treatment such as surgical intervention and/or a heart transplant [2]. Given the magnitude of this condition, there is potential for large impact both in (1) automating (and thus expediting) the diagnosis of HF and (2) improving HF treatment options and care. These topics are explored in this work.
An early diagnosis of HF is beneficial because HF left untreated results in an increasingly severe condition, requiring more extreme treatment and care. Typically, HF is first diagnosed by a physician during auscultation, the act of listening to sounds from the heart through a stethoscope [3]. Physicians are therefore trained to listen to heart sounds and identify them as normal or abnormal. Heart sounds are the acoustic result of the internal pumping mechanism of the heart. When the heart is functioning normally, there is a resulting acoustic spectrum representing normal heart sounds, which a physician listens to and identifies as normal. When the heart is functioning abnormally, the resulting acoustic spectrum differs from that of normal heart sounds, and a physician identifies it as abnormal [3]–[5].
One goal of this work is to automate the auscultation process by developing a machine learning algorithm to identify heart sounds as normal or abnormal. An algorithm is developed for this work that extracts features from a digital stethoscope recording and classifies the recording as normal or abnormal. An extensive feature extraction and selection analysis is performed, ultimately resulting in a classification algorithm with an accuracy score of 0.85. This accuracy score is comparable to current high-performing heart sound classification algorithms [6].
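The extract-then-classify pattern the abstract describes can be sketched in a few lines; the spectral summary features and the random-forest classifier below are generic stand-ins, not the feature set selected in this work.

```python
import numpy as np
from scipy.signal import welch
from sklearn.ensemble import RandomForestClassifier

def heart_sound_features(x, fs=2000):
    """Generic spectral summary features for one stethoscope recording."""
    f, pxx = welch(x, fs=fs, nperseg=1024)
    pxx = pxx / pxx.sum()
    centroid = (f * pxx).sum()
    bandwidth = np.sqrt(((f - centroid) ** 2 * pxx).sum())
    return np.array([centroid, bandwidth, pxx[f < 150].sum(), x.std()])

# With labeled recordings (0 = normal, 1 = abnormal):
# X = np.vstack([heart_sound_features(r) for r in recordings])
# clf = RandomForestClassifier(n_estimators=200).fit(X, labels)
```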
The purpose of the first portion of this work is to automate the HF diagnosis process, allowing for more frequent diagnoses and at an earlier stage of HF. For an individual already diagnosed with HF, there is potential to improve current treatment and care. Specifically, if the HF is extreme, an individual may require a surgically implanted medical device called a Left Ventricular Assist Device (LVAD). The purpose of an LVAD is to assist the heart in pumping blood when the heart cannot adequately do so on its own. Although life-saving, LVADs have a high complication rate. These complications are difficult to identify prior to a catastrophic outcome. Therefore, there is a need to monitor LVAD patients to identify these complications. Current methods of monitoring individuals and their LVADs are invasive or require an in-person hospital visit. Acoustical monitoring has the potential to non-invasively remotely monitor LVAD patients to identify abnormalities at an earlier stage. However, this is made difficult because the LVAD pump noise obscures the acoustic spectrum of the native heart sounds.
The second portion of this work focuses on this specific case of HF, in which an individual’s treatment plan includes an LVAD. A signal processing pipeline is proposed to extract the heart sounds in the presence of the LVAD pump noise. The pipeline includes downsampling, filtering, and a heart sound segmentation algorithm to identify states of the cardiac cycle: S1, S2, systole, and diastole. These states are validated using two individuals’ digital stethoscope recordings by comparing the labeled states to the characteristics expected of heart sounds. Both subjects’ labeled states closely paralleled the expectations of heart sounds, validating the signal processing pipeline developed for this work.
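The first two stages of such a pipeline map directly onto standard SciPy calls; the decimation factor and pass band below are illustrative choices, and the segmentation stage (state labeling of S1/S2/systole/diastole) is beyond a few lines.

```python
from scipy.signal import butter, decimate, sosfiltfilt

def preprocess(x, fs=8000, down=4, band=(25.0, 400.0)):
    """Downsample, then band-pass to a plausible heart-sound band.

    The factor of 4 and the 25-400 Hz band are illustrative only.
    """
    x = decimate(x, down)                   # anti-aliased downsampling
    fs = fs / down
    sos = butter(4, band, btype="bandpass", fs=fs, output="sos")
    return sosfiltfilt(sos, x), fs
```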
This exploratory analysis can be furthered with the ongoing data collection process. With enough data, the goal is to extract clinically relevant information from the underlying heart sounds to assess cardiac function and identify LVAD dysfunction prior to a catastrophic outcome. Ultimately, this non-invasive, remote model will allow for earlier diagnosis of LVAD complications.
In total, this work serves two main purposes: the first is developing a machine learning algorithm that automates the HF diagnosis process; the second is extracting heart sounds in the presence of LVAD noise. Both of these topics further the goal of earlier diagnosis and therefore better outcomes for those afflicted with HF.
Item Open Access Compressive Sensing in Transmission Electron Microscopy (2018) Stevens, Andrew
Electron microscopy is one of the most powerful tools available in observational science. Magnifications of 10,000,000x have been achieved with picometer precision. At this high level of magnification, individual atoms are visible. This is possible because the wavelength of electrons is much smaller than visible light, which also means that the highly focused electron beams used to perform imaging contain significantly more energy than visible light. The beam energy is high enough that it can cause radiation damage to metal specimens. Reducing radiation dose while maintaining image quality has been a central research topic in electron microscopy for several decades. Without the ability to reduce the dose, most organic and biological specimens cannot be imaged at atomic resolution. Fundamental processes in materials science and biology arise at the atomic level, thus understanding these processes can only occur if the observational tools can capture information with atomic resolution.
The primary objective of this research is to develop new techniques for low dose and high resolution imaging in (scanning) transmission electron microscopy (S/TEM). This is achieved through the development of new machine learning based compressive sensing algorithms and microscope hardware for acquiring a subset of the pixels in an image. Compressive sensing allows recovery of a signal from significantly fewer measurements than total signal size (under certain conditions). The research objective is attained by demonstrating application of compressive sensing to S/TEM in several simulations and real microscope experiments. The data types considered are images, videos, multispectral images, tomograms, and 4-dimensional ptychographic data. In the simulations, image quality and error metrics are defined to verify that reducing dose is possible with a small impact on image quality. In the microscope experiments, images are acquired with and without compressive sensing so that a qualitative verification can be performed.
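A toy version of the recovery problem, reconstructing an image from a subset of its pixels via iterative soft thresholding in a DCT basis, illustrates the inpainting-style formulation; the dissertation's Bayesian dictionary-learning reconstructions are considerably more sophisticated.

```python
import numpy as np
from scipy.fft import dctn, idctn

def recover(y, mask, n_iter=200, lam=0.1):
    """Fill unobserved pixels assuming sparsity in the 2-D DCT domain.

    y: image with unobserved pixels zeroed; mask: True where observed.
    """
    x = y.copy()
    for _ in range(n_iter):
        c = dctn(x, norm="ortho")
        c = np.sign(c) * np.maximum(np.abs(c) - lam, 0.0)  # soft threshold
        x = idctn(c, norm="ortho")
        x[mask] = y[mask]              # re-impose measured pixels
    return x
```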
Compressive sensing is shown to be an effective approach to reduce dose in S/TEM without sacrificing image quality. Moreover, it offers increased acquisition speed and reduced data size. Research leading to this dissertation has been published in 25 articles or conference papers and 5 patent applications have been submitted. The published papers include contributions to machine learning, physics, chemistry, and materials science. The newly developed pixel sampling hardware is being productized so that other microscopists can use compressive sensing in their experiments. In the future, scientific imaging devices (e.g., scanning transmission x-ray microscopy (STXM) and secondary-ion mass spectrometry (SIMS)) could also benefit from the techniques presented in this dissertation.
Item Open Access Detection and Classification of Whale Acoustic Signals (2016) Xian, Yin
This dissertation focuses on two vital challenges in relation to whale acoustic signals: detection and classification.
In detection, we evaluated the influence of the uncertain ocean environment on the spectrogram-based detector, and derived the likelihood ratio of the proposed Short Time Fourier Transform detector. Experimental results showed that the proposed detector outperforms detectors based on the spectrogram. The proposed detector is more sensitive to environmental changes because it includes phase information.
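The distinction can be made concrete: a spectrogram detector correlates squared magnitudes and discards phase, while an STFT detector correlates the complex coefficients themselves. A toy pair of statistics for a known template (not the derived likelihood ratio) might look as follows.

```python
import numpy as np
from scipy.signal import stft

def detection_stats(x, template, fs):
    """Phase-aware (STFT) vs. magnitude-only (spectrogram) statistics."""
    _, _, Zx = stft(x, fs=fs, nperseg=256)
    _, _, Zt = stft(template, fs=fs, nperseg=256)
    coherent = np.abs(np.sum(Zx * np.conj(Zt)))              # keeps phase
    incoherent = np.sum(np.abs(Zx) ** 2 * np.abs(Zt) ** 2)   # discards it
    return coherent, incoherent
```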
In classification, our focus is on finding a robust and sparse representation of whale vocalizations. Because whale vocalizations can be modeled as polynomial phase signals, we can represent the whale calls by their polynomial phase coefficients. In this dissertation, we used the Weyl transform to capture chirp rate information, and used a two dimensional feature set to represent whale vocalizations globally. Experimental results showed that our Weyl feature set outperforms chirplet coefficients and MFCC (Mel Frequency Cepstral Coefficients) when applied to our collected data.
Since whale vocalizations can be represented by polynomial phase coefficients, it is plausible that the signals lie on a manifold parameterized by these coefficients. We also studied the intrinsic structure of high dimensional whale data by exploiting its geometry. Experimental results showed that nonlinear mappings such as Laplacian Eigenmap and ISOMAP outperform linear mappings such as PCA and MDS, suggesting that the whale acoustic data is nonlinear.
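The linear-versus-nonlinear comparison can be reproduced on any feature matrix with scikit-learn, where Isomap and SpectralEmbedding (Laplacian eigenmaps) stand in for the nonlinear mappings named above.

```python
from sklearn.decomposition import PCA
from sklearn.manifold import Isomap, SpectralEmbedding

def embed_all(X, dim=2):
    """X: (n_calls, n_features) matrix of whale-call features."""
    return {
        "pca": PCA(n_components=dim).fit_transform(X),
        "isomap": Isomap(n_components=dim).fit_transform(X),
        "laplacian": SpectralEmbedding(n_components=dim).fit_transform(X),
    }
```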
We also explored deep learning algorithms on whale acoustic data. We built each layer as convolutions with either a PCA filter bank (PCANet) or a DCT filter bank (DCTNet). With the DCT filter bank, each layer has a different time-frequency scale representation, and from this, one can extract different physical information. Experimental results showed that our PCANet and DCTNet achieve a high classification rate on the whale vocalization data set. The word error rate of the DCTNet feature is similar to that of MFSC features in speech recognition tasks, suggesting that the convolutional network is able to reveal the acoustic content of speech signals.
Item Open Access Efficient and Collaborative Methods for Distributed Machine Learning (2023) Diao, Enmao
In recent years, there has been a significant expansion in the scale and complexity of neural networks, resulting in significant demand for data, computation, and energy resources. In this light, it is crucial to enhance and optimize the efficiency of these ML models and algorithms. Additionally, the rise in computational capabilities of modern devices has prompted a shift towards distributed systems that enable localized data storage and model training. While this evolution promises substantial potential, it introduces a series of challenges: addressing the heterogeneity across systems, data, models, and supervision; balancing the trade-off among communication, computation, and performance; and building a community of shared interest to encourage collaboration in the emerging era of Artificial General Intelligence (AGI). In this dissertation, we contribute to the establishment of a theoretically justified, methodologically comprehensive, and universally applicable Efficient and Collaborative Distributed Machine Learning framework.
Specifically, in Part I, we contribute methodologies for Efficient Machine Learning, covering both learning and inference. In this direction, we propose a parameter-efficient model, Restricted Recurrent Neural Networks (RRNN), that leverages the recurrent structure of RNNs with weight sharing in order to improve learning efficiency. We also introduce an optimal measure of vector sparsity named the PQ Index (PQI), and postulate a hypothesis connecting this sparsity measure and the compressibility of neural networks. Based on this, we propose a Sparsity-informed Adaptive Pruning (SAP) algorithm that adaptively determines the pruning ratio to enhance inference efficiency.
In Part II, we address both efficiency and collaboration in Distributed Machine Learning. We introduce Distributed Recurrent Autoencoders for Scalable Image Compression (DRASIC), a data-driven Distributed Source Coding framework that can compress heterogeneous data in a scalable and distributed manner. We then propose Heterogeneous Federated Learning (HeteroFL), demonstrating the feasibility of training localized heterogeneous models to create a global inference model. Subsequently, we propose a new Federated Learning (FL) framework, SemiFL, to tackle Semi-Supervised Federated Learning (SSFL) for clients with unlabeled data; this method performs comparably with state-of-the-art centralized Semi-Supervised Learning (SSL) and fully supervised FL techniques. Finally, we propose Gradient Assisted Learning (GAL) to enable collaboration among multiple organizations without sharing data, models, or objective functions; this method significantly outperforms local learning baselines and achieves near-oracle performance.
In Part III, we develop collaborative applications for building a community of shared interest. We apply SemiFL to Keyword Spotting (KWS), a technique widely used in virtual assistants. Numerical experiments demonstrate that one can train models from scratch, or transfer from pre-trained models, to leverage heterogeneous unlabeled on-device data using only a small amount of labeled data from the server.
Finally, we propose Decentralized Multi-Target Cross-Domain Recommendation (DMTCDR), which enhances the recommendation performance of decentralized organizations without compromising data privacy or model confidentiality.
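For orientation, the server-side step that the federated methods above (HeteroFL, SemiFL) build on can be sketched as weighted parameter averaging; this is generic FedAvg, not HeteroFL's width-scaled aggregation or SemiFL's semi-supervised training loop.

```python
import numpy as np

def fedavg(client_weights, client_sizes):
    """Average client models, weighted by local dataset size.

    client_weights: one list of parameter arrays per client.
    client_sizes:   number of local samples per client.
    """
    total = float(sum(client_sizes))
    return [
        sum(w[k] * (n / total) for w, n in zip(client_weights, client_sizes))
        for k in range(len(client_weights[0]))
    ]
```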
Item Open Access High Resolution Continuous Active Sonar (2017) Soli, Jonathan Boyd
This dissertation presents waveform design and signal processing methods for continuous active sonar (CAS). The work focuses on methods for achieving high range, Doppler, and angular resolution, while maintaining a high signal-to-interference-plus-noise ratio (SINR).
CAS systems transmit at or near 100% duty cycle for improved update rates compared to pulsed systems. For this reason, CAS is particularly attractive for use in shallow, reverberation-limited environments to provide more "hits" to adequately reject false alarms due to reverberation. High resolution is particularly important for CAS systems operating in shallow water for three reasons: (1) to separate target returns from the direct blast, (2) to separate targets from reverberation, and (3) to resolve direct and multipath target returns for maximum SINR. This dissertation presents two classes of high resolution CAS waveform designs and complementary signal processing techniques.
The first class of waveforms presented are co-prime comb signals that achieve high range and Doppler resolution at the cost of range ambiguities. Co-prime combs consist of multiple tones at non-uniformly spaced frequencies chosen according to a 2-level nested co-prime array. Specialized non-matched-filter processing enables recovery of a range-velocity response similar to that of a uniform comb, but using fewer tonal components. Cramér-Rao bounds on range and Doppler estimation errors are derived for an arbitrary comb signal and used as a benchmark for comparing three range-velocity processing algorithms. Co-prime comb results from the Littoral CAS 2015 (LCAS-15) sea trial are also presented, as well as a strategy to mitigate range ambiguities. An adaptive beamformer is also presented that achieves high angular resolution by leveraging the waveform's tonal components for snapshot support.
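A co-prime comb can be sketched as a sum of tones whose frequency indices form the union of two co-prime gratings; the (M, N) pair, base frequency, and spacing below are illustrative, and the exact 2-level nested selection in the dissertation differs in detail.

```python
import numpy as np

def coprime_comb(M=4, N=5, f0=1000.0, df=50.0, T=1.0, fs=8000):
    """Multi-tone comb on a co-prime (M, N) frequency index set."""
    idx = sorted(set(M * np.arange(N)) | set(N * np.arange(M)))
    t = np.arange(int(T * fs)) / fs
    freqs = f0 + df * np.array(idx)
    return sum(np.cos(2 * np.pi * f * t) for f in freqs), freqs
```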
The second class of waveforms presented are slow-time Costas (SLO-CO) CAS signals that achieve high range resolution, but are relatively insensitive to Doppler. SLO-CO CAS signals consist of multiple short duration linear FM (LFM) chirps that are frequency-hopped according to a Costas code. Rapid range updates can be achieved by processing each SLO-CO sub-chirp independently in a cyclical manner. Results from the LCAS-15 trial validate the performance of a SLO-CO signal in a real shallow water environment. A range processing method, novel to sonar, called bandwidth synthesis (BWS) is also presented. This method uses autoregressive modeling together with linear-predictive extrapolation to synthetically extend the bandwidth of received sonar returns. It is shown that BWS results in increased SINR and improved range resolution over conventional matched filtering in the reverberation-limited LCAS-15 environment.
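The SLO-CO construction itself is compact: short LFM sub-chirps whose center frequencies hop according to a Costas permutation. The sketch below uses a genuine Welch-construction Costas sequence for p = 11 (powers of the primitive root 2 modulo 11); the band, hop spacing, and chirp duration are illustrative rather than the LCAS-15 design.

```python
import numpy as np

def slo_co(costas=(2, 4, 8, 5, 10, 9, 7, 3, 6, 1),
           f0=8000.0, step=200.0, bw=150.0, t_chirp=0.1, fs=48000):
    """Concatenated LFM sub-chirps frequency-hopped by a Costas code."""
    t = np.arange(int(t_chirp * fs)) / fs
    chirps = []
    for c in costas:
        fc = f0 + c * step       # center frequency of this sub-chirp
        phase = 2 * np.pi * ((fc - bw / 2) * t + (bw / (2 * t_chirp)) * t**2)
        chirps.append(np.cos(phase))
    return np.concatenate(chirps)
```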
Item Open Access Learning from Geometry (2016) Huang, Jiaji
Subspaces and manifolds are two powerful models for high dimensional signals. Subspaces model linear correlation and are a good fit to signals generated by physical systems, such as frontal images of human faces and multiple sources impinging at an antenna array. Manifolds model sources that are not linearly correlated, but where signals are determined by a small number of parameters. Examples are images of human faces under different poses or expressions, and handwritten digits with varying styles. However, there will always be some degree of model mismatch between the subspace or manifold model and the true statistics of the source. This dissertation exploits subspace and manifold models as prior information in various signal processing and machine learning tasks.
A near-low-rank Gaussian mixture model measures proximity to a union of linear or affine subspaces. This simple model can effectively capture the signal distribution when each class is near a subspace. This dissertation studies how the pairwise geometry between these subspaces affects classification performance. When model mismatch is vanishingly small, the probability of misclassification is determined by the product of the sines of the principal angles between subspaces. When the model mismatch is more significant, the probability of misclassification is determined by the sum of the squares of the sines of the principal angles. Reliability of classification is derived in terms of the distribution of signal energy across principal vectors. Larger principal angles lead to smaller classification error, motivating a linear transform that optimizes principal angles. This linear transformation, termed TRAIT, also preserves some specific features in each class, being complementary to a recently developed Low Rank Transform (LRT). Moreover, when the model mismatch is more significant, TRAIT shows superior performance compared to LRT.
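The quantities driving these error expressions are easy to compute: scipy.linalg.subspace_angles returns the principal angles between the column spans of two matrices, from which the product of sines (low-mismatch regime) and sum of squared sines (high-mismatch regime) follow directly. The random bases here are placeholders for class subspaces.

```python
import numpy as np
from scipy.linalg import subspace_angles

rng = np.random.default_rng(0)
A = rng.standard_normal((50, 3))   # basis for class-1 subspace
B = rng.standard_normal((50, 3))   # basis for class-2 subspace

theta = subspace_angles(A, B)              # principal angles (radians)
prod_sines = np.prod(np.sin(theta))        # governs low-mismatch error
sum_sq_sines = np.sum(np.sin(theta) ** 2)  # governs high-mismatch error
```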
The manifold model enforces a constraint on the freedom of data variation. Learning features that are robust to data variation is very important, especially when the size of the training set is small. A learning machine with a large number of parameters, e.g., a deep neural network, can describe a very complicated data distribution well. However, it is also more likely to be sensitive to small perturbations of the data, and to suffer degraded performance when generalizing to unseen (test) data.
From the perspective of complexity of function classes, such a learning machine has a huge capacity (complexity), which tends to overfit. The manifold model provides us with a way of regularizing the learning machine so as to reduce the generalization error and therefore mitigate overfitting. Two approaches to preventing overfitting are proposed, one from the perspective of data variation, the other from capacity/complexity control. In the first approach, the learning machine is encouraged to make decisions that vary smoothly for data points in local neighborhoods on the manifold. In the second approach, a graph adjacency matrix is derived for the manifold, and the learned features are encouraged to be aligned with the principal components of this adjacency matrix. Experimental results on benchmark datasets are demonstrated, showing an obvious advantage of the proposed approaches when the training set is small.
Stochastic optimization makes it possible to track a slowly varying subspace underlying streaming data. By approximating local neighborhoods using affine subspaces, a slowly varying manifold can be efficiently tracked as well, even with corrupted and noisy data. The more local neighborhoods used, the better the approximation, but the higher the computational complexity. A multiscale approximation scheme is proposed, where the local approximating subspaces are organized in a tree structure. Splitting and merging of the tree nodes then allows efficient control of the number of neighborhoods. Deviation (of each datum) from the learned model is estimated, yielding a series of statistics for anomaly detection. This framework extends the classical changepoint detection technique, which only works for one-dimensional signals. Simulations and experiments highlight the robustness and efficacy of the proposed approach in detecting an abrupt change in an otherwise slowly varying low-dimensional manifold.
Item Open Access MULTITAPER WAVE-SHAPE F-TEST FOR DETECTING NON-SINUSOIDAL OSCILLATIONS (2023-04-25) Liu, Yijia
Many practical periodic signals are not sinusoidal and are contaminated by complicated noise. The traditional spectral approach is limited in this case due to the energy spreading caused by the non-sinusoidal oscillation. We systematically study the multitaper spectral estimate and generalize Thomson's F-statistic under the setup of physically dependent random processes to analyze periodic signals of this kind. The developed statistic is applied to estimate walking activity from actinogram signals.
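For reference, the classical statistic being generalized has a compact form. The sketch below computes Thomson's harmonic F-test at each Fourier frequency with SciPy's Slepian tapers; it is the standard test, not the paper's extension to physically dependent noise.

```python
import numpy as np
from scipy.signal.windows import dpss

def thomson_f_test(x, nw=4, k=7):
    """Thomson's harmonic F-statistic at each rFFT frequency."""
    n = len(x)
    tapers = dpss(n, nw, k)              # (k, n) Slepian tapers
    y = np.fft.rfft(tapers * x, axis=1)  # eigencoefficients
    u0 = tapers.sum(axis=1)              # taper means (DC values)
    mu = (u0 @ y) / (u0 @ u0)            # complex line-amplitude estimate
    resid = np.abs(y - np.outer(u0, mu)) ** 2
    return (k - 1) * (u0 @ u0) * np.abs(mu) ** 2 / resid.sum(axis=0)
```

Large values of the statistic at a frequency indicate a periodic line component standing above the locally smooth background spectrum.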
Item Open Access Noisefield Estimation, Array Calibration and Buried Threat Detection in the Presence of Diffuse Noise (2019) Bjornstad, Joel Nils
One issue associated with all aspects of the signal processing and decision-making fields is that signals of interest are corrupted by noise. This work specifically considers scenarios where the primary noise source is external to an array of receivers and is diffuse. Spatially diffuse noise is considered in three scenarios: noisefield estimation, array calibration using diffuse noise as a source of opportunity, and detection of buried threats using Ground Penetrating Radar (GPR).
Modeling the ocean acoustic noise field is impractical, as the noise seen by a receiver depends on the positions of distant shipping (a major contributing source of low-frequency noise) as well as the temperature, pressure, salinity, and bathymetry of the ocean. Measuring the noise field with a standard towed array is also impractical, due to the inability of a line array to distinguish signals arriving at different elevations and to the well-known left/right ambiguity. A method is developed to estimate the noisefield by fusing data from a traditional towed array and two small-aperture planar arrays. The resulting noise field estimates can be used to produce synthetic covariance matrices that perform on par with measured covariance matrices when used in a Matched Subspace Detector.
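The synthetic-covariance construction has a simple narrowband form: with estimated powers p_k on a grid of arrival angles θ_k, the covariance is assembled from the corresponding steering vectors. A minimal sketch for a uniform line array (the small diagonal load is added only for conditioning; all parameters illustrative):

```python
import numpy as np

def synthetic_covariance(powers, angles, n_sensors, spacing=0.5):
    """R = sum_k p_k a(theta_k) a(theta_k)^H for a uniform line array.

    spacing in wavelengths; angles are conical arrival angles, radians.
    """
    n = np.arange(n_sensors)[:, None]
    A = np.exp(2j * np.pi * spacing * n * np.cos(angles)[None, :])
    R = (A * powers) @ A.conj().T
    return R + 1e-3 * np.trace(R).real / n_sensors * np.eye(n_sensors)
```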
For a phased array to function effectively, the positions of the array elements must be well calibrated. Previous efforts in the literature have primarily focused on the use of discrete sources for calibration. The approach taken here instead uses spatially oversampled, overlapping sub-arrays: the geometry of each individual sub-array is determined using Maximum Likelihood estimates of the inter-element distances followed by Multidimensional Scaling, and the overlapping sub-arrays are then combined into a single array. The algorithm developed in this work performs well in simulation. Limitations in the experimental setup preclude drawing firm conclusions from an in-air test of the algorithm.
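Classical multidimensional scaling, the geometry step mentioned above, reduces to double-centering the squared inter-element distances and taking the top eigenpairs; a compact sketch (recovering coordinates only up to rotation and translation):

```python
import numpy as np

def classical_mds(D, dim=2):
    """Sensor coordinates from a pairwise distance matrix D."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n       # centering matrix
    B = -0.5 * J @ (D ** 2) @ J               # Gram matrix
    w, V = np.linalg.eigh(B)
    idx = np.argsort(w)[::-1][:dim]           # largest eigenvalues
    return V[:, idx] * np.sqrt(np.maximum(w[idx], 0.0))
```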
Ground penetrating radar (GPR) is one of the most successful methods to detect landmines and other buried threats. GPR images, however, are very noisy as the propagation path through soil is quite complex, and classifying GPR images as threats or non-threats is a challenging problem. Successful buried threat classification algorithms rely on a handcrafted feature descriptor paired with a machine learning classifier. In this work the state-of-the-art Spatial Edge Descriptor (SED) feature was implemented as a neural network. This implementation allows the feature and the classifier to be trained simultaneously and expanded with minimal intervention from a designer. Impediments to training this novel network were identified and a modified network proposed that surpasses the performance of the baseline SED algorithm.
These cases demonstrate the practicality of mitigating or using diffuse background noise to achieve desired engineering results.
Item Open Access Nonparametric Bayesian Context Learning for Buried Threat Detection (2012) Ratto, Christopher Ralph
This dissertation addresses the problem of detecting buried explosive threats (i.e., landmines and improvised explosive devices) with ground-penetrating radar (GPR) and hyperspectral imaging (HSI) across widely-varying environmental conditions. Automated detection of buried objects with GPR and HSI is particularly difficult due to the sensitivity of sensor phenomenology to variations in local environmental conditions. Past approaches have attempted to mitigate the effects of ambient factors by designing statistical detection and classification algorithms to be invariant to such conditions. These methods have generally taken the approach of extracting features that exploit the physics of a particular sensor to provide a low-dimensional representation of the raw data for characterizing targets from non-targets. A statistical classification rule is then usually applied to the features. However, it may be difficult for feature extraction techniques to adapt to the highly nonlinear effects of near-surface environmental conditions on sensor phenomenology, as well as to re-train the classifier for use under new conditions. Furthermore, the search for an invariant set of features ignores the possibility that one approach may yield the best performance under one set of terrain conditions (e.g., dry), while another might be better for another set of conditions (e.g., wet).
An alternative approach to improving detection performance is to consider exploiting differences in sensor behavior across environments rather than mitigating them, and treat changes in the background data as a possible source of supplemental information for the task of classifying targets and non-targets. This approach is referred to as context-dependent learning.
Although past researchers have proposed context-based approaches to detection and decision fusion, the definition of context used in this work differs from those used in the past. In this work, context is motivated by the physical state of the world from which an observation is made, and not from properties of the observation itself. The proposed context-dependent learning technique therefore utilized additional features that characterize soil properties from the sensor background, and a variety of nonparametric models were proposed for clustering these features into individual contexts. The number of contexts was assumed to be unknown a priori, and was learned via Bayesian inference using Dirichlet process priors.
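The role of the Dirichlet process prior is to let the number of contexts grow with the data. The Chinese restaurant process view makes this concrete; the prior-only sketch below draws a partition whose cluster count is not fixed in advance (the thesis performs full posterior inference, not shown).

```python
import numpy as np

def crp_partition(n, alpha, seed=0):
    """Draw a partition of n items from a CRP(alpha) prior."""
    rng = np.random.default_rng(seed)
    labels = [0]
    for i in range(1, n):
        counts = np.bincount(labels)
        probs = np.append(counts, alpha) / (i + alpha)  # old tables + new
        labels.append(int(rng.choice(len(probs), p=probs)))
    return np.array(labels)
```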
The learned contextual information was then exploited by an ensemble of classifiers trained for classifying targets in each of the learned contexts. For GPR applications, the classifiers were trained to perform algorithm fusion; for HSI applications, the classifiers were trained to perform band selection. The detection performance of all proposed methods was evaluated on data from U.S. government test sites and compared to several algorithms from the recent literature, several of which have been deployed in fielded systems. Experimental results illustrate the potential for context-dependent learning to improve detection performance of GPR and HSI across varying environments.
Item Open Access On Locating Unstable Equilibria and Probing Potential Energy Fields in Nonlinear Systems Using Experimental Data (2020) Xu, Yawen
This study focuses on a series of data-driven methods to study nonlinear dynamic systems. First, a new method to estimate the location of unstable equilibria, specifically saddle points, based on transient trajectories from experiments is proposed. We describe a system in which saddle points (not easily observed in a direct sense) influence the behavior of trajectories that pass close by them. This influence is used to construct a model and thus identify a more accurate estimate of the location using a number of refinements associated with linearization and regression. The method is verified on a rolling-ball model; both simulations and experiments were conducted. The experiments consist of a small ball rolling on a relatively shallow curved surface under the influence of gravity: a potential energy surface in two dimensions. Tracking the motion of the ball with a digital camera provides data that compares closely with the output of numerical simulation. The experimental results suggest that this method can effectively locate the saddle equilibria in a system, and the robustness of the approach is assessed relative to the effect of noise, size of the local neighborhood, etc., in addition to providing information on the local dynamics. Given the relative simplicity of the experimental system used and a priori knowledge of the saddle points, it is a useful testing environment for system identification in a nonlinear context. Furthermore, a post-buckled beam model is used to test this method, since continuous elastic structures are more common in real-world applications. The experimental results successfully capture both the stable and unstable configurations. However, the natural frequency provided by this regression method underestimates the natural frequency of the second mode. This is the result of the low sampling rate in the experiment, which leads to inaccurate estimation of velocity and acceleration by numerical differentiation. Simulation results from the finite element method with a higher sampling rate do not have this issue.
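The regression idea can be illustrated in its simplest form: near an equilibrium the dynamics are approximately linear, so fitting acceleration against position and velocity along trajectory samples yields coefficients whose zero crossing locates the fixed point. The one-dimensional toy below assumes position, velocity, and acceleration samples are available (in practice from numerical differentiation).

```python
import numpy as np

def estimate_equilibrium(x, v, a):
    """Fit a ~ c0 + c1*x + c2*v and return x* = -c0/c1 (a = 0 at rest)."""
    X = np.column_stack([np.ones_like(x), x, v])
    c, *_ = np.linalg.lstsq(X, a, rcond=None)
    return -c[0] / c[1]

# Toy saddle dynamics linearized about x* = 1.5: a = 4*(x - 1.5) - 0.1*v.
rng = np.random.default_rng(1)
x = rng.uniform(1.2, 1.8, 300)
v = rng.normal(0.0, 0.2, 300)
a = 4 * (x - 1.5) - 0.1 * v + rng.normal(0.0, 0.05, 300)
print(estimate_equilibrium(x, v, a))   # close to 1.5
```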
Then, a method to identify potential energy through probing a force field is presented. A small ball resting on a curve in a gravitational field offers a simple and compelling example of potential energy. The force required to move the ball, or to maintain it in a given position on a slope, is the negative of the vector gradient of the potential field: the steeper the curve, the greater the force required to push the ball up the hill (or keep it from rolling down). We thus observe the turning points (horizontal tangency) of the potential energy shape as positions of equilibrium (in which case the 'restoring force' drops to zero). We appeal directly to this type of system using both one- and two-dimensional shapes: curves and surfaces. The shapes are produced to a desired mathematical form, generally using additive manufacturing, and we use a combination of load cells to measure the forces acting on a small steel ball-bearing subject to gravity. The measured forces, as a function of location, are then subject to integration to recover the potential energy function. The utility of this approach, in addition to pedagogical clarity, concerns extension and applications to more complex systems in which the potential energy would not typically be known a priori, for example, in nonlinear structural mechanics in which the potential energy changes under the influence of a control parameter, but there is the possibility of force probing the configuration space. A brief example of applying this approach to a 1-D simple elastic structure is also presented. For multi-dimensional continuous elastic systems, it would be hard to derive the whole potential energy field. However, it is possible to learn the potential energy difference between different equilibria. This information could help us assess the global stability of the stable equilibria, i.e., how much energy is required to escape from them.
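In one dimension the recovery step is a single quadrature, V(x) = -∫ F(x) dx; a sketch with a toy double-well force (measured force samples would replace it):

```python
import numpy as np
from scipy.integrate import cumulative_trapezoid

x = np.linspace(-1.0, 1.0, 201)              # probe positions
F = -(x**3 - x)                              # toy measured force
V = -cumulative_trapezoid(F, x, initial=0)   # potential, up to a constant

# The recovered V has turning points at x = -1, 0, 1, the equilibria
# where the measured force drops to zero.
```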
Finally, a case study using the two above-mentioned methods on short square box columns is presented. This case study relies on simulation by the finite element method. The buckling of a short square box column is dominated by the local buckling of the panel on each side of the column. Hence, the buckling of short box columns shares strong similarities with the buckling of a rectangular panel under uni-axial load. The primary, secondary, and tertiary bifurcations of a series of square box columns with different height-to-width ratios are presented. We then focus on the column with height-to-width ratio 1.4142, in which the primary and secondary bifurcations happen almost simultaneously, so that the differences in energy level between the stable equilibria are important. The simulation results show that after the secondary bifurcation, the energy-well depths for these stable equilibria are initially similar. With further increase of the buckling load, the energy well for the second mode deepens and the second mode becomes the more stable configuration. We also study the dynamic snap-through of the post-buckled column. The regression method is used to estimate the equilibrium configurations and the natural frequencies with great accuracy. We notice an interesting phenomenon: there can be an energy exchange between different sides of the box column, and hence the real parts of the eigenvalues of the Jacobian matrix are positive if we take only the shape of one surface into account, whereas if we include two adjacent surfaces in the regression, the real parts become negative.
Item Open Access Sensor Array Processing with Manifold Uncertainty (2013) Odom, Jonathan Lawrence
The spatial spectrum, also known as a field directionality map, is a description of the spatial distribution of energy in a wavefield. By sampling the wavefield at discrete locations in space, an estimate of the spatial spectrum can be derived using basic wave propagation models. The observable data space corresponding to physically realizable source locations for a given array configuration is referred to as the array manifold. In this thesis, array manifold ambiguities for linear arrays of omni-directional sensors in non-dispersive fields are considered.
First, the problem of an underwater hydrophone array towed behind a maneuvering platform is considered. The array consists of many hydrophones mounted to a flexible cable that is pulled behind a ship. The towed cable will bend or distort as the ship performs maneuvers. The motion of the cable through a turn can be used to resolve ambiguities that are inherent to nominally linear arrays. The first significant contribution is a method to estimate the spatial spectrum using a time-varying array shape in a dynamic field and broadband temporal data. Knowledge of the temporal spectral shape is shown to enhance detection performance. The field is approximated as a sum of uncorrelated planewaves located at uniform locations in angle, forming a gridded map on which a maximum likelihood estimate for broadband source power is derived. Uniform linear arrays also suffer from spatial aliasing when the inter-element spacing exceeds a half-wavelength. Broadband temporal knowledge is shown to significantly reduce aliasing and thus, in simulation, enhance target detection in interference-dominated environments.
As an extension, the problem of towed array shape estimation is considered when the number and location of sources are unknown. A maximum likelihood estimate of the array shape using the field directionality map is derived. An acoustic-based array shape estimate that exploits the full 360° field via field directionality mapping is the second significant contribution. Towed hydrophone arrays have heading sensors in order to estimate array shape, but these sensors can malfunction during sharp turns. An array shape model is described that allows the heading sensor data to be statistically fused with the acoustic data. The third significant contribution is a method to exploit dynamical motion models for sharp turns, yielding a robust array shape estimate that combines acoustic and heading data. The proposed array shape model works well for both acoustic and heading data and is valid for arbitrary continuous array shapes.
Finally, the problem of array manifold ambiguities for static under-sampled linear arrays is considered. Under-sampled arrays are non-uniformly sampled with average spacing greater than a half-wavelength. While spatial aliasing only occurs in uniformly sampled arrays with spacing greater than a half-wavelength, under-sampled arrays have increased spatial resolution at the cost of high sidelobes compared to half-wavelength sampled arrays with the same number of sensors. Additionally, non-uniformly sampled arrays suffer from rank-deficient array manifolds that cause traditional subspace-based techniques to fail. A class of fully augmentable arrays, minimally redundant linear arrays, is considered, for which the received data statistics of a uniformly spaced array of the same length can be reconstructed in wide-sense stationary fields at the cost of increased variance. The fourth significant contribution is a reduced-rank processing method for fully augmentable arrays to reduce the variance from augmentation with limited snapshots. Array gain for reduced-rank adaptive processing with diagonal loading in snapshot-deficient scenarios is analytically derived using asymptotic results from random matrix theory for a set ratio of sensors to snapshots. Additionally, the problem of near-field sources is considered and a method to reduce the variance from augmentation is proposed. In simulation, these methods result in significant average and median array gains with limited snapshots.
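The augmentation step can be sketched directly: for a fully augmentable array, every spatial lag of the equivalent uniform array appears as some sensor-pair difference, so under wide-sense stationarity the uniform array's Toeplitz covariance can be rebuilt by averaging sample-covariance entries over pairs sharing each lag.

```python
import numpy as np

def augment_covariance(X, positions):
    """Covariance of a length-L uniform array from a sparse array.

    X: (n_sensors, n_snapshots) complex data; positions are sensor
    locations in half-wavelength units, e.g. (0, 1, 4, 6), a minimally
    redundant array covering every lag 0..6 of a 7-element uniform array.
    """
    R = X @ X.conj().T / X.shape[1]            # sample covariance
    L = max(positions) + 1
    r = np.zeros(L, complex)
    for lag in range(L):
        pairs = [(i, j) for i in range(len(positions))
                 for j in range(len(positions))
                 if positions[i] - positions[j] == lag]
        r[lag] = np.mean([R[i, j] for i, j in pairs])
    m, n = np.indices((L, L))
    return np.where(m >= n, r[np.abs(m - n)], np.conj(r[np.abs(m - n)]))
```

The averaging over redundant pairs is what introduces the extra variance that the reduced-rank processing above is designed to suppress.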
Item Open Access Signal Improvement and Contrast Enhancement in Magnetic Resonance Imaging (2015) Han, Yi
This thesis reports advances in magnetic resonance imaging (MRI), with the ultimate goal of improving signal and contrast in biomedical applications. More specifically, novel MRI pulse sequences have been designed to characterize microstructure, enhance signal and contrast in tissue, and image functional processes. In this thesis, rat brain and red bone marrow images are acquired using iMQCs (intermolecular multiple quantum coherences) between spins that are 10 μm to 500 μm apart. As an important application, iMQC images in different directions can be used for anisotropy mapping, and we investigate tissue microstructure by analyzing the anisotropy maps. At the same time, we simulated the images expected from rat brain without microstructure and compared them with experimental results to show that the dipolar field from the overall shape makes only small contributions to the experimental iMQC signal. Besides the magnitude of iMQCs, their phase should be studied as well. The phase anisotropy maps built by our method can clearly show susceptibility information in kidneys, which may provide meaningful diagnostic information. To study susceptibility more deeply, the modified-CRAZED sequence is developed; combining the phase data of modified-CRAZED images with that of iMQC images is a promising route to constructing microstructure maps. Obviously, the phase image in all the above techniques needs to be clear and of high contrast. To achieve this goal, algorithmic tools from Susceptibility-Weighted Imaging (SWI) and Susceptibility Tensor Imaging (STI) prove especially useful in our system.
Item Open Access Speaker Diarization with Deep Learning: Refinement, Online Extension and ASR Integration (2023) Wang, Weiqing
As speech remains an essential mode of human communication, the necessity for advanced technologies in speaker diarization has risen significantly. Speaker diarization is the process of accurately annotating individual speakers within an audio segment, and this dissertation explores this domain, systematically addressing three prevailing challenges through intertwined strands of investigation.
Initially, we focus on the intricacies of overlapping speech and refine conventional diarization systems by integrating sequential information. Our approach not only recognizes these overlapping segments but also discerns the distinct speaker identities contained within, ensuring that each speaker is precisely categorized.
Transitioning from the challenge of overlapping speech, we then address the pressing need for real-time speaker diarization. In response to the growing need for low-latency applications in various fields, such as smart agents and transcription services, our research adapts traditional systems, enhancing them to function seamlessly in real-time applications without sacrificing accuracy or efficiency.
Lastly, we turn our attention to the vast reservoir of potential that lies within contextual and textual data. Incorporating both audio and text data into speaker diarization not only augments the system's ability to distinguish speakers but also leverages the rich contextual cues often embedded in conversations, further improving the overall diarization performance.
Through a coherent and systematic exploration of these three pivotal areas, the dissertation offers substantial contributions to the field of speaker diarization. The research navigates through the challenges of overlapping speech, real-time application demands, and the integration of contextual data, ultimately presenting a refined, reliable, and efficient speaker diarization system poised for application in diverse and dynamic communication environments.
Item Open Access Two New Methods to Improve Adaptive Time-Frequency Localization (2021) Chen, Ziyu
This dissertation introduces algorithms that analyze oscillatory signals adaptively. It consists of three chapters. The first chapter reviews the adaptive time-frequency analysis of 1-dimensional signals. It introduces models that capture the time-varying behavior of oscillatory signals, then explains two state-of-the-art algorithms, the SynchroSqueezed Transform (SST) and the Concentration of Frequency and Time (ConceFT), that extract the instantaneous information of signals; the chapter ends with a discussion of some shortcomings of SST and ConceFT, which are remedied by the new methods introduced in the remainder of the thesis. The second chapter introduces the Ramanujan DeShape algorithm (RDS), which incorporates the periodicity transform to adaptively extract the fundamental frequency of a non-harmonic signal. The third chapter proposes an algorithm that rotates the time-frequency content of an oscillatory signal to obtain a time-frequency representation with fewer artifacts. Numerical results illustrate the theoretical analysis.
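A drastically simplified sketch of the periodicity-transform ingredient of RDS: project the signal onto Ramanujan sums c_q and score the energy at each candidate period q. The full algorithm combines such projections with the deshape short-time transform; this fragment only illustrates the period scoring.

```python
import numpy as np
from math import gcd

def ramanujan_sum(q, N):
    """c_q(n), n = 0..N-1: sum of cos(2*pi*a*n/q) over a coprime to q."""
    n = np.arange(N)
    return sum(np.cos(2 * np.pi * a * n / q)
               for a in range(1, q + 1) if gcd(a, q) == 1)

def period_scores(x, qmax=30):
    """Normalized energy of x along each c_q, q = 1..qmax."""
    scores = []
    for q in range(1, qmax + 1):
        c = ramanujan_sum(q, len(x))
        scores.append(float(np.dot(c, x) ** 2 / np.dot(c, c)))
    return scores
```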