Browsing by Subject "Information theory"
Item Open Access: Adaptive Brain-Computer Interface Systems for Communication in People with Severe Neuromuscular Disabilities (2016). Mainsah, Boyla O.

Brain-computer interfaces (BCIs) have the potential to restore communication or control abilities in individuals with severe neuromuscular limitations, such as those with amyotrophic lateral sclerosis (ALS). The role of a BCI is to extract and decode relevant information that conveys a user's intent directly from brain electrophysiological signals and to translate this information into executable commands for controlling external devices. However, the BCI decision-making process is error-prone due to noisy electrophysiological data, representing the classic problem of efficiently transmitting and receiving information over a noisy communication channel.
This research focuses on P300-based BCIs, which rely predominantly on event-related potentials (ERPs) elicited as a function of a user's uncertainty regarding stimulus events in either an acoustic or a visual oddball recognition task. A P300-based BCI enables users to communicate messages from a set of choices by selecting a target character or icon that conveys a desired intent or action. P300-based BCIs have been widely researched as a communication alternative, especially for individuals with ALS, who represent a target BCI user population. For the P300-based BCI, repeated data measurements are required to enhance the low signal-to-noise ratio of the elicited ERPs embedded in electroencephalography (EEG) data and thereby improve the accuracy of target character estimation. As a result, BCIs have relatively slow communication rates compared with other commercial assistive communication devices, which limits BCI adoption by the target user population. The goal of this research is to develop algorithms that take into account the physical limitations of the target BCI population to improve the efficiency of ERP-based spellers for real-world communication.
In this work, it is hypothesised that building adaptive capabilities into the BCI framework gives the system the flexibility to improve performance by adjusting its parameters in response to changing user inputs. The research addresses three potential areas for improvement within the P300 speller framework: information optimisation, target character estimation, and error correction. The visual interface and its operation control the method by which ERPs are elicited through the presentation of stimulus events, and the parameters of the stimulus presentation paradigm can be modified to modulate and enhance the elicited ERPs. A new stimulus presentation paradigm is developed that maximises the information content presented to the user by tuning paradigm parameters that positively affect performance. Internally, the BCI system determines the amount of data to collect and the method by which these data are processed to estimate the user's target character. Algorithms that exploit language information are developed to enhance the target character estimation process and to correct erroneous BCI selections. In addition, a new model-based method to predict BCI performance is developed; this approach is independent of the stimulus presentation paradigm and accounts for dynamic data collection. The studies presented in this work provide evidence that the proposed methods for incorporating adaptive strategies in these three areas have the potential to significantly improve BCI communication rates, and that the proposed method for predicting BCI performance provides a reliable means to pre-assess performance without extensive online testing.
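To make the dynamic data collection idea concrete, here is a minimal sketch of Bayesian evidence accumulation with a stopping threshold, of the kind a P300 speller can use to decide how much data to collect. The classifier score distributions, the flash-group size, and the threshold are all illustrative assumptions, not the dissertation's actual parameters.

```python
import numpy as np

# Hypothetical illustration: Bayesian evidence accumulation with dynamic
# stopping for a P300 speller. Classifier scores for flashes containing the
# target vs. non-target are modeled as Gaussians with assumed parameters.

CHARS = [chr(c) for c in range(65, 91)]          # 26 candidate characters
MU_T, MU_N, SIGMA = 1.0, 0.0, 1.0                # assumed score distributions
THRESHOLD = 0.95                                 # posterior stopping criterion

def log_gauss(x, mu, sigma):
    return -0.5 * ((x - mu) / sigma) ** 2 - np.log(sigma * np.sqrt(2 * np.pi))

def update_posterior(log_post, flashed, score):
    """One flash: update the log-posterior of every candidate character."""
    for i, ch in enumerate(CHARS):
        mu = MU_T if ch in flashed else MU_N
        log_post[i] += log_gauss(score, mu, SIGMA)
    return log_post - np.logaddexp.reduce(log_post)   # renormalize

rng = np.random.default_rng(0)
target = 'Q'
log_post = np.full(len(CHARS), -np.log(len(CHARS)))   # uniform prior
for n_flashes in range(1, 200):
    flashed = set(rng.choice(CHARS, size=6, replace=False))  # one stimulus event
    score = rng.normal(MU_T if target in flashed else MU_N, SIGMA)
    log_post = update_posterior(log_post, flashed, score)
    if np.exp(log_post.max()) >= THRESHOLD:       # stop once confident enough
        break
print(CHARS[int(np.argmax(log_post))], n_flashes)
```

Under such a scheme, easy trials stop early and difficult trials automatically collect more data, which is the mechanism by which adaptive data collection can improve communication rates.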
Item Open Access: An information-theoretic analysis of spike processing in a neuroprosthetic model (2007-05-03). Won, Deborah S.

Neural prostheses are being developed to provide motor capabilities to patients who suffer from motor-debilitating diseases and conditions. These brain-computer interfaces (BCIs) will be controlled by activity from the brain and will bypass damaged parts of the spinal cord or peripheral nervous system to re-establish volitional control of motor output. Spike sorting is a technologically expensive component of the signal processing chain required to interpret population spike activity acquired in a BCI. No systematic analysis of the need for spike sorting has been carried out, and little is known about the effects of spike sorting error on the ability of a BCI to decode intended motor commands. We developed a theoretical framework and a modelling environment to examine the effects of spike processing on the information available to a BCI decoder. Shannon information theory was applied to simulated neural data. Results demonstrated that reported amounts of spike sorting error significantly reduce mutual information (MI) in single-unit spike trains. These results prompted investigation into how much information is available in a cluster of pooled signals. Indirect information analysis revealed the conditions under which pooled multi-unit signals can maintain the MI available in the corresponding sorted signals, and how the information loss grows with dissimilarity of MI among the pooled responses. To reveal the differences in non-sorted spike activity within the context of a BCI, we simulated responses of four neurons with the commonly observed and exploited cosine-tuning property and with varying levels of sorting error. Tolerances of angular tuning differences and spike sorting error were given for MI loss due to pooling under various conditions, such as cases of inter- and/or intra-electrode differences and combinations of various mean firing rates and tuning depths. These analyses revealed the degree to which mutual information loss due to pooling spike activity depends upon differences in tuning between pooled neurons and the amount of spike error introduced by sorting. The theoretical framework and computational tools presented in this dissertation will allow BCI system designers to make decisions with an understanding of the tradeoffs between a system with and without spike sorting.
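As a concrete illustration of the kind of mutual-information computation such an analysis rests on, here is a minimal sketch that estimates I(S; R) between a discrete stimulus and simulated spike counts with a plug-in estimator. The firing rates, bin limits, and sample sizes are illustrative assumptions; the dissertation's analyses concern sorted versus pooled signals and tuned populations rather than this toy two-stimulus case.

```python
import numpy as np

# Illustrative plug-in estimate of Shannon mutual information I(S; R)
# between a discrete stimulus S and binned spike counts R, of the sort
# used to quantify information loss from spike-sorting error or pooling.

def mutual_information(joint):
    """I(S;R) in bits from an (n_stimuli x n_responses) count table."""
    p = joint / joint.sum()
    ps = p.sum(axis=1, keepdims=True)       # stimulus marginal
    pr = p.sum(axis=0, keepdims=True)       # response marginal
    nz = p > 0
    return float((p[nz] * np.log2(p[nz] / (ps @ pr)[nz])).sum())

rng = np.random.default_rng(1)
rates = [2.0, 8.0]                          # assumed firing rates per stimulus
max_count = 30
joint = np.zeros((len(rates), max_count + 1))
for s, lam in enumerate(rates):
    counts = rng.poisson(lam, size=5000)    # simulated spike counts
    np.add.at(joint[s], np.clip(counts, 0, max_count), 1)
print(f"I(S;R) ~ {mutual_information(joint):.3f} bits")
```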
Item Open Access: An Information-Theoretic Analysis of X-Ray Architectures for Anomaly Detection (2018). Coccarelli, David Scott

X-ray scanning equipment currently establishes a first line of defense in the aviation security space. The efficacy of these scanners is crucial to preventing the harmful use of threatening objects and materials. In this dissertation, I introduce a principled approach to the analysis of these systems by exploring performance limits of system architectures and modalities. Moreover, I validate the use of simulation as a design tool against experimental data, and I extend the use of simulation to create high-fidelity realizations of real-world system measurements.

Conventional performance analysis of detection systems confounds the effects of the system architecture (sources, detectors, system geometry, etc.) with the effects of the detection algorithm. We disentangle the performance of the system hardware from that of the detection algorithm so as to focus on analyzing the performance of the hardware alone. To accomplish this, we introduce an information-theoretic approach based on a metric derived from the Cauchy-Schwarz mutual information, analogous to the channel-capacity concept from communications engineering. We develop and utilize a framework that can produce thousands of system simulations representative of a notional baggage ensemble. These simulations, together with prior knowledge of the virtual baggage, allow us to analyze the system as it relays information pertinent to a detection task.
In this dissertation, I discuss the application of this information-theoretic approach to study variations of X-ray transmission architectures as well as novel screening systems based on X-ray scatter and phase. The results show how effective use of this metric can impact design decisions for X-ray systems. Moreover, I introduce a database of experimentally acquired X-ray data, both as a means to validate the simulation approach and as a resource ripe for further reconstruction and classification investigations. Next, I show the implementation of improvements to the ensemble representation in the information-theoretic material model. Finally, I extend the simulation tool toward high-fidelity representation of real-world deployed systems.
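For readers unfamiliar with this metric family, the following is a toy sketch of a Cauchy-Schwarz mutual-information style quantity: the Cauchy-Schwarz divergence between a discrete joint distribution and the product of its marginals. The dissertation's metric is derived for the X-ray measurement model and baggage ensemble; the joint table below is purely hypothetical.

```python
import numpy as np

# Toy illustration of a Cauchy-Schwarz mutual-information metric:
# the CS divergence between the joint distribution p(x, y) and the
# product of its marginals p(x)p(y), for discrete distributions.

def cs_mutual_information(joint):
    p = joint / joint.sum()
    q = np.outer(p.sum(axis=1), p.sum(axis=0))   # product of marginals
    cross = (p * q).sum()
    return -np.log(cross ** 2 / ((p ** 2).sum() * (q ** 2).sum()))

# Hypothetical 2-class x 3-measurement joint distribution
joint = np.array([[0.30, 0.10, 0.05],
                  [0.05, 0.20, 0.30]])
print(f"I_CS ~ {cs_mutual_information(joint):.3f} nats")
```

Like Shannon MI, this quantity is zero when measurement and class are independent and grows as the measurements become more informative about the class, which is what makes such a metric usable for ranking candidate architectures.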
Item Open Access: Computer Aided Detection of Masses in Breast Tomosynthesis Imaging Using Information Theory Principles (2008-09-18). Singh, Swatee

Breast cancer screening is currently performed by mammography, which is limited by overlying anatomy and dense breast tissue. Computer aided detection (CADe) systems can serve as a double reader to improve radiologist performance. Tomosynthesis is a limited-angle cone-beam x-ray imaging modality that is currently being investigated to overcome mammography's limitations. CADe systems will play a crucial role in enhancing workflow and performance for breast tomosynthesis.
The purpose of this work was to develop unique CADe algorithms for breast tomosynthesis reconstructed volumes. Unlike traditional CADe algorithms, which rely on segmentation followed by feature extraction, selection, and merging, this dissertation instead adopts information theory principles, which are more robust. Information theory relies entirely on the statistical properties of an image and makes no assumptions about underlying distributions; it is thus advantageous for smaller datasets such as those currently used for all tomosynthesis CADe studies.
The proposed algorithm has two stages: (1) initial candidate generation of suspicious locations and (2) false positive reduction. Images were accrued from 250 human subjects. In the first stage, suspicious locations were isolated in the 25 projection images per subject acquired by the tomosynthesis system. Only these suspicious locations were reconstructed to yield 3D volumes of interest (VOIs). In the second stage, false positive reduction was done in three ways: (1) using only the central slice of the VOI containing the largest cross-section of the mass, (2) using the entire volume, and (3) making decisions on a per-slice basis and then combining those decisions using either a linear discriminant or decision fusion. A 92% sensitivity was achieved by all three approaches, with 4.4 false positives per volume for the first approach, 3.9 for the second, and 2.5 for the slice-by-slice algorithm using decision fusion.
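As an illustration of one common way to combine per-slice decisions, here is a minimal likelihood-ratio fusion sketch. It assumes each slice produces an independent binary decision with known sensitivity and specificity; those numbers, and the fusion rule itself, are illustrative assumptions rather than the dissertation's exact decision-fusion algorithm.

```python
import numpy as np

# Illustrative likelihood-ratio fusion of per-slice binary decisions.
# Each slice classifier has an assumed sensitivity and specificity;
# the fused statistic is the log-likelihood ratio of the decision vector.

def fuse(decisions, sens, spec):
    """Log-LR of independent per-slice decisions (1 = 'mass present')."""
    d = np.asarray(decisions, dtype=float)
    # P(d | mass) uses sensitivity; P(d | no mass) uses 1 - specificity
    log_lr = d * np.log(sens / (1 - spec)) + (1 - d) * np.log((1 - sens) / spec)
    return log_lr.sum()

slices = [1, 1, 0, 1, 0]        # hypothetical decisions across VOI slices
score = fuse(slices, sens=0.9, spec=0.7)
print("flag as mass" if score > 0.0 else "dismiss", f"(log-LR = {score:.2f})")
```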
We have therefore developed a novel CADe algorithm for breast tomosynthesis. The technique uses an information theory approach to achieve very high sensitivity for cancer detection while effectively minimizing false positives.
Item Open Access: Fundamental Limits for Community Detection in Labelled Networks (2020). Mayya, Vaishakhi Sathish

The problem of detecting the community structure of networks, as well as closely related problems involving low-rank matrix factorization, arises in applications throughout science and engineering. This dissertation focuses on the fundamental limits of detection and recovery associated with a broad class of probabilistic network models that includes the stochastic block model with labelled edges. The main theoretical results are formulas that describe the asymptotically exact limits of the mutual information and reconstruction error. The formulas are described in terms of low-dimensional estimation problems in additive Gaussian noise.
The analysis builds upon a number of recent theoretical advances at the interface of information theory, random matrix theory, and statistical physics, including concepts such as channel universality and interpolation methods. The theoretical formulas provide insight into the ability to recover the community structure in the network, and the analysis is supported by numerical simulations: for different network models, the observed performance closely follows the performance predicted by the formulas.
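For context, here is a toy sketch of the kind of model being analyzed, a two-community stochastic block model, together with a crude spectral estimate of the communities. Edge labels are omitted for brevity and all parameters are illustrative; the dissertation's formulas characterize the asymptotic limits of this type of recovery problem rather than any particular algorithm.

```python
import numpy as np

# Toy two-community stochastic block model: generate a network, then
# recover communities with a simple spectral method (sign of the leading
# eigenvector of the centered adjacency matrix).

rng = np.random.default_rng(2)
n, p_in, p_out = 400, 0.08, 0.02
labels = np.repeat([1, -1], n // 2)
prob = np.where(np.equal.outer(labels, labels), p_in, p_out)
upper = np.triu(rng.random((n, n)) < prob, k=1)
adj = (upper | upper.T).astype(float)

# Spectral estimate: center the adjacency matrix, take the top eigenvector
centered = adj - adj.mean()
vals, vecs = np.linalg.eigh(centered)
estimate = np.sign(vecs[:, -1])
overlap = abs(estimate @ labels) / n     # agreement up to a global label swap
print(f"overlap = {overlap:.2f}")
```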
Item Open Access: Improving Natural Language Understanding via Contrastive Learning Methods (2021). Cheng, Pengyu

Natural language understanding (NLU) is an essential but challenging task in natural language processing (NLP), aiming to automatically extract and understand semantic information from raw text or voice data. Among previous NLU solutions, representation learning methods have recently become the mainstream, mapping textual data into low-dimensional vector spaces for downstream tasks. With the development of deep neural networks, text representation learning has achieved state-of-the-art performance in many NLP scenarios.
Although text representation learning methods with large-scale network encoders have shown significant empirical gains, many essential properties of the text encoders remain unexplored, which hinders the models' further application in real-world scenarios: (1) the high computational complexity of large-scale deep networks prevents text encoders from being deployed on a broader range of devices, especially those with limited computational resources; (2) the inner mechanics of the networks are opaque, limiting control of the latent representations for downstream tasks; (3) representation learning methods are data-driven, leading to inherent social bias problems when the training data are unbalanced.
To address the problems above in deep text encoders, I propose a series of effective contrastive learning methods, which supervise the encoders by enlarging the difference between positive and negative data sample pairs. In this thesis, I first present a theoretical contrastive learning tool that bridges contrastive learning methods and mutual information in information theory. Then, I apply contrastive learning to several NLU scenarios to improve the text encoders' effectiveness, interpretability, and fairness.
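As a concrete example of the contrastive-learning/mutual-information connection, here is a minimal sketch of InfoNCE, a standard contrastive objective that lower-bounds the mutual information between paired views. It is shown as a generic illustration, not as the specific estimator developed in the thesis.

```python
import numpy as np

# Illustrative InfoNCE objective: a standard contrastive lower bound on
# mutual information. Paired samples (x_i, y_i) are positives; all other
# pairings within the batch serve as negatives.

def info_nce(x, y, temperature=0.1):
    """Contrastive loss over a batch of paired embeddings (batch x dim)."""
    x = x / np.linalg.norm(x, axis=1, keepdims=True)
    y = y / np.linalg.norm(y, axis=1, keepdims=True)
    logits = (x @ y.T) / temperature          # similarity of every pairing
    logits -= logits.max(axis=1, keepdims=True)
    log_softmax = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_softmax))     # positives lie on the diagonal

rng = np.random.default_rng(3)
z = rng.normal(size=(64, 16))                 # shared latent content
x = z + 0.1 * rng.normal(size=z.shape)        # two noisy "views" of z
y = z + 0.1 * rng.normal(size=z.shape)
print(f"InfoNCE loss = {info_nce(x, y):.3f}") # lower for correlated pairs
```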
Item Open Access: Information-driven Sensor Path Planning and the Treasure Hunt Problem (2008-04-25). Cai, Chenghui

This dissertation presents a basic information-driven sensor management problem, referred to as the treasure hunt, that is relevant to mobile-sensor applications such as mine hunting, monitoring, and surveillance. The objective is to classify or infer one or multiple fixed targets, or treasures, distributed in an obstacle-populated workspace by planning the path and measurement sequence of a robotic sensor installed on a mobile platform. The workspace is represented by a connectivity graph, where each node represents a possible sensor deployment and the arcs represent possible sensor movements. A methodology is developed for planning the sensing strategy of the deployed robotic sensor. The sensing strategy includes the robotic sensor's path, because the path determines which targets are measurable given a bounded field of view. Existing path planning techniques are not directly applicable to robots whose primary objective is to gather sensor measurements. Thus, in this dissertation, a novel approximate cell-decomposition approach is developed in which obstacles, targets, and the sensor's platform and field of view are represented as closed and bounded subsets of a Euclidean workspace. The approach constructs a connectivity graph with observation cells that is pruned and transformed into a decision tree, from which an optimal sensing strategy can be computed. It is shown that an additive incremental-entropy function can be used to efficiently compute the expected information value of the measurement sequence over time. The methodology is applied to a robotic landmine classification problem and to the board game of CLUE$^{\circledR}$. In the landmine detection application, the optimal strategy of a robotic ground-penetrating radar is computed based on prior remote measurements and environmental information. Extensive numerical experiments show that this methodology outperforms shortest-path, complete-coverage, random, and grid search strategies, and that it is applicable to non-overpass-capable platforms that must avoid targets as well as obstacles. The board game of CLUE$^{\circledR}$ is shown to be an excellent benchmark example of the treasure hunt problem. The test results show that a player implementing the strategies developed in this dissertation outperforms players implementing Bayesian networks only, Q-learning, or constraint satisfaction, as well as human players.
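To make the information-value computation concrete, here is a minimal sketch of choosing the next observation cell by expected information gain (expected entropy reduction) over a posterior on target classes. The sensor models and cells are hypothetical, and the sketch is myopic (one step ahead), whereas the dissertation optimizes full measurement sequences via a decision tree.

```python
import numpy as np

# Toy expected-information-gain computation for sensor planning: choose
# the observation cell whose (hypothetical) measurement most reduces the
# expected entropy of the posterior over target classes.

def entropy(p):
    p = p[p > 0]
    return -(p * np.log2(p)).sum()

def expected_info_gain(prior, likelihood):
    """likelihood[z, k] = P(measurement z | target class k) for one cell."""
    gain = entropy(prior)
    for z in range(likelihood.shape[0]):
        p_z = likelihood[z] @ prior                  # evidence for outcome z
        if p_z > 0:
            posterior = likelihood[z] * prior / p_z  # Bayes update
            gain -= p_z * entropy(posterior)
    return gain

prior = np.array([0.5, 0.3, 0.2])                    # 3 target classes
cells = {                                            # hypothetical sensor models
    "cell A": np.array([[0.8, 0.2, 0.3], [0.2, 0.8, 0.7]]),
    "cell B": np.array([[0.5, 0.5, 0.4], [0.5, 0.5, 0.6]]),
}
best = max(cells, key=lambda c: expected_info_gain(prior, cells[c]))
print("next measurement:", best)
```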
Item Open Access: Locally Adaptive Protocols for Quantum State Discrimination (2021). Brandsen, Sarah

This dissertation makes contributions to two rapidly developing fields: quantum information theory and machine learning. It has recently been demonstrated that reinforcement learning is an effective tool for a wide variety of tasks in quantum information theory, ranging from quantum error correction to quantum control to preparation of entangled states. In this work, we demonstrate that reinforcement learning is additionally highly effective for the task of multiple quantum hypothesis testing.

Quantum hypothesis testing consists of finding the quantum measurement which allows one to discriminate with minimal error between $m$ possible states $\{\rho_{k}\}_{k=1}^{m}$ of a quantum system with corresponding prior probabilities $p_{k} = \text{Pr}[\rho = \rho_{k}]$. In the general case, although semi-definite programming offers a way to numerically approximate the optimal solution~\cite{Eldar_Semidefinite2}, a closed-form analytical solution for the optimal measurement is not known.
Additionally, when the quantum system is large and consists of many subsystems, the optimal measurement may be experimentally difficult to implement. In this work, we provide a comprehensive study of locally adaptive approaches to quantum hypothesis testing where only a single subsystem is measured at a time and the order and types of measurements implemented may depend on previous measurement results. Thus, these locally adaptive protocols present an experimentally feasible approach to quantum state discrimination.
We begin with the case of binary hypothesis testing ($m=2$) and generalize previous work by Acin et al. (Phys. Rev. A 71, 032338) to show that a simple Bayesian-updating scheme can optimally distinguish between any pair of arbitrary pure, tensor-product quantum states. We then demonstrate that this same Bayesian-updating scheme has poor asymptotic behaviour when the candidate states are not pure, and based on this we introduce a modified scheme with strictly better performance. Finally, a dynamic programming (DP) approach is used to find the optimal local protocol for binary state discrimination, and numerical simulations are run for both qubit and qutrit subsystems.
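Here is a minimal sketch of one locally adaptive round in the spirit of the Bayesian-updating scheme for two pure qubit product states: each subsystem is measured in the local Helstrom basis for the current prior, and the prior is updated with the outcome. The states and the number of subsystems are illustrative assumptions.

```python
import numpy as np

# Toy locally adaptive discrimination of two pure qubit product states:
# each round measures one qubit in the local Helstrom basis for the
# current prior and updates the prior via Bayes' rule.

def ket(theta):
    return np.array([np.cos(theta / 2), np.sin(theta / 2)])

def adaptive_round(p0, psi0, psi1, rng):
    """Measure eigenbasis of p0|psi0><psi0| - p1|psi1><psi1|; psi0 is true."""
    gamma = p0 * np.outer(psi0, psi0) - (1 - p0) * np.outer(psi1, psi1)
    _, vecs = np.linalg.eigh(gamma)
    probs = np.abs(vecs.T @ psi0) ** 2           # Born rule for the true state
    outcome = rng.choice(2, p=probs / probs.sum())
    v = vecs[:, outcome]
    like0, like1 = np.abs(v @ psi0) ** 2, np.abs(v @ psi1) ** 2
    return p0 * like0 / (p0 * like0 + (1 - p0) * like1)   # Bayes update

rng = np.random.default_rng(4)
p0 = 0.5                                         # prior for hypothesis 0
for qubit in range(10):                          # 10-qubit product states
    p0 = adaptive_round(p0, ket(0.0), ket(0.4), rng)      # hypothesis 0 is true
print(f"posterior for true hypothesis: {p0:.4f}")
```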
Building on these binary-testing results, we turn to the more general case of multiple hypothesis testing, where there may be several candidate states. Given that the dynamic-programming approach has high complexity when there are a large number of subsystems, we turn to reinforcement learning methods to learn adaptive protocols for even larger systems. Our numerical results support the claim that reinforcement learning with neural networks (RLNN) is able to successfully find the optimal locally adaptive approach for up to 20 subsystems. We additionally find the optimal collective measurement through semidefinite programming techniques, and demonstrate that the RLNN approach meets or comes close to the optimal collective measurement in every random trial.
Next, we focus on quantum information theory and provide an operational interpretation for the entropy of a channel. This task is motivated by the central role of entropy across several areas of physics and science. We use games of chance as a more systematic and unifying approach to defining entropy, as a system's performance in any game of chance depends solely on the uncertainty of the output. We construct families of games which result in pre-orders on channels (corresponding to majorization, conditional majorization, and channel majorization, respectively), provide an operational interpretation for each pre-order, and show that this defines the unique asymptotically continuous entropy function for classical channels.
Item Open Access: Understanding the Diversity of Retinal Cell Types and Mosaic Organizations through Efficient Coding Theory (2022). Jun, Na Young

Efficient coding theory provides a powerful framework for understanding the organization of the early visual system. Prior research has demonstrated that efficient coding theory can help account for a range of retinal ganglion cell (RGC) organizational features, including center-surround spatial receptive fields and the ON and OFF parallel pathways. Here, we use a machine learning-based computational framework for efficient coding and show that more of the functional architecture of visual processing can be explained on the basis of this principle. First, how should receptive fields (RFs) be arranged to best encode natural images? When the spatial RFs and contrast response functions are optimized to maximally encode natural stimuli given noise and firing-rate constraints, the RFs form a pair of mosaics, one with ON RFs and one with OFF RFs, similar to those of the mammalian retina, replicating a finding from previous research. Interestingly, the relative arrangement of the two mosaics transitions between alignment under high signal-to-noise conditions and anti-alignment under low signal-to-noise conditions. The next question we tackled is: how are the ON and OFF RF mosaics arranged in the mammalian retina? We examined the retinas of rats and primates and confirmed that ON and OFF mosaic pairs encoding the same visual feature are anti-aligned, indicating that the retina is optimized to handle dim or low-contrast stimuli. Finally, we addressed the question: how many cell types can be predicted by efficient coding theory? We examined the encoding of natural videos and found that, as the available channel capacity (the number of simulated RGCs available for encoding) increases, new cell types emerge that focus on higher temporal frequencies and larger spatial areas. Together, these studies advance our understanding of the relationships between efficient coding, retinal organization, and the diversity of retinal cell types.
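As a minimal illustration of the efficient-coding objective underlying these studies, the sketch below evaluates the mutual information of a linear-Gaussian encoder under a response-power constraint, for a 1/f-style stimulus covariance that is a common stand-in for natural images. Which filter arrangement wins depends on the noise level, echoing the signal-to-noise dependence described above; the dimensions and parameters are illustrative assumptions.

```python
import numpy as np

# Toy efficient-coding objective: mutual information (in nats) of a
# linear-Gaussian encoder r = W s + n for stimuli with 1/f-like spectra,
# I = 0.5 * logdet(I + W C W^T / sigma^2), under a response-power budget.

rng = np.random.default_rng(5)
dim, n_units, sigma2, budget = 32, 8, 0.5, 1.0

# 1/f-style stimulus covariance (a crude stand-in for natural images)
freqs = np.arange(1, dim + 1)
C = np.diag(1.0 / freqs ** 2)

def info(W):
    W = W * np.sqrt(budget / np.trace(W @ C @ W.T))  # firing-rate constraint
    M = np.eye(n_units) + (W @ C @ W.T) / sigma2
    return 0.5 * np.linalg.slogdet(M)[1]

# Compare random filters with filters concentrated on low frequencies;
# the ranking shifts with sigma2, i.e. with the signal-to-noise regime.
W_rand = rng.normal(size=(n_units, dim))
W_low = np.eye(n_units, dim)          # one unit per lowest frequency
print(f"random filters:   {info(W_rand):.3f} nats")
print(f"low-freq filters: {info(W_low):.3f} nats")
```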