Browsing by Subject "Anomaly detection"
Item Open Access An Information-Theoretic Analysis of X-Ray Architectures for Anomaly Detection (2018) Coccarelli, David Scott
X-ray scanning equipment currently establishes a first line of defense in the aviation security space. The efficacy of these scanners is crucial to preventing the harmful use of threatening objects and materials. In this dissertation, I introduce a principled approach to the analysis of these systems by exploring performance limits of system architectures and modalities. Moreover, I validate the use of simulation as a design tool with experimental data and extend the use of simulation to create high-fidelity realizations of real-world system measurements.
Conventional performance analysis of detection systems confounds the effects of the system architecture (sources, detectors, system geometry, etc.) with the effects of the detection algorithm. We disentangle the performance of the system hardware and detection algorithm so as to focus on analyzing the performance of just the system hardware. To accomplish this, we introduce an information-theoretic approach to this problem. This approach is based on a metric derived from Cauchy-Schwarz mutual information and is analogous to the channel capacity concept from communications engineering. We develop and utilize a framework that can produce thousands of system simulations representative of a notional baggage ensemble. These simulations and the prior knowledge of the virtual baggage allow us to analyze the system as it relays information pertinent to a detection task.
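As a rough illustration of the underlying quantity, Cauchy-Schwarz mutual information can be computed for a small discrete joint distribution as the Cauchy-Schwarz divergence between the joint and the product of its marginals. The sketch below is not the dissertation's metric, which is only described as "derived from" this quantity; the function name `cs_mutual_information` and the discrete setting are assumptions made for illustration.

```python
import math

def cs_mutual_information(joint):
    """Cauchy-Schwarz mutual information (in nats) of a discrete joint pmf.

    joint[i][j] is P(X = i, Y = j); rows index X, columns index Y.
    Hypothetical helper for illustration only.
    """
    px = [sum(row) for row in joint]          # marginal distribution of X
    py = [sum(col) for col in zip(*joint)]    # marginal distribution of Y
    # Inner product between the joint and the product of marginals.
    cross = sum(joint[i][j] * px[i] * py[j]
                for i in range(len(px)) for j in range(len(py)))
    joint_sq = sum(p * p for row in joint for p in row)
    prod_sq = sum(a * a for a in px) * sum(b * b for b in py)
    # D_CS(p, q) = -log( <p, q>^2 / (<p, p> <q, q>) ), q = product of marginals.
    return -math.log(cross * cross / (joint_sq * prod_sq))
```

For a detection task, X could stand for the threat/no-threat label and Y for a quantized measurement. The value is zero when the two are independent and grows as the measurement becomes more informative about the label.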
In this dissertation, I discuss the application of this information-theoretic approach to study variations of X-ray transmission architectures as well as novel screening systems based on X-ray scatter and phase. The results show how effective use of this metric can impact design decisions for X-ray systems. Moreover, I introduce a database of experimentally acquired X-ray data both as a means to validate the simulation approach and to produce a database ripe for further reconstruction and classification investigations. Next, I show the implementation of improvements to the ensemble representation in the information-theoretic material model. Finally, I extend the simulation tool toward high-fidelity representation of real-world deployed systems.
Item Open Access Anomaly-Detection and Health-Analysis Techniques for Core Router Systems (2018) Jin, Shi
A three-layer hierarchy is typically used in modern telecommunication systems in order to achieve high performance and reliability. The three layers, namely core, distribution, and access, perform different roles for service fulfillment. The core layer is also referred to as the network backbone, and it is responsible for the transfer of a large amount of traffic in a reliable and timely manner. The network devices (such as routers) in the core layer are vulnerable to hard-to-detect/hard-to-recover errors. For example, the cards that constitute core router systems and the components that constitute a card can encounter hardware failures. Moreover, connectors between cards and interconnects between different components inside a card are also subject to hard faults. Also, since the performance requirement of network devices in the core layer is approaching Tbps levels, failures caused by subtle interactions between parallel threads or applications have become more frequent. All these different types of faults can cause a core router to become incapacitated, necessitating the design and implementation of fault-tolerant mechanisms in the core layer.
Proactive fault tolerance is a promising solution because it takes preventive action before a failure occurs. The state of the system is monitored in real time. When anomalies are detected, proactive repair actions such as job migration are executed to avoid errors, thereby maintaining the non-stop utilization of the entire system. The effectiveness of proactive fault-tolerance solutions depends on whether abnormal behaviors of core routers can be accurately pinpointed in a timely manner.
This dissertation first presents an anomaly detector for core router systems using correlation-based time series analysis. The proposed technique monitors a set of features obtained from a system deployed in the field. Various types of correlations among extracted features are identified. Features with minimum redundancy and maximum relevance are then grouped into different categories based on their statistical characteristics. A hybrid approach is developed to analyze various feature categories using a combination of different anomaly detection methods, leading to the detection of realistic anomalies.
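The idea of keeping features with minimum redundancy and maximum relevance can be sketched with a greedy selection loop. This is only an assumed illustration: the abstract does not say how relevance or redundancy is measured, so absolute Pearson correlation is swapped in here (the criterion is more commonly stated with mutual information), and `target`, `select_features`, and the feature names are hypothetical.

```python
import math

def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)

def select_features(features, target, k):
    """Greedily pick up to k feature names.

    features: dict mapping a feature name to its time series (hypothetical layout).
    target:   reference series the features should be relevant to (an assumption).
    Score = |corr(feature, target)| - mean |corr(feature, already selected)|.
    """
    relevance = {name: abs(pearson(s, target)) for name, s in features.items()}
    selected = []

    def score(name):
        if not selected:
            return relevance[name]
        redundancy = sum(abs(pearson(features[name], features[s]))
                         for s in selected) / len(selected)
        return relevance[name] - redundancy

    while len(selected) < min(k, len(features)):
        candidates = [name for name in features if name not in selected]
        selected.append(max(candidates, key=score))
    return selected
```

With two perfectly correlated counters and one weaker but independent signal, the second pick is the independent signal rather than the duplicate counter, which is the behavior the redundancy penalty is meant to produce.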
Next, this dissertation presents the design of a changepoint-based anomaly detector such that anomaly detection can be adaptive to changes in the statistical features of data streams. The proposed method first detects changepoints from collected time-series data, and then utilizes these changepoints to detect anomalies. A clustering method is developed to identify a wide range of the normal/abnormal patterns from changepoint windows. Experimental results show that the changepoint-based anomaly detector can detect outliers even when the statistical properties of the monitored data change significantly with time.
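The two-stage idea (find changepoints first, then look for anomalies relative to each segment) can be sketched as follows. This is a simplified stand-in, not the dissertation's detector: changepoints are found by comparing medians of adjacent sliding windows, and the clustering of changepoint windows is replaced by a per-segment robust z-score. The function names, `window`, `threshold`, and `k` are all assumed for illustration.

```python
import statistics

def find_changepoints(series, window=5, threshold=3.0):
    """Indices where the median level shifts by more than `threshold`.

    Medians are used so that a single spike does not register as a level shift.
    """
    gaps = {}
    for i in range(window, len(series) - window + 1):
        left = statistics.median(series[i - window:i])
        right = statistics.median(series[i:i + window])
        gap = abs(right - left)
        if gap > threshold:
            gaps[i] = gap
    # Consecutive candidate indices describe one shift; keep the strongest of each run.
    changepoints, run = [], []
    for i in sorted(gaps):
        if run and i != run[-1] + 1:
            changepoints.append(max(run, key=gaps.get))
            run = []
        run.append(i)
    if run:
        changepoints.append(max(run, key=gaps.get))
    return changepoints

def segment_anomalies(series, changepoints, k=3.5):
    """Flag points far from their own segment's median, in units of scaled MAD."""
    bounds = [0, *changepoints, len(series)]
    anomalies = []
    for start, end in zip(bounds, bounds[1:]):
        segment = series[start:end]
        med = statistics.median(segment)
        mad = statistics.median([abs(x - med) for x in segment])
        scale = 1.4826 * mad  # makes MAD comparable to a standard deviation
        for offset, x in enumerate(segment):
            # If scale is 0, any point that differs from the median is flagged.
            if abs(x - med) > k * scale:
                anomalies.append(start + offset)
    return anomalies
```

Because each point is judged against its own segment, a value that is normal after a level shift is not reported merely because it differs from the earlier regime.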
An efficient data-driven anomaly detector is not adequate to obtain a full picture of the health status of monitored core routers. It is also essential to learn how healthy a core router system is and how different task scenarios can affect the system. Therefore, this dissertation presents a symbol-based health status analyzer that first encodes, as a symbol sequence, the long-term complex time series collected from a number of core routers, and then utilizes the symbol sequence for health analysis. Symbol-based clustering and classification methods are developed to identify the health status.
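Encoding a time series as a symbol sequence can be illustrated with a SAX-style scheme: normalize the series, average it over fixed-length segments, and map each average to a letter. The abstract does not specify the encoding actually used, so this is an assumed example; `to_symbols`, the four-letter alphabet, and the breakpoints (approximate quartiles of a standard normal distribution) are illustrative choices.

```python
import statistics

BREAKPOINTS = (-0.67, 0.0, 0.67)  # approximate standard-normal quartiles
ALPHABET = "abcd"                 # one letter per interval between breakpoints

def to_symbols(series, segment_len):
    """Encode a numeric series as a string, one letter per full segment.

    A trailing partial segment is dropped.
    """
    mean = statistics.fmean(series)
    std = statistics.pstdev(series)
    if std == 0:
        normalized = [0.0] * len(series)
    else:
        normalized = [(x - mean) / std for x in series]
    symbols = []
    for start in range(0, len(normalized) - segment_len + 1, segment_len):
        avg = statistics.fmean(normalized[start:start + segment_len])
        index = sum(avg > b for b in BREAKPOINTS)  # number of breakpoints below avg
        symbols.append(ALPHABET[index])
    return "".join(symbols)
```

For example, `to_symbols([0, 0, 0, 0, 10, 10, 10, 10, 5, 5, 5, 5], 4)` yields `"adb"`: a low segment, a high segment, and one at the overall mean. Strings like this can then be compared or clustered with ordinary sequence methods.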
In order to accurately identify the health status, historical operation data needs to be fully labeled, which is a challenge in the early stages of monitoring. Therefore, this dissertation presents an iterative self-learning procedure for assessing the health status. This procedure first computes a representative feature matrix to capture different characteristics of time-series data. Hierarchical clustering is then utilized to infer labels for the unlabeled dataset. Finally, a classifier is built and iteratively updated using both labeled and unlabeled datasets. Partially-labeled field data collected from a set of commercial core routers are used to experimentally validate the proposed method.
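The procedure described above (cluster to infer labels, then iteratively refine a classifier on labeled and unlabeled data) can be sketched in miniature. The details here are assumptions: single-linkage agglomerative clustering stands in for the hierarchical clustering step, a nearest-centroid rule stands in for the classifier, and the function names and the `None`-for-unlabeled convention are hypothetical. At least one labeled point is required.

```python
import math
from collections import Counter

def agglomerate(points, n_clusters):
    """Single-linkage agglomerative clustering; returns lists of point indices."""
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        return min(math.dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > n_clusters:
        pairs = ((x, y) for x in range(len(clusters))
                 for y in range(x + 1, len(clusters)))
        a, b = min(pairs, key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        clusters[a].extend(clusters.pop(b))  # b > a, so index a stays valid
    return clusters

def centroids_of(points, labels):
    """Mean point per label, ignoring unlabeled (None) entries."""
    groups = {}
    for point, label in zip(points, labels):
        if label is not None:
            groups.setdefault(label, []).append(point)
    return {label: [sum(c) / len(ps) for c in zip(*ps)]
            for label, ps in groups.items()}

def self_learn(points, labels, n_clusters, max_iter=10):
    """labels[i] is a known label or None; returns a fully labeled list."""
    current = list(labels)
    # Step 1: each cluster passes its majority known label to unlabeled members.
    for cluster in agglomerate(points, n_clusters):
        known = Counter(labels[i] for i in cluster if labels[i] is not None)
        if known:
            guess = known.most_common(1)[0][0]
            for i in cluster:
                if current[i] is None:
                    current[i] = guess
    # Step 2: refit centroids and relabel the originally unlabeled points
    # until the assignment stops changing. Known labels are never overwritten.
    for _ in range(max_iter):
        cents = centroids_of(points, current)
        updated = [label if labels[i] is not None
                   else min(cents, key=lambda c: math.dist(points[i], cents[c]))
                   for i, label in enumerate(current)]
        if updated == current:
            break
        current = updated
    return current
```

Here each point would be one row of the representative feature matrix mentioned above. Clusters containing no labeled member are left for the classifier step, which assigns them to the nearest labeled centroid.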
In summary, the dissertation tackles important problems of anomaly detection and health status analysis in complex core router systems. The results emerging from this dissertation provide the first comprehensive set of data-driven resiliency solutions for core router systems. It is anticipated that other high-performance computing systems will also benefit from this framework.