Browsing by Subject "Quantization"
Item Open Access: Efficient and Scalable Deep Learning (2019), Wen, Wei

Deep Neural Networks (DNNs) can achieve accuracy superior to traditional machine learning models because of their large learning capacity and the availability of large amounts of labeled data. In general, larger DNNs obtain higher accuracy. However, two obstacles hinder building larger DNNs: (1) inference of large DNNs is slow, which limits their deployment on small devices; (2) training large DNNs is also slow, which slows down research exploration. To remove these obstacles, this dissertation focuses on accelerating DNN inference and training.

To accelerate DNN inference, the original DNNs are compressed while preserving their accuracy. More specifically, Structurally Sparse Deep Neural Networks (SSDNNs) are proposed to remove neural components: in Convolutional Neural Networks (CNNs), neurons, filters, channels, and layers can be removed; in Recurrent Neural Networks (RNNs), hidden sizes can be reduced. The study shows that SSDNNs achieve higher speedup than sparse DNNs with non-structured sparsity. Beyond SSDNNs, Force Regularization is proposed to coerce DNNs into a lower-rank space, so that they can be decomposed into lower-rank architectures with fewer ranks than traditional methods yield. The dissertation also demonstrates that SSDNNs and Force Regularization are orthogonal and can be combined for higher speedup.

To accelerate DNN training, distributed deep learning is required. However, two problems hinder the use of more compute nodes for higher training speed: the Communication Bottleneck and the Generalization Gap. The Communication Bottleneck arises because communication time increases and eventually dominates as a distributed system scales to many compute nodes. To reduce gradient communication in Stochastic Gradient Descent (SGD), SGD with low-precision gradients (TernGrad) is proposed. Moreover, distributed deep learning requires a large batch size to exploit the system's computing power; unfortunately, accuracy decreases when the batch size is very large, which is referred to as the Generalization Gap. One hypothesis explaining the Generalization Gap is that large-batch SGD gets stuck in sharp minima. The dissertation proposes a stochastic smoothing method (SmoothOut) to escape sharp minima. The dissertation shows that TernGrad overcomes the Communication Bottleneck and SmoothOut helps to close the Generalization Gap.
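The structured-sparsity idea above can be made concrete with a group-lasso regularizer: penalizing the L2 norm of each convolution filter as a group drives entire filters toward zero so they can be pruned after training. Below is a minimal PyTorch sketch assuming per-filter groups and a hypothetical regularization strength `lam`; the dissertation's formulation also covers other group shapes (channels, layers, RNN hidden sizes), which are omitted here.

```python
import torch
import torch.nn as nn

def group_lasso_penalty(conv_weight: torch.Tensor) -> torch.Tensor:
    """Group-lasso penalty over the output filters of a conv layer.

    conv_weight has shape (out_channels, in_channels, kH, kW); each
    output filter is one group, so the penalty drives whole filters
    toward zero, and zeroed filters can be removed structurally.
    """
    # L2 norm within each group (filter), then L1 sum across groups.
    return conv_weight.flatten(start_dim=1).norm(dim=1).sum()

# Usage sketch: add the penalty to the task loss during training.
model = nn.Sequential(nn.Conv2d(3, 16, 3), nn.ReLU(), nn.Conv2d(16, 32, 3))
x = torch.randn(8, 3, 32, 32)
loss = model(x).mean()  # stand-in for the real task loss
lam = 1e-4              # hypothetical regularization strength
for m in model.modules():
    if isinstance(m, nn.Conv2d):
        loss = loss + lam * group_lasso_penalty(m.weight)
loss.backward()
```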
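TernGrad reduces gradient traffic by quantizing each gradient tensor to three levels, a shared scale times {-1, 0, +1}, using stochastic rounding so that the quantized gradient remains an unbiased estimator of the original. A minimal sketch of that ternarization step in PyTorch (the full method adds refinements such as gradient clipping, which are omitted here):

```python
import torch

def ternarize(grad: torch.Tensor) -> torch.Tensor:
    """Stochastically ternarize a gradient to {-s, 0, +s}.

    s is the per-tensor max magnitude; each element keeps its sign
    with probability |g_i| / s, so E[ternarize(g)] = g and the
    estimator is unbiased.
    """
    s = grad.abs().max()
    if s == 0:
        return grad.clone()
    prob = grad.abs() / s          # keep-probability per element
    mask = torch.bernoulli(prob)   # 1 with prob |g_i|/s, else 0
    return s * grad.sign() * mask

g = torch.randn(1000)
gt = ternarize(g)
print(torch.unique(gt).numel())            # at most 3 distinct values
print(g.mean().item(), gt.mean().item())   # means roughly agree (unbiased)
```

Because each element needs only two bits plus one shared scale per tensor, communicating `gt` instead of `g` cuts gradient bandwidth dramatically in data-parallel SGD.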
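SmoothOut escapes sharp minima by descending a noise-smoothed version of the loss; in practice this can be approximated by perturbing the weights with uniform noise before each gradient computation and removing the noise before the optimizer update. A sketch of one such step, with the noise radius `a` and the surrounding training loop as assumptions rather than the dissertation's exact recipe:

```python
import torch
import torch.nn as nn

def smoothout_step(model: nn.Module, loss_fn, x, y, a: float = 0.01):
    """One SmoothOut-style step (illustrative sketch): inject uniform
    noise into the weights, take the gradient at the perturbed point,
    then remove the noise before the optimizer update. Averaged over
    iterations, this approximates descending a smoothed loss, which
    biases SGD away from sharp minima."""
    noises = []
    with torch.no_grad():
        for p in model.parameters():
            n = torch.empty_like(p).uniform_(-a, a)
            p.add_(n)
            noises.append(n)
    loss = loss_fn(model(x), y)
    loss.backward()                  # gradient at the perturbed weights
    with torch.no_grad():
        for p, n in zip(model.parameters(), noises):
            p.sub_(n)                # restore the original weights
    return loss
```

The caller zeroes gradients before this step and runs `optimizer.step()` afterwards, exactly as in a plain SGD loop.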
Item Open Access: Robustness Analysis and Improvement in Neural Networks and Neuromorphic Computing (2021), Song, Chang

Deep learning and neural networks have great potential but remain at risk. So-called adversarial attacks, which apply small perturbations to input samples to fool models, threaten the reliability of neural networks and of their hardware counterpart, neuromorphic computing. Various defenses have been attempted, including adversarial training and other data augmentation methods.

In our early attempt to defend against adversarial attacks, we propose a multi-strength adversarial training method that covers a wider effective range than typical single-strength adversarial training. We also propose two different structures to balance the tradeoff between total training time and hardware implementation cost. Experimental results show that the proposed method achieves better accuracy than the baselines with tolerable additional hardware cost.

To better understand robustness, we analyze the adversarial problem in the decision space. In one of our defense approaches, called feedback learning, we theoretically prove the effectiveness of adversarial training and other data augmentation methods. For empirical support, we generate non-adversarial examples based on information about the decision boundaries of neural networks and add these examples to training. The results show that after applying feedback learning, the models' boundaries are more robust to noise and perturbations than the baselines.

Beyond algorithm-level concerns, we also focus on hardware implementations in quantization scenarios. We find that adversarially trained neural networks are more vulnerable to quantization loss than plain models. To improve the robustness of hardware-based quantized models, we explore methods such as feedback learning, nonlinear mapping, and layer-wise quantization. Results show that adversarial robustness and quantization robustness can be improved by feedback learning and nonlinear mapping, respectively, but the accuracy gap introduced by quantization can still be reduced further. To minimize both losses simultaneously, we propose a layer-wise adversarial-aware quantization method that chooses the best quantization parameter settings for adversarially trained models. In this method, we use the Lipschitz constants of different layers as error-sensitivity metrics and design several criteria to decide the quantization settings for each layer. The results show that our method further minimizes the accuracy gap between full-precision and quantized adversarially trained models.
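Single-strength adversarial training hardens a model against one perturbation magnitude; the multi-strength method described above trains on adversarial examples of several magnitudes at once. A hedged PyTorch sketch using FGSM as the attack, with hypothetical strength values and inputs assumed to lie in [0, 1] (the dissertation's actual attack settings and proposed hardware structures may differ):

```python
import torch
import torch.nn.functional as F

def fgsm(model, x, y, eps):
    """Fast Gradient Sign Method: a one-step perturbation of strength eps."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    grad, = torch.autograd.grad(loss, x)
    # Clamp assumes inputs are normalized to [0, 1].
    return (x + eps * grad.sign()).clamp(0, 1).detach()

def multi_strength_batch(model, x, y, strengths=(0.01, 0.03, 0.06)):
    """Build a training batch of clean inputs plus adversarial copies at
    several perturbation strengths, so training covers a wider range of
    attack magnitudes than a single-eps scheme."""
    parts = [x] + [fgsm(model, x, y, e) for e in strengths]
    return torch.cat(parts), y.repeat(len(parts))
```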
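The layer-wise adversarial-aware quantization uses per-layer Lipschitz constants as error-sensitivity metrics. One illustrative proxy is the spectral norm of each layer's flattened weight matrix; the bit-allocation rule below (a median split between hypothetical `low` and `high` bit-widths) is our own simplification for illustration, not the dissertation's exact criteria:

```python
import torch
import torch.nn as nn

def layer_sensitivity(weight: torch.Tensor) -> float:
    """Spectral norm (largest singular value) of the layer's weight
    matrix, used as a Lipschitz-based error-sensitivity proxy: layers
    with larger norms amplify quantization noise the most."""
    w = weight.flatten(start_dim=1)
    return torch.linalg.matrix_norm(w, ord=2).item()

def assign_bits(model: nn.Module, low: int = 4, high: int = 8) -> dict:
    """Give more bits to the more sensitive layers (hypothetical rule:
    layers at or above the median sensitivity get the high bit-width)."""
    sens = {name: layer_sensitivity(m.weight)
            for name, m in model.named_modules()
            if isinstance(m, (nn.Conv2d, nn.Linear))}
    cutoff = sorted(sens.values())[len(sens) // 2]  # median split
    return {name: (high if s >= cutoff else low) for name, s in sens.items()}
```

A quantizer would then consume the returned name-to-bits mapping when converting the adversarially trained model for hardware deployment.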