Structure and Feedback-based Natural Language Processing
Date
2022
Authors
Sundararaman, Dhanasekar
Abstract
The development of deep learning models has revolutionized the way computers process information and has driven significant advances in fields such as speech, vision, and language. In language, giant strides have been made, ranging from seq2seq models that process one word at a time to more sophisticated Transformer networks that can consume paragraphs of text at once. Because of their ability to generate coherent and meaningful sentences at scale, natural language processing (NLP) models have become highly prevalent. Despite their effectiveness, these models often leave room for improvement when presented with additional linguistic information. In this dissertation, I discuss my contributions to the use of (a) structural information, namely syntactic and numeric structure, which is often overlooked and underutilized by language models, and (b) feedback-based models, in which the objectives of the main model are guided by feedback from a supporting model.
In the first part, I will present three contributions that explore the use of structural information to develop effective NLP models that outperform their baselines on tasks requiring encoders and decoders, such as machine translation, as well as on downstream tasks such as text classification, question answering, and fill-in-the-blank prediction. The first contribution proposes techniques for consuming syntactic information, such as part of speech, word position, and case, to improve machine translation performance with data-heavy Transformer models. An accompanying case study compares and contrasts a seq2seq model with a Transformer in their ability to absorb syntax across many language pairs. The second and third contributions utilize the numeric structure prevalent in language as a means of incorporating numerical reasoning into language models. Collectively, these contributions improve translation performance, numerical question-answer reasoning, and other downstream tasks.
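To make the first contribution's idea concrete, here is a minimal sketch (in PyTorch, with hypothetical class names, vocabulary size, tag count, and model width, not the dissertation's actual architecture) of feeding one kind of syntactic information, part-of-speech tags, into a Transformer by summing a tag embedding with the token embedding:

    import torch
    import torch.nn as nn

    class SyntaxAwareEmbedding(nn.Module):
        # Token embeddings augmented with part-of-speech embeddings.
        # All sizes (vocab, tag set, model width) are assumed for illustration.
        def __init__(self, vocab_size=32000, num_pos_tags=50, d_model=512):
            super().__init__()
            self.token_emb = nn.Embedding(vocab_size, d_model)
            self.pos_tag_emb = nn.Embedding(num_pos_tags, d_model)

        def forward(self, token_ids, pos_tag_ids):
            # Summing the two embeddings injects syntactic information
            # without changing the Transformer's input dimensionality.
            return self.token_emb(token_ids) + self.pos_tag_emb(pos_tag_ids)

    # A batch of 2 sentences, 5 tokens each, with one POS tag per token.
    emb = SyntaxAwareEmbedding()
    tokens = torch.randint(0, 32000, (2, 5))
    pos_tags = torch.randint(0, 50, (2, 5))
    print(emb(tokens, pos_tags).shape)  # torch.Size([2, 5, 512])

The same pattern would extend to the other syntactic features mentioned above, such as word position and case, each with its own embedding table.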
In the second part, I will present two contributions that use feedback signals from a supporting model to optimize an objective that enhances the performance of the main model. In the first contribution, the main model is a multi-task model that performs language inference across multiple task languages, while the supporting model uses reinforcement learning to learn the importance of each task, which is not known a priori. The resulting task mixture leads to significant improvements in performance on the target-language task. In the second contribution, a supporting model selects the tokens most likely to be out-of-distribution (OOD) using the Mahalanobis distance, a form of self-supervision for the language model. A novel regularization loss then maximizes the distance between in-domain tokens and these pseudo-OOD tokens, which yields significant performance improvements in OOD detection.
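The Mahalanobis-distance idea in the second contribution can likewise be sketched with a small, hypothetical example: fit a Gaussian (mean and covariance) to in-domain token representations, score candidate tokens by their Mahalanobis distance, and treat the most distant ones as pseudo-OOD. The shapes, the single-Gaussian fit, and the top-k selection below are illustrative assumptions, not the dissertation's exact procedure:

    import torch

    def mahalanobis_distance(token_feats, id_mean, id_cov_inv):
        # Distance of each token representation from a Gaussian fit to
        # in-domain features; larger distances suggest pseudo-OOD tokens.
        diff = token_feats - id_mean                    # (num_tokens, d)
        return torch.sqrt(((diff @ id_cov_inv) * diff).sum(dim=-1))

    d = 16                                              # assumed feature width
    id_feats = torch.randn(1000, d)                     # stand-in for encoder states
    mean = id_feats.mean(dim=0)
    cov = torch.cov(id_feats.T) + 1e-5 * torch.eye(d)   # regularized covariance
    cov_inv = torch.linalg.inv(cov)

    # Score candidate tokens and keep the k most distant as pseudo-OOD.
    candidates = torch.randn(200, d)
    scores = mahalanobis_distance(candidates, mean, cov_inv)
    pseudo_ood_idx = scores.topk(k=20).indices

A regularization term of the kind described above would then push in-domain and pseudo-OOD representations apart during training.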
Citation
Sundararaman, Dhanasekar (2022). Structure and Feedback-based Natural Language Processing. Dissertation, Duke University. Retrieved from https://hdl.handle.net/10161/26795.
Except where otherwise noted, student scholarship that was shared on DukeSpace after 2009 is made available to the public under a Creative Commons Attribution / Non-commercial / No derivatives (CC-BY-NC-ND) license. All rights in student work shared on DukeSpace before 2009 remain with the author and/or their designee, whose permission may be required for reuse.