Structure and Feedback-based Natural Language Processing

Sundararaman, Dhanasekar

Date

2022

Authors

Sundararaman, Dhanasekar

Advisors

Carin, Lawrence

Abstract

The development of deep learning models has revolutionized the way computers process information and has driven significant advances in fields such as speech, vision, and language. In language, great strides have been made, ranging from seq2seq models that process one word at a time to more sophisticated Transformer networks that can process entire paragraphs of text. Because they can generate coherent and meaningful sentences at scale, natural language processing (NLP) models have become prevalent. Despite their effectiveness, these models often improve further when given additional linguistic information. In this dissertation, I discuss my contributions to the use of (a) structural information, namely syntax and numeric structure, which language models often overlook and underutilize, and (b) feedback-based models in which the objectives of the main model are guided by feedback from a supporting model.

In the first part, I present three contributions that explore the use of structural information to develop effective NLP models that outperform their baselines on tasks that require encoders and decoders, such as machine translation, as well as on downstream tasks such as text classification, question answering, and fill-in-the-blank. The first contribution proposes techniques for consuming syntactic information such as part of speech, word position, and case in order to improve the machine translation performance of data-heavy Transformer models. An accompanying case study compares a seq2seq model with a Transformer in their ability to absorb syntax across many language pairs. The second and third contributions use the numeric structure that is prevalent in language as a means of incorporating numerical reasoning into language models. Together, these contributions improve translation performance, numerical question answering, and other downstream tasks.
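
As an illustration only, one common way to let a model consume token-level syntactic features is to add a learned embedding for each feature to the word embedding. The sketch below assumes summation, random embedding tables, and hypothetical vocabulary sizes and tag inventories; it is not taken from the dissertation, whose actual technique may differ.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model = 8  # hypothetical embedding width

# Hypothetical embedding tables; in a real model these would be learned.
word_emb = rng.normal(size=(100, d_model))  # 100-word toy vocabulary
tag_emb = rng.normal(size=(5, d_model))     # 5 part-of-speech tags
case_emb = rng.normal(size=(2, d_model))    # 0 = lowercase, 1 = capitalized


def embed_with_syntax(word_ids, tag_ids, case_ids):
    """Sum word, part-of-speech, and case embeddings for each token."""
    word_ids = np.asarray(word_ids)
    tag_ids = np.asarray(tag_ids)
    case_ids = np.asarray(case_ids)
    return word_emb[word_ids] + tag_emb[tag_ids] + case_emb[case_ids]


# Three tokens with assumed ids for word, tag, and case.
out = embed_with_syntax([4, 17, 42], [1, 3, 0], [1, 0, 0])
print(out.shape)
```

Summation keeps the model width unchanged, whereas concatenating the feature embeddings would widen the input; which choice works better is an empirical question.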

In the second part, I present two contributions that use feedback signals from a supporting model to achieve an optimization objective that enhances the performance of the main model. In the first contribution, the main model is a multi-task model that performs language inference across multiple task languages, while the supporting model uses reinforcement learning to ascertain the importance of each task, which is not known a priori. The resulting mix of tasks led to significant improvements in the performance of the target language task. In the second contribution, a supporting model uses Mahalanobis distance to select tokens that are most likely to be out-of-distribution (OOD), applying self-supervision, a technique widely used in language models. A novel regularization loss maximizes the distance between in-domain tokens and these pseudo-OOD tokens, which results in significant performance improvements in OOD detection.
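
The Mahalanobis distance mentioned above measures how far a vector lies from the mean of a reference distribution, scaled by that distribution's covariance. A minimal sketch of scoring candidate token embeddings against in-domain embeddings follows; the synthetic data, the ridge term, and the top-k selection rule are assumptions made for illustration, not the dissertation's procedure.

```python
import numpy as np


def mahalanobis_scores(tokens, reference):
    """Mahalanobis distance of each row of `tokens` from the `reference` rows."""
    mean = reference.mean(axis=0)
    cov = np.cov(reference, rowvar=False)
    # Small ridge term keeps the covariance invertible (an assumption of this sketch).
    cov = cov + 1e-6 * np.eye(cov.shape[0])
    precision = np.linalg.inv(cov)
    diff = tokens - mean
    # Quadratic form diff^T * precision * diff, computed per row.
    return np.sqrt(np.einsum("ij,jk,ik->i", diff, precision, diff))


def select_pseudo_ood(tokens, reference, k):
    """Indices of the k candidates farthest from the reference distribution."""
    scores = mahalanobis_scores(tokens, reference)
    return np.argsort(scores)[::-1][:k]


rng = np.random.default_rng(0)
in_domain = rng.normal(0.0, 1.0, size=(500, 4))  # synthetic in-domain embeddings
candidates = np.vstack([
    rng.normal(0.0, 1.0, size=(5, 4)),  # five typical candidates
    np.full((1, 4), 8.0),               # one obvious outlier at index 5
])
scores = mahalanobis_scores(candidates, in_domain)
top = select_pseudo_ood(candidates, in_domain, k=1)
print(top)
```

Here the outlier row is the one selected, since it sits far from the in-domain mean in every dimension.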

Type

Dissertation

Department

Electrical and Computer Engineering

Subjects

Computer engineering

Permalink

https://hdl.handle.net/10161/26795

Citation

Sundararaman, Dhanasekar (2022). Structure and Feedback-based Natural Language Processing. Dissertation, Duke University. Retrieved from https://hdl.handle.net/10161/26795.

Collections

Dissertations

Except where otherwise noted, student scholarship that was shared on DukeSpace after 2009 is made available to the public under a Creative Commons Attribution / Non-commercial / No derivatives (CC-BY-NC-ND) license. All rights in student work shared on DukeSpace before 2009 remain with the author and/or their designee, whose permission may be required for reuse.
