The Influence of Structural Information on Natural Language Processing
Date
2020
Abstract
Learning effective and efficient vector representations of text is a core problem underlying many downstream tasks in natural language processing (NLP).
Most traditional NLP approaches learn a text representation by modeling only the text itself.
Recently, researchers have discovered that some structural information associated with the texts can also be used to learn richer text representations.
In this dissertation, I present my recent contributions on utilizing various types of structural information, including graphical networks, syntactic trees, knowledge graphs, and implicit label dependencies, to improve model performance on different NLP tasks.
This dissertation consists of three main parts.
In the first part, I show that the semantic relatedness between different texts, represented by textual networks that add edges between correlated text vertices, can help with text embedding.
The proposed DMTE model embeds each vertex by applying a diffusion convolution operation to the text inputs, so that the complete level of connectivity between any two texts in the graph can be measured.
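To make the diffusion idea concrete, here is a minimal PyTorch-style sketch of how multi-hop connectivity can be mixed into per-vertex text features. The class name, the bag-of-words text encoder, and the per-hop weights are illustrative assumptions, not the exact DMTE architecture.

```python
import torch
import torch.nn as nn


class DiffusionTextEmbedding(nn.Module):
    """Hypothetical simplification of the DMTE idea: average each vertex's words
    into a text feature, then mix in multi-hop neighbors via powers of the
    row-normalized adjacency matrix (a diffusion operation)."""

    def __init__(self, vocab_size, embed_dim, num_hops=3):
        super().__init__()
        self.word_embed = nn.Embedding(vocab_size, embed_dim)
        # one learnable weight per diffusion hop (illustrative parameterization)
        self.hop_weights = nn.Parameter(torch.ones(num_hops + 1))
        self.num_hops = num_hops

    def forward(self, token_ids, adjacency):
        # token_ids: (num_vertices, seq_len); adjacency: (num_vertices, num_vertices)
        text_feats = self.word_embed(token_ids).mean(dim=1)        # bag-of-words text features
        transition = adjacency / adjacency.sum(dim=1, keepdim=True).clamp(min=1e-8)
        diffused, hop_feats = torch.zeros_like(text_feats), text_feats
        for k in range(self.num_hops + 1):
            diffused = diffused + self.hop_weights[k] * hop_feats  # weight k-hop connectivity
            hop_feats = transition @ hop_feats                     # propagate one more hop
        return diffused
```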
In the second part, I introduce syntax-infused variational autoencoders (SIVAE), which jointly encode a sentence and its syntactic tree into two latent spaces and decode them simultaneously.
Sentences generated by this VAE-based framework are more grammatical and fluent, demonstrating the effectiveness of incorporating syntactic trees into language modeling.
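The two-latent-space design can be illustrated with a short sketch: one encoder for the sentence, one for a linearized syntactic tree, and a decoder conditioned on both latent codes. The GRU encoders and decoder, the layer sizes, and the class name are assumptions for illustration, not the paper's exact SIVAE configuration.

```python
import torch
import torch.nn as nn


class SyntaxInfusedVAE(nn.Module):
    """Illustrative sketch: encode a sentence and a linearized syntactic tree
    into two separate Gaussian latent spaces, then condition the sentence
    decoder on both latent codes."""

    def __init__(self, vocab_size, tree_vocab_size, hidden_dim=256, latent_dim=32):
        super().__init__()
        self.sent_embed = nn.Embedding(vocab_size, hidden_dim)
        self.tree_embed = nn.Embedding(tree_vocab_size, hidden_dim)
        self.sent_enc = nn.GRU(hidden_dim, hidden_dim, batch_first=True)
        self.tree_enc = nn.GRU(hidden_dim, hidden_dim, batch_first=True)
        self.to_sent_stats = nn.Linear(hidden_dim, 2 * latent_dim)  # mu and log-variance
        self.to_tree_stats = nn.Linear(hidden_dim, 2 * latent_dim)
        self.decoder = nn.GRU(hidden_dim + 2 * latent_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    @staticmethod
    def reparameterize(stats):
        mu, logvar = stats.chunk(2, dim=-1)
        return mu + torch.randn_like(mu) * torch.exp(0.5 * logvar), mu, logvar

    def forward(self, sent_ids, tree_ids):
        _, h_sent = self.sent_enc(self.sent_embed(sent_ids))
        _, h_tree = self.tree_enc(self.tree_embed(tree_ids))
        z_sent, mu_s, logvar_s = self.reparameterize(self.to_sent_stats(h_sent[-1]))
        z_tree, mu_t, logvar_t = self.reparameterize(self.to_tree_stats(h_tree[-1]))
        # condition every decoding step on both latent codes
        z = torch.cat([z_sent, z_tree], dim=-1).unsqueeze(1).expand(-1, sent_ids.size(1), -1)
        dec_in = torch.cat([self.sent_embed(sent_ids), z], dim=-1)
        logits = self.out(self.decoder(dec_in)[0])
        return logits, (mu_s, logvar_s), (mu_t, logvar_t)
```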
In the third part, I focus on modeling the implicit structures of label dependencies for a multi-label medical text classification problem.
The proposed convolutional residual model successfully discovers label correlation structures and hence improves the multi-label classification results.
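As an illustration of how a residual connection over the label axis can let correlated labels influence one another, a minimal sketch follows. The convolutional text encoder and the single 1-D convolution over the label dimension are simplifying assumptions, not the dissertation's exact convolutional residual model.

```python
import torch
import torch.nn as nn


class ConvResidualLabeler(nn.Module):
    """Illustrative sketch: a text encoder produces initial per-label logits,
    and a residual 1-D convolution over the label dimension refines them so
    that correlated labels can reinforce or suppress each other."""

    def __init__(self, vocab_size, num_labels, embed_dim=128, kernel_size=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.text_encoder = nn.Conv1d(embed_dim, embed_dim, kernel_size, padding=kernel_size // 2)
        self.to_labels = nn.Linear(embed_dim, num_labels)
        # residual convolution over the label axis to model label co-occurrence
        self.label_conv = nn.Conv1d(1, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, token_ids):
        x = self.embed(token_ids).transpose(1, 2)        # (batch, embed_dim, seq_len)
        pooled = torch.relu(self.text_encoder(x)).max(dim=2).values
        logits = self.to_labels(pooled)                  # initial, independent label scores
        refined = self.label_conv(logits.unsqueeze(1)).squeeze(1)
        return logits + refined                          # residual connection over labels
```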
From the experimental results of the proposed models, we can conclude that leveraging such structural information contributes to better model performance.
It is essential to build a connection between the chosen structure and a specific NLP task.
Citation
Zhang, Xinyuan (2020). The Influence of Structural Information on Natural Language Processing. Dissertation, Duke University. Retrieved from https://hdl.handle.net/10161/20854.
Except where otherwise noted, student scholarship that was shared on DukeSpace after 2009 is made available to the public under a Creative Commons Attribution / Non-commercial / No derivatives (CC-BY-NC-ND) license. All rights in student work shared on DukeSpace before 2009 remain with the author and/or their designee, whose permission may be required for reuse.