The Influence of Structural Information on Natural Language Processing


Date

2020


Abstract

Learning effective and efficient vectoral representations for text has been a core problem for many downstream tasks in natural language processing (NLP).

Most traditional NLP approaches learn a text representation by only modeling the text itself.

Recently, researchers have discovered that some structural information associated with the texts can also be used to learn richer text representations.

In this dissertation, I present my contributions on how to utilize various types of structural information, including graphical networks, syntactic trees, knowledge graphs, and implicit label dependencies, to improve model performance on different NLP tasks.

This dissertation consists of three main parts.

In the first part, I show that the semantic relatedness between different texts, represented by textual networks with edges between correlated text vertices, can help with text embedding.

The proposed DMTE model embeds each vertex by applying a diffusion convolution operation to the text inputs, so that the full, multi-hop connectivity between any two texts in the graph can be measured rather than only direct edges.
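The core idea of diffusion convolution can be illustrated with a small numpy sketch. Everything here (the toy adjacency matrix, the truncation depth `K`, the uniform hop weights) is a simplified assumption for illustration, not the actual DMTE parameterization:

```python
import numpy as np

# Hypothetical toy textual network: 4 text vertices, edges between
# correlated texts (symmetric adjacency matrix).
A = np.array([[0, 1, 1, 0],
              [1, 0, 0, 1],
              [1, 0, 0, 1],
              [0, 1, 1, 0]], dtype=float)

# Row-normalized transition matrix: one random-walk step on the graph.
P = A / A.sum(axis=1, keepdims=True)

# Truncated diffusion: averaging powers of P mixes in multi-hop
# connectivity, not just direct neighbors (K = 3 hops; uniform weights
# are an illustrative choice).
K = 3
diffusion = sum(np.linalg.matrix_power(P, k) for k in range(K + 1)) / (K + 1)

# Toy per-vertex text features; the diffusion matrix blends each vertex's
# features with those of vertices reachable within K hops.
X = np.eye(4)
embeddings = diffusion @ X
print(embeddings.shape)  # (4, 4)
```

Because each power of a row-stochastic matrix is itself row-stochastic, the diffusion weights for every vertex sum to one, so the operation is a convex mixture over the graph neighborhood.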

In the second part, I introduce the syntax-infused variational autoencoders (SIVAE) which jointly encode a sentence and its syntactic tree into two latent spaces and decode them simultaneously.

Sentences generated by this VAE-based framework are more grammatical and fluent, demonstrating the effectiveness of incorporating syntactic trees into language modeling.
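The two-latent-space structure can be sketched minimally in numpy. The linear encoders, random weights, and feature vectors below are all hypothetical stand-ins for SIVAE's neural encoders and decoder; the sketch only shows the shape of the computation: two views (sentence and syntactic tree) each get their own Gaussian latent, and the decoder conditions on both at once.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(x, W_mu, W_logvar):
    # Toy linear encoder producing a Gaussian posterior (mu, log-variance).
    return x @ W_mu, x @ W_logvar

def reparameterize(mu, logvar, rng):
    # Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, I),
    # so a gradient-based learner could backpropagate through mu and sigma.
    return mu + np.exp(0.5 * logvar) * rng.standard_normal(mu.shape)

d_in, d_z = 8, 4
x_sent = rng.standard_normal(d_in)   # sentence features (toy)
x_tree = rng.standard_normal(d_in)   # syntactic-tree features (toy)

# Two separate latent spaces, one per view.
z_sent = reparameterize(*encode(x_sent, rng.standard_normal((d_in, d_z)),
                                rng.standard_normal((d_in, d_z))), rng)
z_tree = reparameterize(*encode(x_tree, rng.standard_normal((d_in, d_z)),
                                rng.standard_normal((d_in, d_z))), rng)

# Joint decoding: condition a single decoder on both latents simultaneously.
z_joint = np.concatenate([z_sent, z_tree])
W_dec = rng.standard_normal((2 * d_z, d_in))
x_recon = z_joint @ W_dec
print(x_recon.shape)  # (8,)
```

Keeping the sentence and tree latents separate, rather than pooling both views into one code, is what lets the syntactic structure act as an explicit conditioning signal during generation.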

In the third part, I focus on modeling the implicit structures of label dependencies for a multi-label medical text classification problem.

The proposed convolutional residual model successfully discovers label correlation structures and hence improves multi-label classification results.
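The role of a residual connection in modeling label dependencies can be illustrated with a toy numpy sketch. The correlation matrix `C`, the base logits, and the single refinement step are all illustrative assumptions, not the dissertation's actual convolutional architecture; the sketch only shows how correlated labels can adjust each other's scores while a skip connection preserves the base predictions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(1)
n_labels = 5

# Independent per-label logits from a base text classifier (toy values).
base_logits = rng.standard_normal(n_labels)

# Hypothetical learned label-correlation matrix: entry (i, j) says how
# much evidence for label j shifts the score of label i.
C = 0.1 * rng.standard_normal((n_labels, n_labels))
np.fill_diagonal(C, 0.0)  # no self-interaction

# Residual refinement: correlated labels adjust each other's logits,
# and the skip connection keeps the base predictions as the default.
refined_logits = base_logits + C @ sigmoid(base_logits)
probs = sigmoid(refined_logits)
print(probs.shape)  # (5,)
```

With `C` set to zero the model falls back exactly to the independent per-label classifier, which is what makes the residual formulation a safe way to add dependency structure.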

From the experimental results of the proposed models, we can conclude that leveraging structural information can lead to better model performance.

It is essential to build a connection between the chosen structure and a specific NLP task.


Subjects

Computer science, Statistics, Linguistics, Deep learning, graphical networks, knowledge graphs, label dependencies, Natural language processing, syntactic trees

Citation

Zhang, Xinyuan (2020). The Influence of Structural Information on Natural Language Processing. Dissertation, Duke University. Retrieved from https://hdl.handle.net/10161/20854.



Except where otherwise noted, student scholarship that was shared on DukeSpace after 2009 is made available to the public under a Creative Commons Attribution / Non-commercial / No derivatives (CC-BY-NC-ND) license. All rights in student work shared on DukeSpace before 2009 remain with the author and/or their designee, whose permission may be required for reuse.