Bayesian Dynamic Modeling for Streaming Network Data

dc.contributor.advisor

West, Mike

dc.contributor.author

Chen, Xi

dc.date.accessioned

2017-05-16T17:27:59Z

dc.date.available

2017-05-16T17:27:59Z

dc.date.issued

2017

dc.department

Statistical Science

dc.description.abstract

Streaming network data of various forms arises in many applications, raising interest in research to model and quantify the nature of stochasticity and structure in dynamics underlying such data. One example context is that of traffic flow count data in networks, such as in automobile or aviation transportation, certain directed social network contexts, and Internet studies. Using an example of Internet browser traffic flows through site-segments of an international news website, I present Bayesian analyses of two new, linked classes of models which, in tandem, allow fast, scalable and interpretable Bayesian inference on dynamic patterns over time underlying flows. I develop two kinds of flexible state-space models for streaming count data, able to adaptively characterize and quantify network dynamics efficiently in real-time. These models are then used as emulators of more structured, time-varying gravity models that allow formal dissection of network dynamics. This yields interpretable inferences on traffic flow characteristics, and on dynamics in interactions among network nodes. Bayesian monitoring theory defines a strategy for sequential model assessment and adaptation in cases when network flow data deviates from model-based predictions. Exploratory and sequential monitoring analyses of evolving traffic on a network of web site-segments in e-commerce demonstrate the utility of this coupled Bayesian emulation approach to analysis of streaming network count data.

A second, different dynamic network context is that involving relational data. Examples include contexts of binary network data indicating communications or relationships between pairs of network nodes over time. Some popular examples include friendships over social networks and communications between different functional zones in the brain. Using an example of co-movements of company stock indices, I develop and compare two different approaches. One involves latent threshold models mapping latent processes to binary entries via a probabilistic link function; a second involves dynamic generalized linear models for binary outcomes. Analyses implemented using Markov chain Monte Carlo methods are available for these models, but are naturally computationally demanding and not scalable to relevant network dimensions for many contexts. In contrast, dynamic generalized linear models can be implemented using fast, effective approximate Bayesian computations for both sequential and retrospective analyses to enable linear-time computations. I also demonstrate the use of a model decoupling/recoupling strategy to enable scaling in network size.

dc.identifier.uri

https://hdl.handle.net/10161/14443

dc.subject

Statistics

dc.subject

Bayesian model emulation

dc.subject

Decouple/Recouple

dc.subject

Dynamic gravity model

dc.subject

Dynamic network flow model

dc.subject

Local linear growth model

dc.subject

Monitoring and anomaly detection

dc.title

Bayesian Dynamic Modeling for Streaming Network Data

dc.type

Dissertation

Files

Original bundle

Name:
Chen_duke_0066D_13901.pdf
Size:
9.98 MB
Format:
Adobe Portable Document Format
