Utilizing Network Structure to Flexibly Model Areal Data

dc.contributor.advisor

Hoff, Peter D.

dc.contributor.author

Christensen, Michael Fredrick

dc.date.accessioned

2025-01-08T17:44:29Z

dc.date.available

2025-01-08T17:44:29Z

dc.date.issued

2024

dc.department

Statistical Science

dc.description.abstract

The majority of statistical methods for the analysis of spatially indexed data have been developed for settings in which data are observed at a collection of point-indexed locations. Region-indexed data, also known as areal data, are common in many disciplines, including economics, sociology, public health, and ecology. While statistical methods for the analysis of spatially correlated, point-indexed data typically characterize covariance as a function of the distances between observed locations, methods for areal data tend to characterize between-region spatial dependence using a graphical representation of the spatial domain. In most cases this has been done by defining covariance as a function of the unweighted adjacency matrix of the data, resulting in the fairly rigid assumption that all pairs of neighboring regions interact in essentially the same manner. This dissertation presents multiple methods by which the intrinsic network structure of areal data may be utilized to define more flexible models for graph and areal data than those commonly used at present. In Chapter 2 we introduce the graph deformation (GDEF) framework, which implicitly defines an embedding of a graph into high-dimensional Euclidean space, wherein between-node covariance is defined using the Matérn covariance function. The model is parameterized by an unknown edge weights matrix and defines a class of covariance matrices that are both highly flexible and interpretable. We compare the GDEF model to alternatives, illustrating the advantages of the approach, before concluding with an analysis of bird abundance data for several species in North Carolina. Chapter 3 extends the work presented in Chapter 2, proposing a method by which unknown edge weights matrices may be more efficiently estimated using a basis function representation. We show how this framework improves the GDEF model as well as other existing models for areal data.
We provide several illustrations of the properties of the GDEF and other models when integrated with the proposed method, concluding with a number of simulation studies and a data analysis that demonstrate the utility of the GDEF framework. In Chapter 4 we explore how crowdsourced observations of migratory birds may be utilized to better understand avian migratory trends. We propose a novel hidden Markov model that draws from circuit theory and electrical physics to characterize its transition structure, and conclude by fitting our model to observations of the Baltimore oriole and yellow-rumped warbler within the eastern United States, with data spanning a five-year period. We conclude the dissertation with a brief discussion of the work presented.

dc.identifier.uri

https://hdl.handle.net/10161/31921

dc.rights.uri

https://creativecommons.org/licenses/by-nc-nd/4.0/

dc.subject

Statistics

dc.subject

Areal Data

dc.subject

Circuit Theory

dc.subject

Covariance

dc.subject

Ecology

dc.subject

Graphs

dc.subject

Spatial Statistics

dc.title

Utilizing Network Structure to Flexibly Model Areal Data

dc.type

Dissertation

Files

Original bundle

Name:
Christensen_duke_0066D_18137.pdf
Size:
3.08 MB
Format:
Adobe Portable Document Format