Browsing by Subject "Identifiability"
Item Open Access Extending Probabilistic Record Linkage (2020) Solomon, Nicole Chanel
Probabilistic record linkage is the task of combining multiple data sources for statistical analysis by identifying records pertaining to the same individual in different databases. The need to perform probabilistic record linkage arises in comparative effectiveness research and other clinical research scenarios when records in different databases do not share an error-free unique patient identifier. This dissertation seeks to develop new methodology for probabilistic record linkage to address two highly practical and recurring challenges: how to implement record linkage in a manner that optimizes downstream statistical analyses of the linked data, and how to efficiently link databases having a clustered or multi-level data structure.
In Chapter 2 we propose a new framework for balancing the tradeoff between false positive and false negative linkage errors when linked data are analyzed in a generalized linear model framework and non-linked records lead to missing data for the study outcome variable. Our method seeks to maximize the probability that the point estimate of the parameter of interest will have the correct sign and that the confidence interval around this estimate will correctly exclude the null value of zero. Using large sample approximations and a model for linkage errors, we derive expressions relating bias and hypothesis testing power to the user's choice of threshold that determines how many records will be linked. We use these results to propose three data-driven threshold selection rules. Under one set of simplifying assumptions we prove that maximizing asymptotic power requires that the threshold be relaxed at least until the point where all pairs with >50% probability of being a true match are linked.
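The tradeoff that the threshold controls can be illustrated with a minimal sketch. It is not the dissertation's selection rule: it assumes each candidate pair already has an estimated match probability, and it only tallies expected false positive and false negative links at a given threshold. The function name `expected_linkage_errors` and the sample probabilities are hypothetical.

```python
def expected_linkage_errors(match_probs, threshold):
    """Return (expected false positives, expected false negatives).

    match_probs: estimated probability that each candidate pair is a true
    match (assumed to come from some linkage model; values are illustrative).
    Pairs with probability >= threshold are linked, the rest are not.
    """
    linked = [p for p in match_probs if p >= threshold]
    unlinked = [p for p in match_probs if p < threshold]
    # A linked pair is a false positive with probability 1 - p.
    expected_false_pos = sum(1.0 - p for p in linked)
    # An unlinked pair is a false negative with probability p.
    expected_false_neg = sum(unlinked)
    return expected_false_pos, expected_false_neg


if __name__ == "__main__":
    probs = [0.95, 0.8, 0.55, 0.3, 0.1]  # hypothetical match probabilities
    for t in (0.9, 0.5, 0.2):
        fp, fn = expected_linkage_errors(probs, t)
        print(f"threshold={t:.1f}  E[false pos]={fp:.2f}  E[false neg]={fn:.2f}")
```

Relaxing the threshold links more pairs, which raises expected false positives and lowers expected false negatives. How that tradeoff propagates to bias and power in the downstream model is what the chapter's expressions address.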
In Chapter 3 we explore the consequences of linkage errors when the study outcome variable is determined by linkage status and so linkage errors may cause outcome misclassification. This scenario arises when the outcome is disease status and those linked are classified as having the disease while those not linked are classified as disease-free. We assume the parameter of interest can be expressed as a linear combination of binomial proportions having mean zero under the null hypothesis. We derive an expression for the asymptotic relative efficiency of a Wald test calculated with a misclassified outcome compared to an error-free outcome using a linkage error model and large sample approximations. We use this expression to generate insights for planning and implementing studies using record linkage.
In Chapter 4 we develop a modeling framework for linking files with a clustered data structure. Linking such clustered data is especially challenging when error-free identifiers are unavailable for both individual-level and cluster-level units. The proposed approach improves over current methodology by modeling inter-pair dependencies in clustered data and producing collective link decisions. It is novel in that it models both record attributes and record relationships, and resolves match statuses for individual-level and cluster-level units simultaneously. We show that linkage probabilities can be estimated without labeled training data using assumptions that are less restrictive than those of existing record linkage models. Using Monte Carlo simulations based on real study data, we demonstrate the advantages of the proposed approach over the current standard method.
Item Open Access Homeostasis-Bifurcation Singularities and Identifiability of Feedforward Networks (2020) Duncan, William
This dissertation addresses two aspects of dynamical systems arising from biological networks: homeostasis-bifurcation and identifiability.
Homeostasis occurs when a biological quantity does not change very much as a parameter is varied over a wide interval. Local bifurcation occurs when the multiplicity or stability of equilibria changes at a point. Both phenomena can occur simultaneously and as the result of a single mechanism. We show that this is the case in the feedback inhibition network motif. In addition we prove that longer feedback inhibition networks are less stable. Towards understanding interactions between homeostasis and bifurcations, we define a new type of singularity, the homeostasis-bifurcation point. Using singularity theory, the behavior of dynamical systems with homeostasis-bifurcation points is characterized. In particular, we show that multiple homeostatic plateaus separated by hysteretic switches and homeostatic limit cycle periods and amplitudes are common when these singularities occur.
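For orientation, the two ingredients can be written in a form that is common in the literature. The notation is an assumption, not taken from the abstract: $\dot{x} = F(x, \lambda)$ is the network dynamics, $x(\lambda)$ a branch of equilibria, and $x_o(\lambda)$ the observed coordinate along that branch.

```latex
% Assumed notation: F(x(\lambda), \lambda) = 0 along the equilibrium branch.
\begin{align*}
  \text{infinitesimal homeostasis at } \lambda_0 &: \quad
      \frac{d x_o}{d\lambda}(\lambda_0) = 0, \\
  \text{steady-state bifurcation at } \lambda_0 &: \quad
      \det D_x F\bigl(x(\lambda_0), \lambda_0\bigr) = 0 .
\end{align*}
```

The first condition makes $x_o$ flat to first order in $\lambda$. The second is one example of a local bifurcation condition. The precise definition of a homeostasis-bifurcation point is given in the dissertation and is not reproduced here.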
Identifiability asks whether it is possible to infer model parameters from measurements. We characterize the structural identifiability properties for feedforward networks with linear reaction rate kinetics. Interestingly, the set of reaction rates corresponding to the edges of the graph is identifiable, but the assignment of rates to edges is not: permutations of the reaction rates lead to the same measurements. We show how the identifiability results for linear kinetics can be extended to Michaelis-Menten kinetics using asymptotics.
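The permutation ambiguity can be seen in a small illustrative case. This is not the dissertation's general setting: it assumes a three-node linear chain x1 -> x2 -> x3 with rates `k1` and `k2`, all mass starting in x1, and only x3 measured. The function name and the numerical values are hypothetical.

```python
def simulate_chain_output(k1, k2, t_end=5.0, steps=5000, every=500):
    """Integrate the chain x1 -> x2 -> x3 with classical RK4.

    Dynamics (linear kinetics, assumed for illustration):
        x1' = -k1*x1,  x2' = k1*x1 - k2*x2,  x3' = k2*x2
    Returns samples of the measured node x3 taken every `every` steps.
    """
    def rhs(x):
        x1, x2, _ = x
        return (-k1 * x1, k1 * x1 - k2 * x2, k2 * x2)

    def shifted(x, slope, scale):
        return tuple(xi + scale * si for xi, si in zip(x, slope))

    h = t_end / steps
    x = (1.0, 0.0, 0.0)  # all mass starts in x1
    samples = []
    for i in range(steps):
        a = rhs(x)
        b = rhs(shifted(x, a, 0.5 * h))
        c = rhs(shifted(x, b, 0.5 * h))
        d = rhs(shifted(x, c, h))
        x = tuple(
            xi + h / 6.0 * (ai + 2 * bi + 2 * ci + di)
            for xi, ai, bi, ci, di in zip(x, a, b, c, d)
        )
        if (i + 1) % every == 0:
            samples.append(x[2])
    return samples


if __name__ == "__main__":
    original = simulate_chain_output(0.5, 2.0)
    swapped = simulate_chain_output(2.0, 0.5)    # same rates, edges exchanged
    different = simulate_chain_output(0.5, 3.0)  # a different set of rates
    print(max(abs(u - v) for u, v in zip(original, swapped)))
    print(max(abs(u - v) for u, v in zip(original, different)))
```

In this sketch, swapping `k1` and `k2` leaves the x3 trajectory unchanged up to integration error, while changing the set of rates alters it. The measurement therefore pins down the set {k1, k2} but not which edge carries which rate.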