An Informatics Toolkit for the Risk Assessment of Nano-scaled Materials and Contaminants
Date
2025
Abstract
A comprehensive assessment of the potential risks associated with nano-scaled materials requires researchers to marry traditional experimental and modeling methods with informatics tools. The combined effect of integrating traditional approaches with digital repositories and tools has already been realized in other fields, such as bioinformatics. Most notably, the Human Genome Project resulted in the creation of a new field, genomics, when teams were faced with the daunting task of analyzing millions of lines of genetic code. The result was revolutionary to humanity and to science. A major benefit for the Human Genome Project was that researchers were working toward a common goal. By contrast, the researchers who seek to understand the complex ways engineered and incidental nano-scaled materials interact with biological and environmental systems come from diverse fields, so developing informatics tools requires translation across methods and technical jargon. This dissertation describes the process of developing an informatics toolkit while also using such tools to create a more synchronous community. A common theme that runs throughout this work is determining how published data can be reused to address new and similar questions in an evolving field. Within five chapters of this work, a dataset containing over 35,000 datapoints was curated into the NanoInformatics Knowledge Commons (NIKC) database, which was uniquely designed to encapsulate the dynamic nature of nano-scaled materials within a static repository structure. I also describe the development of mapping methods that were used to standardize the NIKC curation process, and how these mapping methods are being integrated into data management plans.
The main goal of any toolkit is its potential use to the scientific community; therefore, this dissertation details the process of centering a community around data and suggests ways in which Findable, Accessible, Interoperable, and Reusable (FAIR) data can be generated at earlier stages of the experimental and informatics process. The last couple of chapters of this dissertation narrow their focus to one assay, attachment efficiency or alpha, which is commonly used in modeling the fate of nano-scaled materials and contaminants in the environment. The theme of reusing data persists, as a dataset of ~500 datapoints collated from the literature is used to train a random forest model and to develop a Bayesian network. The dataset is subsequently divided into subsets to examine particular segments; model performance decreases as each subset reduces the number of datapoints. Finally, I create an app in which users can fit hydrology data from the United States Geological Survey (USGS) and Environmental Protection Agency (EPA) to predict alpha values for future materials. The work expounded upon in this dissertation is a meaningful example of the ways diverse fields can take concordant steps forward in integrating informatics into the risk assessment of nano-scaled materials and contaminants. This work can be adopted and expanded upon by the environment, health, and safety (EHS) community for the next generation of materials and contaminants.
Citation
Amos, Jaleesia D. (2025). An Informatics Toolkit for the Risk Assessment of Nano-scaled Materials and Contaminants. Dissertation, Duke University. Retrieved from https://hdl.handle.net/10161/33321.
Except where otherwise noted, student scholarship that was shared on DukeSpace after 2009 is made available to the public under a Creative Commons Attribution / Non-commercial / No derivatives (CC-BY-NC-ND) license. All rights in student work shared on DukeSpace before 2009 remain with the author and/or their designee, whose permission may be required for reuse.
