Development of a web service for analysis in a distributed network.

Jiang, X
Wu, Y
Marsolo, K
Ohno-Machado, L

OBJECTIVE: We describe functional specifications and practicalities in the software development process for a web service that allows the construction of a multivariate logistic regression model, Grid Logistic Regression (GLORE), by aggregating partial estimates from distributed sites, with no exchange of patient-level data.

BACKGROUND: We recently developed and published a web service for model construction and data analysis in a distributed environment. That paper provided an overview of the system that is useful for users, but included very few details relevant to biomedical informatics developers or network security personnel who may be interested in implementing this or similar systems. We focus here on how the system was conceived and implemented.

METHODS: We followed a two-stage development approach, first implementing the backbone system and then incrementally improving the user experience through interactions with potential users during development. Our system went through several stages: proof of concept, algorithm validation, user interface development, and system testing. We used the Zoho Project management system to track tasks and milestones. We used Google Code and Apache Subversion to share code among team members, and developed an applet-servlet architecture to support cross-platform deployment.

DISCUSSION: During the development process, we encountered challenges such as Information Technology (IT) infrastructure gaps and limited team experience in user-interface design. We identified solutions, as well as enabling factors, that supported the translation of an innovative privacy-preserving, distributed modeling technology into a working prototype.

CONCLUSION: Using GLORE (a distributed model that we developed earlier) as a pilot example, we demonstrated the feasibility of building and integrating distributed modeling technology into a usable framework that can support privacy-preserving, distributed data analysis among researchers at geographically dispersed institutes.
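
The idea of "aggregating partial estimates from distributed sites, with no exchange of patient-level data" can be illustrated with a minimal sketch. This is not the authors' implementation (the abstract describes an applet-servlet web service and gives no code). It assumes that each site shares only its gradient and Hessian contributions to a Newton-Raphson step, and that a coordinator sums them and updates the coefficients. All function names and the toy data below are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Vector = List[float]
Matrix = List[List[float]]
Site = Tuple[Matrix, Vector]  # (covariate rows, 0/1 outcomes) held locally


def site_summaries(X: Matrix, y: Vector, beta: Vector) -> Tuple[Vector, Matrix]:
    """Run at one site: return its gradient and Hessian contributions.

    Only these aggregates leave the site; the rows of X and y do not.
    """
    d = len(beta)
    grad = [0.0] * d
    hess = [[0.0] * d for _ in range(d)]
    for row, label in zip(X, y):
        z = sum(b * x for b, x in zip(beta, row))
        p = 1.0 / (1.0 + math.exp(-z))   # predicted probability
        w = p * (1.0 - p)                # logistic variance weight
        for i in range(d):
            grad[i] += row[i] * (label - p)
            for j in range(d):
                hess[i][j] += w * row[i] * row[j]
    return grad, hess


def solve(A: Matrix, b: Vector) -> Vector:
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [A[i][:] + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_distributed(sites: Sequence[Site], d: int,
                    iters: int = 25, tol: float = 1e-10) -> Vector:
    """Coordinator: sum per-site summaries and take Newton-Raphson steps."""
    beta = [0.0] * d
    for _ in range(iters):
        grad = [0.0] * d
        hess = [[0.0] * d for _ in range(d)]
        for X, y in sites:
            g, h = site_summaries(X, y, beta)
            for i in range(d):
                grad[i] += g[i]
                for j in range(d):
                    hess[i][j] += h[i][j]
        step = solve(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    return beta


# Toy data: first column is the intercept term, second is one covariate.
site_a: Site = ([[1.0, 0.5], [1.0, 1.5], [1.0, 2.5], [1.0, 3.5]],
                [0.0, 0.0, 1.0, 0.0])
site_b: Site = ([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 4.0]],
                [0.0, 1.0, 0.0, 1.0])
beta_distributed = fit_distributed([site_a, site_b], d=2)
```

Because the gradient and Hessian of the logistic log-likelihood are sums over records, adding the per-site summaries gives the same Newton-Raphson update as fitting on the pooled data, up to floating-point rounding. A real deployment would also need secure transport, authentication, and handling of separable or ill-conditioned data, none of which this sketch covers.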








Publication Info

Jiang, X, Y Wu, K Marsolo and L Ohno-Machado (2014). Development of a web service for analysis in a distributed network. EGEMS (Wash DC), 2(1), p. 1053. doi:10.13063/2327-9214.1053

This citation is constructed from limited available data and may be imprecise. To cite this article, please review and use the official citation provided by the journal.



Yuan Wu

Associate Professor in Biostatistics & Bioinformatics

Survival analysis, Sequential clinical trial design, Machine learning, Causal inference, Non/Semi-parametric method, Statistical computing


Keith Allen Marsolo

Professor in Population Health Sciences

Dr. Marsolo is a faculty member in the Department of Population Health Sciences (DPHS) and a member of the Duke Clinical Research Institute (DCRI).  His current research focuses on infrastructure to support the use of electronic health records (EHRs) and other real-world data sources in observational and comparative effectiveness research and public health surveillance, as well as standards and architectures for multi-center learning health systems.  He serves as faculty advisor to the DPHS DataShare Shared Facility and faculty lead for the Pragmatic Health Services Research (PHSR) functional group within the DCRI.  Dr. Marsolo received his PhD in Computer Science from The Ohio State University, with a dissertation on data mining, specifically the modeling and classification of biomedical data. 

Prior to joining DPHS, Dr. Marsolo was an Associate Professor in the Division of Biomedical Informatics (BMI) at Cincinnati Children’s Hospital Medical Center (CCHMC). While at CCHMC, Dr. Marsolo served as faculty advisor for BMI Data Services, a shared facility that supported distributed data sharing networks and also developed registry platforms to support learning networks. These included a configurable system for capturing summary or practice-level measures, and a “data-in-once” architecture that allowed information to be collected in the EHR and then automatically transferred to a registry to support chronic care management, quality improvement and research.

Area of Expertise: Informatics, Data Quality, Common Data Models, Data Standards and Data Harmonization

Unless otherwise indicated, scholarly articles published by Duke faculty members are made available here with a CC-BY-NC (Creative Commons Attribution Non-Commercial) license, as enabled by the Duke Open Access Policy. If you wish to use the materials in ways not already permitted under CC-BY-NC, please consult the copyright owner. Other materials are made available here through the author’s grant of a non-exclusive license to make their work openly accessible.