Comparison of non-parametric methods for ungrouping coarsely aggregated data.
Abstract
BACKGROUND: Histograms are a common tool to estimate densities non-parametrically.
They are extensively encountered in health sciences to summarize data in a compact
format. Examples are age-specific distributions of death or onset of diseases grouped
in 5-year age classes with an open-ended age group at the highest ages. When histogram
intervals are too coarse, information is lost and comparison between histograms with
different boundaries is arduous. In these cases it is useful to estimate detailed
distributions from grouped data. METHODS: From an extensive literature search we identify
five methods for ungrouping count data. We compare the performance of two spline interpolation
methods, two kernel density estimators and a penalized composite link model first
via a simulation study and then with empirical data obtained from the NORDCAN Database.
All methods analyzed can be used to estimate differently shaped distributions, can
handle unequal interval lengths, and allow stretches of 0 counts. RESULTS: The methods
show similar performance when the grouping scheme is relatively narrow, i.e. 5-year
age classes. With coarser age intervals, i.e. in the presence of open-ended age groups,
the penalized composite link model performs best. CONCLUSION: We give an overview
of, and test, different methods for estimating detailed distributions from grouped count data.
Health researchers can benefit from these versatile methods, which are ready for use
in the statistical software R. We recommend using the penalized composite link model
when data are grouped in wide age classes.
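
To make the idea of ungrouping concrete, the following is a minimal Python sketch of one generic approach: interpolate the cumulative counts at the interval boundaries with a monotone cubic (Fritsch-Carlson tangents) and difference the curve at single ages. It is an illustration only, not the implementation compared in the article (the abstract refers to R) and not the penalized composite link model. The function names `monotone_slopes`, `hermite_eval` and `ungroup` are hypothetical, the counts are invented, and closing the open-ended group "85+" at age 105 is an assumption made here.

```python
from bisect import bisect_right


def monotone_slopes(x, y):
    """Tangents at the knots for a monotone cubic Hermite interpolant
    (Fritsch-Carlson limiting of the tangent magnitudes)."""
    n = len(x)
    h = [x[i + 1] - x[i] for i in range(n - 1)]
    d = [(y[i + 1] - y[i]) / h[i] for i in range(n - 1)]  # secant slopes
    m = [0.0] * n
    m[0], m[-1] = d[0], d[-1]
    for i in range(1, n - 1):
        # flat tangent wherever the neighbouring secants are zero or change sign
        m[i] = 0.0 if d[i - 1] * d[i] <= 0 else (d[i - 1] + d[i]) / 2
    for i in range(n - 1):
        if d[i] == 0:
            m[i] = m[i + 1] = 0.0
            continue
        a, b = m[i] / d[i], m[i + 1] / d[i]
        s = a * a + b * b
        if s > 9:  # shrink tangents back into the monotone region
            t = 3 / s ** 0.5
            m[i], m[i + 1] = t * a * d[i], t * b * d[i]
    return m


def hermite_eval(x, y, m, t):
    """Evaluate the cubic Hermite interpolant at t (x[0] <= t <= x[-1])."""
    i = min(max(bisect_right(x, t) - 1, 0), len(x) - 2)
    h = x[i + 1] - x[i]
    s = (t - x[i]) / h
    h00 = (1 + 2 * s) * (1 - s) ** 2
    h10 = s * (1 - s) ** 2
    h01 = s * s * (3 - 2 * s)
    h11 = s * s * (s - 1)
    return h00 * y[i] + h10 * h * m[i] + h01 * y[i + 1] + h11 * h * m[i + 1]


def ungroup(breaks, counts):
    """Split grouped counts into single-year estimates.

    breaks: integer interval boundaries, len(counts) + 1 values, increasing.
    counts: non-negative count for each interval [breaks[j], breaks[j+1]).
    """
    cum = [0.0]
    for c in counts:
        cum.append(cum[-1] + c)  # cumulative count at each boundary
    m = monotone_slopes(breaks, cum)
    grid = range(breaks[0], breaks[-1] + 1)
    F = [hermite_eval(breaks, cum, m, a) for a in grid]
    ages = list(range(breaks[0], breaks[-1]))
    return ages, [F[k + 1] - F[k] for k in range(len(ages))]


# Invented counts for ages 65-69, 70-74, 75-79, 80-84 and 85+,
# with the open-ended group closed at an assumed maximum age of 105.
breaks = [65, 70, 75, 80, 85, 105]
counts = [12, 30, 45, 25, 8]
ages, est = ungroup(breaks, counts)
for age, value in zip(ages, est):
    print(age, round(value, 2))
```

Because the interpolant passes through the cumulative counts at every boundary, the single-year estimates add up to the original count within each group, and the monotonicity constraint keeps them non-negative. How the open-ended group is closed strongly affects the estimates at the highest ages, which is the situation where the abstract reports the penalized composite link model performing best.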
Type
Journal article
Permalink
https://hdl.handle.net/10161/14647
Published Version (Please cite this version)
10.1186/s12874-016-0157-8
Publication Info
Rizzi, Silvia; Thinggaard, Mikael; Engholm, Gerda; Christensen, Niels; Johannesen, Tom Børge; Vaupel, James W; & Lindahl-Jacobsen, Rune (2016). Comparison of non-parametric methods for ungrouping coarsely aggregated data. BMC Med Res Methodol, 16. pp. 59. 10.1186/s12874-016-0157-8. Retrieved from https://hdl.handle.net/10161/14647.
This is constructed from limited available data and may be imprecise. To cite this article, please review & use the official citation provided by the journal.
Scholars@Duke
James Walton Vaupel
Research Professor Emeritus in the Sanford School of Public Policy
This author no longer has a Scholars@Duke profile, so the information shown here reflects
their Duke status at the time this item was deposited.

Articles written by Duke faculty are made available through the campus open access policy.
Rights for Collection: Scholarly Articles
Works are deposited here by their authors, and represent their research and opinions, not those of Duke University. Some materials and descriptions may include offensive content.