PenPC: A two-step approach to estimate the skeletons of high-dimensional directed acyclic graphs.
| dc.contributor.author | Ha, Min Jin | |
| dc.contributor.author | Sun, Wei | |
| dc.contributor.author | Xie, Jichun | |
| dc.coverage.spatial | United States | |
| dc.date.accessioned | 2015-11-05T02:09:25Z | |
| dc.date.issued | 2016-03 | |
| dc.description.abstract | Estimation of the skeleton of a directed acyclic graph (DAG) is of great importance for understanding the underlying DAG, and causal effects can be assessed from the skeleton when the DAG is not identifiable. We propose a novel method named PenPC to estimate the skeleton of a high-dimensional DAG by a two-step approach. We first estimate the nonzero entries of a concentration matrix using penalized regression, and then fix the difference between the concentration matrix and the skeleton by evaluating a set of conditional independence hypotheses. For high-dimensional problems where the number of vertices p is on a polynomial or exponential scale of the sample size n, we study the asymptotic property of PenPC on two types of graphs: traditional random graphs where all the vertices have the same expected number of neighbors, and scale-free graphs where a few vertices may have a large number of neighbors. As illustrated by extensive simulations and applications on gene expression data of cancer patients, PenPC has higher sensitivity and specificity than the state-of-the-art method, the PC-stable algorithm. | |
| dc.identifier | ||
| dc.identifier.eissn | 1541-0420 | |
| dc.identifier.uri | ||
| dc.language | eng | |
| dc.publisher | Wiley | |
| dc.relation.ispartof | Biometrics | |
| dc.relation.isversionof | 10.1111/biom.12415 | |
| dc.subject | DAG | |
| dc.subject | High dimensional | |
| dc.subject | Log penalty | |
| dc.subject | PC-algorithm | |
| dc.subject | Penalized regression | |
| dc.subject | Skeleton | |
| dc.subject | Biomarkers, Tumor | |
| dc.subject | Breast Neoplasms | |
| dc.subject | Computer Simulation | |
| dc.subject | Data Interpretation, Statistical | |
| dc.subject | Female | |
| dc.subject | Gene Expression Profiling | |
| dc.subject | Genetic Markers | |
| dc.subject | Genetic Predisposition to Disease | |
| dc.subject | Humans | |
| dc.subject | Models, Statistical | |
| dc.subject | Neoplasm Proteins | |
| dc.subject | Prevalence | |
| dc.subject | Reproducibility of Results | |
| dc.subject | Risk Factors | |
| dc.subject | Sensitivity and Specificity | |
| dc.title | PenPC: A two-step approach to estimate the skeletons of high-dimensional directed acyclic graphs. | |
| dc.type | Journal article | |
| duke.contributor.orcid | Xie, Jichun|0000-0001-5905-6728 | |
| pubs.author-url | ||
| pubs.begin-page | 146 | |
| pubs.end-page | 155 | |
| pubs.issue | 1 | |
| pubs.organisational-group | Basic Science Departments | |
| pubs.organisational-group | Biostatistics & Bioinformatics | |
| pubs.organisational-group | Duke | |
| pubs.organisational-group | School of Medicine | |
| pubs.publication-status | Published | |
| pubs.volume | 72 |
Files
Original bundle
- Name: PenPC.pdf
- Size: 990.29 KB
- Format: Adobe Portable Document Format
- Description: Accepted version