An Efficient Pseudo-likelihood Method for Sparse Binary Pairwise Markov Network Estimation
Abstract
The pseudo-likelihood method is one of the most popular algorithms for learning sparse binary pairwise Markov networks. In this paper, we formulate the $L_1$ regularized pseudo-likelihood problem as a sparse multiple logistic regression problem. In this way, many insights and optimization procedures for sparse logistic regression can be applied to the learning of discrete Markov networks. Specifically, we use the coordinate descent algorithm for generalized linear models with convex penalties, combined with strong screening rules, to solve the pseudo-likelihood problem with $L_1$ regularization. Therefore a substantial speedup without losing any accuracy can be achieved. Furthermore, this method is more stable than the node-wise logistic regression approach on unbalanced high-dimensional data when penalized by small regularization parameters. Thorough numerical experiments on simulated data and real world data demonstrate the advantages of the proposed method.
Type
Department
Description
Provenance
Citation
Permalink
Collections
Scholars@Duke
David Page
David Page works on algorithms for data mining and machine learning, as well as their applications to biomedical data, especially de-identified electronic health records and high-throughput genetic and other molecular data. Of particular interest are machine learning methods for complex multi-relational data (such as electronic health records or molecules as shown) and irregular temporal data, and methods that find causal relationships or produce human-interpretable output (such as the rules for molecular bioactivity shown in green to the side).
Unless otherwise indicated, scholarly articles published by Duke faculty members are made available here with a CC-BY-NC (Creative Commons Attribution Non-Commercial) license, as enabled by the Duke Open Access Policy. If you wish to use the materials in ways not already permitted under CC-BY-NC, please consult the copyright owner. Other materials are made available here through the author’s grant of a non-exclusive license to make their work openly accessible.