dc.description.abstract |
<p>Decision tree algorithms have been among the most popular algorithms for interpretable
(transparent) machine learning since the early 1980's. The problem that has plagued
decision tree algorithms since their inception is their lack of optimality, or lack
of guarantees of closeness to optimality: decision tree algorithms are often greedy
or myopic, and sometimes produce unquestionably suboptimal models. Hardness of decision
tree optimization is both a theoretical and practical obstacle, and even careful mathematical
programming approaches have not been able to solve these problems efficiently. This
work introduces the first practical algorithm for optimal decision trees for binary
variables. The algorithm is a co-design of analytical bounds that reduce the search
space and modern systems techniques, including data structures and a custom bit-vector
library. We highlight possible steps to improving the scalability and speed of future
generations of this algorithm based on insights from our theory and experiments.</p>
|
|