Skip to main content
Duke University Libraries
DukeSpace Scholarship by Duke Authors
  • Login
  • Ask
  • Menu
  • Login
  • Ask a Librarian
  • Search & Find
  • Using the Library
  • Research Support
  • Course Support
  • Libraries
  • About
View Item 
  •   DukeSpace
  • Theses and Dissertations
  • Masters Theses
  • View Item
  •   DukeSpace
  • Theses and Dissertations
  • Masters Theses
  • View Item
JavaScript is disabled for your browser. Some features of this site may not work without it.

Generalized and Scalable Optimal Sparse Decision Trees

Thumbnail
View / Download
1.5 Mb
Date
2020
Author
Zhong, Chudi
Advisor
Rudin, Cynthia
Repository Usage Stats
373
views
439
downloads
Abstract

Decision tree optimization is notoriously difficult from a computational perspective but essential for the field of interpretable machine learning. Despite efforts over the past 40 years, only recently have optimization breakthroughs been made that have allowed practical algorithms to find \textit{optimal} decision trees. These new techniques have the potential to trigger a paradigm shift where it is possible to construct sparse decision trees to efficiently optimize a variety of objective functions without relying on greedy splitting and pruning heuristics that often lead to suboptimal solutions. The contribution in this work is to provide a general framework for decision tree optimization that addresses the two significant open problems in the area: treatment of imbalanced data and fully optimizing over continuous variables. We present techniques that produce optimal decision trees over a variety of objectives including F-score, AUC, and partial area under the ROC convex hull. We also introduce a scalable algorithm that produces provably optimal results in the presence of continuous variables and speeds up decision tree construction by several orders of magnitude relative to the state-of-the art.

Description
Master's thesis
Type
Master's thesis
Department
Statistical Science
Subject
Statistics
Computer science
Decision Trees
Interpretable Model
Operations Research
Optimization
Permalink
https://hdl.handle.net/10161/20774
Citation
Zhong, Chudi (2020). Generalized and Scalable Optimal Sparse Decision Trees. Master's thesis, Duke University. Retrieved from https://hdl.handle.net/10161/20774.
Collections
  • Masters Theses
More Info
Show full item record
Creative Commons License
This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License.

Rights for Collection: Masters Theses


Works are deposited here by their authors, and represent their research and opinions, not that of Duke University. Some materials and descriptions may include offensive content. More info

Make Your Work Available Here

How to Deposit

Browse

All of DukeSpaceCommunities & CollectionsAuthorsTitlesTypesBy Issue DateDepartmentsAffiliations of Duke Author(s)SubjectsBy Submit DateThis CollectionAuthorsTitlesTypesBy Issue DateDepartmentsAffiliations of Duke Author(s)SubjectsBy Submit Date

My Account

LoginRegister

Statistics

View Usage Statistics
Duke University Libraries

Contact Us

411 Chapel Drive
Durham, NC 27708
(919) 660-5870
Perkins Library Service Desk

Digital Repositories at Duke

  • Report a problem with the repositories
  • About digital repositories at Duke
  • Accessibility Policy
  • Deaccession and DMCA Takedown Policy

TwitterFacebookYouTubeFlickrInstagramBlogs

Sign Up for Our Newsletter
  • Re-use & Attribution / Privacy
  • Harmful Language Statement
  • Support the Libraries
Duke University