Duke University Libraries
DukeSpace Scholarship by Duke Authors

Simplifying System Management Through Automated Forecasting, Diagnosis, and Configuration Tuning

Date
2010
Author
Duan, Songyun
Advisor
Babu, Shivnath
Abstract

Large-scale networked computing systems are widely deployed to run business-critical applications in environments where changes are frequent. Manual management of these complex systems can be tedious and error-prone. Meanwhile, the high costs of application downtime make it critical to ensure system availability and reliability. Recent progress in monitoring tools enables system administrators to collect fine-grained data about system activity with low overhead. This data provides valuable information for system management. However, the monitoring data collected from production systems is massive and noisy, which makes it hard for system administrators to fully utilize this data for effective system management.

This dissertation describes a data-management platform, called Fa, where system administrators can pose declarative queries over system monitoring data. Fa automatically finds fairly accurate and efficient execution plans for given queries, and returns query results in easy-to-interpret formats. Fa supports three key query types, namely, forecasting queries (for predicting or detecting performance problems), diagnosis queries (for finding the cause of performance problems), and tuning queries (for recommending changes to system configuration to resolve diagnosed problems):

(a) For processing diagnosis queries, Fa constructs problem signatures from system monitoring data to identify recurrent problems and to reuse past diagnostic information. For a rare or new problem, Fa employs an anomaly-based clustering technique to generate performance baselines and to characterize the deviation from baselines to pinpoint root causes. Fa also incorporates an active-learning component that identifies diagnosis queries whose results, if provided or confirmed by system administrators, can be used to update problem signatures and to improve the accuracy and efficiency of processing future queries.

(b) For processing tuning queries to resolve problems caused by system misconfiguration, Fa employs an adaptive sampling algorithm that plans experiments to efficiently identify high-impact configuration parameters and high-performance settings. These experiments bring in information that is required for generating accurate query results but is missing from the monitoring data collected so far.

(c) For both one-time and continuous forecasting queries, Fa automatically searches for efficient execution plans in a large space of plans composed of data-transformation operators as well as synopsis-learning and prediction operators. Forecasting queries can be composed with diagnosis and tuning queries to enable proactive system management that avoids potential problems.
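The reuse of problem signatures described in item (a) can be illustrated with a minimal sketch. This is not Fa's actual algorithm: it assumes, purely for illustration, that a signature is a fixed-length vector of normalized metric deviations and that a new observation is matched to the most similar stored signature by cosine similarity. The function names, the problem labels, the vectors, and the 0.9 threshold are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_signature(observed, known_signatures, threshold=0.9):
    """Return (label, score) of the best-matching stored signature.

    The label is None when no stored signature reaches the threshold,
    i.e. the observation is treated as a rare or new problem.
    """
    best_label, best_score = None, 0.0
    for label, signature in known_signatures.items():
        score = cosine_similarity(observed, signature)
        if score > best_score:
            best_label, best_score = label, score
    if best_score >= threshold:
        return best_label, best_score
    return None, best_score


# Hypothetical signatures: deviations of (CPU, lock waits, query latency).
known = {
    "db_lock_contention": [0.1, 0.9, 0.8],
    "app_server_cpu_saturation": [0.95, 0.1, 0.05],
}

label, score = match_signature([0.2, 0.85, 0.9], known)
```

In this sketch an observation that matches no stored signature comes back with a `None` label. In the abstract's terms, that is the case handed to the baseline and deviation analysis for rare or new problems.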

We have evaluated the Fa platform with monitoring data collected from database-backed multitier services, and with synthetic data that models the noisy nature of monitoring data from production systems. Our evaluation shows that Fa's query plan selection and execution strategies provide actionable information for system management automatically, accurately, and efficiently. Critical features like reliable confidence estimates, robustness to noise, and supporting evidence for query results make Fa a practical and useful platform.
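The experiment planning described in item (b) can be pictured with a small sketch. It is a simplified greedy stand-in, not the adaptive sampling algorithm from the dissertation: it first probes the extreme settings of each parameter to estimate impact, then spends the remaining experiment budget on the highest-impact parameters first. The parameter names, candidate values, and the `measure_latency_ms` cost function are hypothetical placeholders for real experiments on a running system.

```python
def measure_latency_ms(config):
    """Hypothetical stand-in for one real experiment; lower is better."""
    return (
        100.0
        + 0.0004 * (config["buffer_pool_mb"] - 768) ** 2
        + 0.2 * abs(config["max_connections"] - 150)
        + 0.001 * config["log_buffer_kb"]
    )


def tune(candidates, defaults, measure, budget):
    """Greedy two-phase search; returns (best_config, best_cost, impact_ranking)."""
    if budget < 1:
        raise ValueError("budget must allow at least one experiment")
    cache = {}  # one entry per distinct experiment actually run

    def cost_of(config):
        """Measured cost of config, or None once the budget is used up."""
        key = tuple(sorted(config.items()))
        if key not in cache:
            if len(cache) >= budget:
                return None
            cache[key] = measure(config)
        return cache[key]

    default_cost = cost_of(defaults)
    best, best_cost = dict(defaults), default_cost

    # Phase 1: probe the extreme settings of each parameter to estimate impact.
    impact = {}
    for name, values in candidates.items():
        costs = [default_cost]
        for value in (min(values), max(values)):
            trial = dict(defaults, **{name: value})
            cost = cost_of(trial)
            if cost is None:
                continue
            costs.append(cost)
            if cost < best_cost:
                best, best_cost = trial, cost
        impact[name] = max(costs) - min(costs)
    ranking = sorted(impact, key=impact.get, reverse=True)

    # Phase 2: refine parameters in order of estimated impact.
    for name in ranking:
        for value in candidates[name]:
            trial = dict(best, **{name: value})
            cost = cost_of(trial)
            if cost is None:
                return best, best_cost, ranking
            if cost < best_cost:
                best, best_cost = trial, cost
    return best, best_cost, ranking


candidates = {
    "buffer_pool_mb": [256, 512, 768, 1024],
    "max_connections": [50, 100, 150, 200],
    "log_buffer_kb": [64, 256, 1024],
}
defaults = {"buffer_pool_mb": 256, "max_connections": 100, "log_buffer_kb": 256}

best, best_cost, ranking = tune(candidates, defaults, measure_latency_ms, budget=20)
```

With a tight budget, the sketch stops after refining only the parameters ranked highest in phase 1. This mirrors, in a very reduced form, the idea of directing experiments toward high-impact configuration parameters.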

Type
Dissertation
Department
Computer Science
Subject
Computer Science
Permalink
https://hdl.handle.net/10161/2399
Citation
Duan, Songyun (2010). Simplifying System Management Through Automated Forecasting, Diagnosis, and Configuration Tuning. Dissertation, Duke University. Retrieved from https://hdl.handle.net/10161/2399.
Collections
  • Duke Dissertations
Creative Commons License
This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License.
