Managing Shared Resources in the Data Center Era
dc.contributor.advisor | Lee, Benjamin C | |
dc.contributor.author | Zahedi, Seyed Majid | |
dc.date.accessioned | 2018-05-31T21:13:47Z | |
dc.date.available | 2018-05-31T21:13:47Z | |
dc.date.issued | 2018 | |
dc.department | Computer Science | |
dc.description.abstract | To improve efficiency and amortize cost over more computation, resource sharing has become vital in high performance computing systems. In such systems, the conventional wisdom assumes that users have to share, regardless of the management policy. With a wide range of computing options available, this assumption does not seem to hold for today's self-interested users. These users selfishly pursue their individual performance without regard for others or the system. And if they dislike management outcomes, they will withdraw from the shared system. If they decide to share, they will try to game the management system by misreporting their resource demands to improve their performance, perhaps at the expense of others in the system. To address this challenge and study strategic behavior of self-interested users, game theory is known to be an effective tool. Drawing on game theory, this thesis encourages new thinking in designing management platforms robust to strategic behavior. In this thesis, we present five pieces of work on data center management platforms. First, with the democratization of cloud and datacenter computing, users increasingly share large hardware platforms. In this setting, architects encounter two challenges: sharing fairly and sharing multiple resources. Drawing on game theory, we rethink fairness in computer architecture. A fair allocation must provide sharing incentives (SI), envy-freeness (EF), and Pareto efficiency (PE). We show that Cobb-Douglas utility functions are well suited to modeling user preferences for cache capacity and memory bandwidth. Additionally we present an allocation mechanism that uses Cobb-Douglas preferences to determine each user's fair share of the hardware. This mechanism provably guarantees SI, EF, and PE, as well as strategy-proofness in the large (SPL). And it does so with modest performance penalties, less than 10% throughput loss, relative to an unfair mechanism. Second, computational sprinting is a class of mechanisms that boost performance but dissipate additional power. We describe a sprinting architecture in which many independent chip multiprocessors share a power supply and sprints are constrained by the chips' thermal limits and the rack's power limits. Moreover, we present the computational sprinting game, a multi-agent perspective on managing sprints. Strategic agents decide whether to sprint based on application phases and system conditions. The game produces an equilibrium that improves task throughput for data analytics workloads by 4-6x over prior greedy heuristics and performs within 90% of an upper bound on throughput from a globally optimized policy. Third, ensuring fairness in a system with scarce and commonly preferred resources requires time sharing. We consider a heterogeneous system with a few ``big'' and many ``small'' processors. We allocate heterogeneous processors using a novel token mechanism that supports game-theoretic notions of fairness such as sharing incentives and envy-freeness. The mechanism frames the allocation problem as a repeated game. In each round of the game, users request big processors and spend a token if their request is granted. We formulate game dynamics and optimize users' strategies to produce an equilibrium. Allocations from optimal strategies balance performance and fairness. Our token mechanism outperforms classical, fair mechanisms by 1.7x, on average, in total performance gains, and is competitive with a performance maximizing mechanism. Fourth, we present a processor allocation framework that uses Amdahl's Law to model parallel performance and a market mechanism to allocate cores. We propose the Amdahl utility function and demonstrate its accuracy when modeling performance from processor core allocations. We then design a market based on Amdahl utility and propose the Amdahl bidding procedure that optimizes users' bids for processors based on workload parallelizability. The framework uses entitlements to guarantee fairness yet outperforms existing proportional share algorithms. Finally, sharing computational resources amortizes cost and improves utilization and efficiency. When agents pool their resources together, each becomes entitled to a portion of the shared pool. Static allocations in each round can guarantee entitlements and are strategy-proof, but efficiency suffers because allocations do not reflect variations in agents' demands for resources across rounds. Dynamic allocation mechanisms assign resources to agents across multiple rounds while guaranteeing agents their entitlements. Designing dynamic mechanisms is challenging, however, when agents are strategic and can benefit by misreporting their demands for resources. The Amdahl bidding mechanism facilitates the trade in resources between users with static demands within a single management round. To facilitate the trade in resources between users with dynamic demands across multiple rounds, we propose two novel mechanisms. First, the T-period mechanism satisfies strategy-proofness and sharing incentives but with low efficiency. Second, the token mechanism satisfies strategy-proofness and guarantees at least a 50% approximation of sharing incentives, which means users receive at least half the utility they would have received by not participating in the mechanism. Through simulations on data gathered from Google clusters, we show that the performance of the token mechanism is comparable to that of state-of-the-art mechanisms that do not guarantee our game-theoretic properties. Further, although the token mechanism only guarantees a 50% approximation of sharing incentives, in practice, users receive at least 98\% of their sharing incentives guarantee. | |
dc.identifier.uri | ||
dc.subject | Computer science | |
dc.title | Managing Shared Resources in the Data Center Era | |
dc.type | Dissertation |
Files
Original bundle
- Name:
- Zahedi_duke_0066D_14486.pdf
- Size:
- 3.02 MB
- Format:
- Adobe Portable Document Format