Browsing by Subject "resource management"
Item Open Access An Operating System Architecture for Networked Server Infrastructure (2007-12-14)
Irwin, David Emory
Collections of hardware components are the foundation of computation and consist of interconnections of different types of the same core elements: processors, disks, memory cards, I/O devices, and network links. Designing a system for managing collections of hardware is challenging because modern infrastructures (i) distribute resource control across multiple autonomous sites, (ii) operate diverse sets of hardware, and (iii) support a variety of programming models for developing and executing software services. An operating system is a software layer that manages hardware by coordinating its interaction with software. This thesis defines and evaluates an architecture for a networked operating system that manages collections of hardware in infrastructures spread across networks, such as the Internet.
The foundation of a networked operating system determines how software services share a common hardware platform. A fundamental property common to all forms of resource sharing is that software services, by definition, share hardware components and do not use them forever. A lease is a natural construct for restricting the use of a shared resource to a well-defined length of time. Our architecture employs a general neutrality principle, which states that a networked operating system should be policy-neutral, since only users and site administrators, and not operating system developers, know how to manage their software and hardware. Experience building, deploying, and using a prototype has led us to view neutrality as a guiding design principle. Our hypothesis is that an operating system architecture for infrastructure resource management that focuses narrowly on leasing control of hardware provides a foundation for multi-lateral resource negotiation, arbitration, and fault tolerance.
In evaluating our hypothesis we make the following contributions:
* Introduce a set of design principles for networked operating systems. The principles adapt and extend principles from node operating system design to a networked environment. We evaluate existing systems with respect to these principles, describe how they deviate from them, and explore how these deviations limit the capabilities of higher level software.
* Combine the idea of a reconfigurable data center with the Sharp framework for secure resource peering to demonstrate a prototype networked operating system capable of sharing aggregations of resources in infrastructures.
* Design, implement, and deploy the architecture using a single programming abstraction, the lease, and show how the lease abstraction embodies the design principles of a networked operating system.
* Show that leases are a foundational primitive for addressing arbitration in a networked operating system. Leasing currency defines a configurable tradeoff between proportional-share scheduling and a market economy, and also serves as a basis for implementing other forms of arbitration.
* Show how combining the use of leases for long-term resource management with state recovery mechanisms provides robustness to transient faults and failures in a loosely coupled distributed system that coordinates resource allocation.
* Evaluate the flexibility and performance of a prototype by managing aggregations of physical and virtual hardware present in modern data centers, and showing that the architecture could scale to manage thousands of machines.
* Present case studies of integrating multiple software services including the PlanetLab network testbed, the Plush distributed application manager, and the GridEngine batch scheduler, and leverage the architecture to prototype and evaluate Jaws, a new light-weight batch scheduler that instantiates one or more virtual machines per task.

Item Open Access Climate Change Is Here, but Who Is Paying for It? (2020-06-24)
Mulderrig, Conor; Profeta, Timothy; Thompson, Elizabeth
In an era of increased political polarization and decreased confidence in national institutions, many bold initiatives have stalled or met an untimely end. This abdication of federal responsibility is nowhere more evident than in the ongoing response to COVID-19. Strategic response to the pandemic has largely shifted from the White House to governors’ mansions. A renewed era of federalism is beginning to take shape, and it is important to consider potential ramifications in other pressing areas, specifically climate change. State and local governments have advanced their own initiatives on climate change when faced with a failure of federal leadership. States have launched climate strategies to combat this growing threat both individually and in coalition. However, in order to mitigate and develop resiliency to climate change, much more needs to be done. This analysis will look at one aspect of the challenge: understanding costs. Resource management for states with ever-thin operational budgets is already logistically daunting without the massive investment in preventative measures needed to meaningfully combat climate change. Where do they start? Do states across the board have the technical capacity to understand what impacts they have already been facing?
The resounding answer at this current moment is no.

Item Open Access Coordinating the Design and Management of Heterogeneous Datacenter Resources (2014)
Guevara, Marisabel Alejandra
Heterogeneous design presents an opportunity to improve energy efficiency but raises a challenge in management. Whereas prior work separates the two, we coordinate heterogeneous design and management. We present a market-based resource allocation mechanism that navigates the performance and power trade-offs of heterogeneous architectures. Given this management framework, we explore a design space of heterogeneous processors and show a 12x reduction in response time violations when equipping a datacenter with three processor types over a homogeneous system that consumes the same power. To better understand trade-offs in large heterogeneous design spaces, we explore dozens of design strategies and present a risk taxonomy that classifies the reasons why a deployed system may underperform relative to design targets. We propose design strategies that explicitly mitigate risk, such as a strategy that minimizes the coefficient of variation in performance. In our experiments, we find that risk-aware design accounts for more than 70% of the strategies that produce systems with the best service quality. We also present a new datacenter management mechanism that fairly allocates processors to latency-sensitive applications. Tasks express value for performance using sophisticated piecewise-linear utility functions. With fairness in market allocations, we show how datacenters can mitigate envy amongst latency-sensitive users. We quantify the price of fairness and detail efficiency-fairness trade-offs. Finally, we extend the market to fairly allocate heterogeneous processors.