Browsing by Subject "Distributed systems"
Item Open Access: Improving Congestion Control Convergence in RDMA Networks (2022) by Snyder, John

Remote Direct Memory Access (RDMA) networks are becoming a popular interconnect technology for high-performance communication in distributed systems. While RDMA hardware enables high-bandwidth, low-latency networking, networks require congestion control algorithms to operate efficiently. Ideally, a congestion control algorithm allows computers in the network to inject enough traffic to keep the network fully utilized, prevents computers from causing congestion, and allocates bandwidth fairly to all computers in the network. This lets the network operate at peak performance and treat all users fairly. While many protocols eventually converge to this ideal network state, they often take too long to do so, reducing performance. We develop mechanisms and protocols that improve convergence time.
In this thesis, we identify several ways in which slow convergence to the ideal network state harms performance and leaves RDMA networks susceptible to performance isolation attacks. We show that slow convergence to fair injection rates on end-hosts in RDMA networks greatly increases the communication time for long flows, which leads to sub-optimal application performance. We identify why unfairness occurs and measure its impact on application performance. We then show that because RDMA networks are lossless and flows start sending packets at line rate, users can unfairly gain more bandwidth and sometimes ignore congestion control altogether. This allows misbehaving users to harm the performance of other users sharing the network.
To improve long flow performance in RDMA networks, we propose two new mechanisms for Additive-Increase Multiplicative-Decrease protocols: 1) Variable Additive Increase and 2) Sampling Frequency. These mechanisms reduce the time it takes for the network to allocate bandwidth fairly between end-hosts, so long flows are not starved of bandwidth. To create these mechanisms, we determine when unfairness occurs and how end-hosts can infer unfairness without any additional information from switches.
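To make the idea concrete, here is a minimal sketch of an AIMD rate-update loop with a variable additive-increase step; the constants, the fair-share estimate, and the names are hypothetical illustrations, not the thesis's actual protocol:

    # Hypothetical AIMD loop with Variable Additive Increase: flows that
    # infer they are below their fair share ramp up faster, shrinking the
    # window during which long flows are starved. Illustrative only.
    LINE_RATE = 100.0  # link speed in Gbps (assumed)

    def update_rate(rate, congestion_seen, fair_share_estimate,
                    base_step=0.1, max_step=5.0, decrease_factor=0.5):
        if congestion_seen:
            return rate * decrease_factor          # multiplicative decrease
        # Variable additive increase: the step grows with the flow's
        # deficit relative to its (end-host inferred) fair share.
        deficit = max(fair_share_estimate - rate, 0.0)
        step = min(base_step + deficit, max_step)
        return min(rate + step, LINE_RATE)

On this reading, Sampling Frequency would govern how often an update like this runs, letting end-hosts react more quickly once they infer unfairness.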
We then introduce One Round Trip Time Convergence (1RC) and a new method of setting flow weights, which improve performance and isolation in RDMA networks. 1RC enables a network to converge to fair rates during the first RTT by dropping packets when a flow uses too much bandwidth. We do this while maintaining packet ordering and fairness between flows that start sending packets at the same time. We then use a new weighting scheme, which decreases the bandwidth allocation to users opening too many connections. A lower weight for misbehaving users mitigates the impact of a user trying to gain more bandwidth than is fair.
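The effect of such a weighting scheme can be sketched as follows; the inverse-per-connection weight is an assumption chosen to illustrate why opening extra connections stops paying off, not the thesis's exact scheme:

    # Illustrative weighting: each user's flows split a single unit of
    # weight, so opening more connections cannot raise the user's
    # aggregate share of the link.
    def flow_weight(user_connection_count):
        return 1.0 / max(user_connection_count, 1)

    def per_user_share(user_flows, link_capacity):
        # user_flows maps user -> number of open flows.
        total_weight = sum(n * flow_weight(n) for n in user_flows.values())
        return {user: link_capacity * n * flow_weight(n) / total_weight
                for user, n in user_flows.items()}

Under this weighting, a user with ten flows and a user with one flow receive the same aggregate share, removing the incentive to grab bandwidth by opening many connections.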
Finally, we introduce the Collective Congestion Control Protocol (3CPO), which improves convergence in multicast networks designed for collective communication. Multicast operations send a single packet to several destinations at once and can quickly cause severe congestion in the network. 3CPO manages multicast congestion by inferring global congestion state from multicast operations and then tunes each end-host's injection rate to be near optimal. 3CPO requires no extra information from switches and works entirely on end-hosts.
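One way to picture the end-host side is a feedback loop that estimates global congestion from recent multicast operations; everything below (the congestion signal, the gain, the target) is a hypothetical sketch rather than 3CPO's actual algorithm:

    # Hypothetical rate tuner: treat the fraction of recent multicast
    # operations that observed congestion as a global congestion estimate,
    # then nudge the injection rate toward a target congestion level.
    def tune_injection_rate(rate, marked_ops, total_ops,
                            target=0.05, gain=0.2, line_rate=100.0):
        observed = marked_ops / max(total_ops, 1)
        error = target - observed              # positive => room to grow
        new_rate = rate * (1.0 + gain * error)
        return min(max(new_rate, 0.1), line_rate)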
Item Open Access: SAFE: A Declarative Trust-Agile System with Linked Credentials (2016) by Thummala, Vamsidhar

Secure Access For Everyone (SAFE) is an integrated system for managing trust
using a logic-based declarative language. Logical trust systems authorize each
request by constructing a proof from a context---a set of authenticated logic
statements representing credentials and policies issued by various principals
in a networked system. A key barrier to practical use of logical trust systems
is the problem of managing proof contexts: identifying, validating, and
assembling the credentials and policies that are relevant to each trust
decision.
SAFE addresses this challenge by (i) proposing a distributed authenticated data
repository for storing the credentials and policies, and (ii) introducing a
programmable credential discovery and assembly layer that generates the
appropriate tailored context for a given request. The authenticated data
repository is built upon a scalable key-value store with its contents named by
secure identifiers and certified by the issuing principal. The SAFE language
provides scripting primitives to generate and organize logic sets representing
credentials and policies, materialize the logic sets as certificates, and link
them to reflect delegation patterns in the application. The authorizer fetches
the logic sets on demand, then validates and caches them locally for further
use. Upon each request, the authorizer constructs the tailored proof context
and provides it to the SAFE inference engine for certified validation.
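The linked-logic-set idea can be sketched with a toy data model; the naming-by-hash scheme and the in-memory store below are simplified assumptions for illustration, not SAFE's real API:

    # Toy model of linked logic sets: each set of statements is stored in
    # a shared key-value store under a content-derived identifier and may
    # link to further sets; the authorizer assembles a proof context by
    # following links transitively. Illustrative only.
    import hashlib

    STORE = {}  # stands in for the distributed authenticated repository

    def publish(issuer, statements, links=()):
        token = hashlib.sha256(
            (issuer + "".join(statements) + "".join(links)).encode()
        ).hexdigest()
        STORE[token] = {"issuer": issuer,
                        "statements": list(statements),
                        "links": list(links)}
        return token

    def assemble_context(token, seen=None):
        seen = set() if seen is None else seen
        if token in seen:            # avoid cycles in delegation chains
            return []
        seen.add(token)
        logic_set = STORE[token]     # fetched on demand; cacheable
        statements = list(logic_set["statements"])
        for link in logic_set["links"]:
            statements += assemble_context(link, seen)
        return statements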
Delegation-driven credential linking with certified data distribution provides
flexible and dynamic policy control, enabling security and trust infrastructure
to be agile while addressing the perennial problems of today's
certificate infrastructure: automated credential discovery, scalable
revocation, and issuing credentials without relying on a centralized authority.
We envision SAFE as a new foundation for building secure network systems. We
used SAFE to build secure services based on case studies drawn from practice:
(i) a secure name service resolver similar to DNS that resolves a name across
multi-domain federated systems; (ii) a secure proxy shim to delegate access
control decisions in a key-value store; (iii) an authorization module for a
networked infrastructure-as-a-service system with a federated trust structure
(NSF GENI initiative); and (iv) a secure cooperative data analytics service
that adheres to individual secrecy constraints while disclosing the data. We
present empirical evaluation based on these case studies and demonstrate that
SAFE supports a wide range of applications with low overhead.
Item Open Access: System Support for Strong Accountability (2009) by Yumerefendi, Aydan Rafet

Computer systems not only provide unprecedented efficiency and
numerous benefits, but also offer powerful means and tools for
abuse. This reality is increasingly evident as deployed software
spans trust domains and enables the interactions of
self-interested participants with potentially conflicting goals. With
systems growing more complex and interdependent, there is a growing
need to localize, identify, and isolate faults and unfaithful behavior.
Conventional techniques for building secure systems, such as secure
perimeters and Byzantine fault tolerance, are insufficient to ensure
that trusted users and software components are indeed
trustworthy. Secure perimeters do not work across trust domains and fail
when a participant acts within the limits of the existing security
policy and deliberately manipulates the system to her own
advantage. Byzantine fault tolerance offers techniques to tolerate
misbehavior, but offers no protection when replicas collude or are
under the control of a single entity.
Complex interdependent systems necessitate new mechanisms that
complement the existing solutions to identify improper behavior and
actions, limit the propagation of incorrect information, and assign
responsibility when things go wrong. This thesis
addresses the problems of misbehavior and abuse by offering tools and
techniques to integrate accountability into computer systems. A
system is accountable if it offers means to identify and expose
semantic misbehavior by its participants. An accountable system
can construct undeniable evidence to demonstrate its correctness---the
evidence serves as explicit proof of misbehavior and can be strong enough
to be used as a basis for social sanction external to the
system.
Accountability offers strong disincentives for abuse and
misbehavior but may have to be "designed in" to an application's
specific protocols, logic, and internal representation; achieving
accountability using general techniques is a challenge. Extending
responsibility to end users for actions performed by software
components on their behalf is not trivial, as it requires an ability
to determine whether a component correctly represents a
user's intentions. Leaks of private information are yet another
concern---even correctly functioning
applications can leak sensitive information, for which their owners
may be accountable. Important infrastructure services, such as
distributed virtual resource economies, raise a range of application-specific
issues, such as fine-grained resource delegation, virtual
currency models, and complex workflows.
This thesis work addresses the aforementioned problems by designing,
implementing, applying, and evaluating a generic methodology for
integrating accountability into network services and applications. Our
state-based approach decouples application state management from
application logic to enable services to demonstrate that they maintain
their state in compliance with user requests, i.e., state changes do take
place, and the service presents a consistent view to all clients and
observers. Internal state managed in this way can then be used to feed
application-specific verifiers to determine the correctness of the service's
logic and to identify the responsible party. The state-based approach
provides support for strong accountability---any detected violation
can be proven to a third party without depending on replication and
voting.
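As a rough illustration of state-based accountability, consider a hash-chained record of state changes; this sketch assumes a simple append-only log and omits the signatures and application-specific verifiers the thesis describes:

    # Tamper-evident state log (illustrative): every state change extends
    # a hash chain, so the service can later produce undeniable evidence
    # of the exact sequence of changes it applied.
    import hashlib, json

    class AccountableLog:
        def __init__(self):
            self.records = []
            self.head = "genesis"

        def apply(self, request, state_digest):
            record = {"request": request, "state": state_digest,
                      "prev": self.head}
            self.records.append(record)
            self.head = hashlib.sha256(
                json.dumps(record, sort_keys=True).encode()).hexdigest()
            return self.head  # in a real system, signed and sent to clients

        def verify(self):
            head = "genesis"
            for record in self.records:
                if record["prev"] != head:
                    return False          # provable inconsistency
                head = hashlib.sha256(
                    json.dumps(record, sort_keys=True).encode()).hexdigest()
            return head == self.head

Because any client holding a signed head can replay the chain, a detected violation is provable to a third party without replication or voting.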
In addition to the generic state-based approach, this thesis explores how
to leverage application-specific knowledge to integrate accountability in
an example application. We study the invariants and accountability
requirements of an example application---a lease-based virtual resource
economy. We present the design and implementation of several key elements
needed to provide accountability in the system. In particular, we describe
solutions to the problems of resource delegation, currency spending, and
lease protocol compliance. These solutions illustrate a complementary
technique to the general-purpose state-based approach, developed in the
earlier parts of this thesis.
Separating the actions of software and its user is at the heart of the
third component of this dissertation. We design, implement, and evaluate
an approach to detect information leaks in a commodity operating system.
Our novel OS abstraction---a doppelganger process---helps track
information flow without requiring application rewrite or instrumentation.
Doppelganger processes help identify sensitive data as they are about to
leave the confines of the system. Users can then be alerted about the
potential breach and can choose to prevent the leak to avoid becoming
accountable for the actions of software acting on their behalf.
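The doppelganger idea can be caricatured in a few lines; the scrubbing function and the comparison point are assumptions made for illustration, not the OS-level mechanism the thesis builds:

    # Caricature of a doppelganger check: rerun the computation on
    # scrubbed inputs and compare outputs where data is about to leave
    # the system; divergence suggests the outgoing data depends on
    # sensitive input, so the user can be warned before the leak.
    def scrub(record, sensitive_fields):
        return {key: ("x" * len(str(value)) if key in sensitive_fields
                      else value)
                for key, value in record.items()}

    def possible_leak(compute, record, sensitive_fields):
        real_output = compute(record)
        shadow_output = compute(scrub(record, sensitive_fields))
        return real_output != shadow_output  # diverged => alert the user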
Item Open Access: Towards Energy-Efficient Mobile Sensing: Architectures and Frameworks for Heterogeneous Sensing and Computing (2016) by Fan, Songchun

Modern sensing apps require continuous and intense computation on data streams. Unfortunately, mobile devices are failing to keep pace despite advances in hardware capability. In contrast to powerful systems-on-chip, which evolve rapidly, battery capacities grow only slowly. This hinders the potential of long-running, compute-intensive sensing services such as image/audio processing, motion tracking, and health monitoring, especially on small, wearable devices.
In this thesis, we present three pieces of work that aim to improve the energy efficiency of mobile sensing. (1) In the first work, we study heterogeneous mobile processors that dynamically switch between high-performance and low-power cores according to tasks' performance requirements. We benchmark interactive mobile workloads and quantify the energy improvement of different microarchitectures. (2) Realizing that today's users often carry more than one mobile device, in the second work, we extend the resource boundary of individual devices by prototyping a distributed framework that coordinates multiple devices. When devices share common sensing goals, the framework schedules sensing and computing tasks according to the devices' heterogeneity, improving performance and latency for compute-intensive sensing apps. (3) In the third work, we study the power breakdown of motion sensing apps on wearable devices and show that traditional offloading schemes cannot mitigate sensing's high energy costs. We design a framework that allows the phone to take over sensing and computation by predicting the wearable's sensory data when the motions of the two devices are highly correlated. This allows the wearable to offload without communicating raw sensing data, resulting in little performance loss but significant energy savings.
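A toy version of the handoff test in the third work might look like this; the Pearson-correlation gate and the threshold are assumptions used for illustration, not the thesis's prediction model:

    # Toy handoff gate: the phone takes over sensing only when its recent
    # motion trace correlates strongly with the wearable's, i.e., when the
    # phone's sensors can stand in for the wearable's data stream.
    def pearson(xs, ys):
        n = len(xs)
        mean_x, mean_y = sum(xs) / n, sum(ys) / n
        cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        sx = sum((x - mean_x) ** 2 for x in xs) ** 0.5
        sy = sum((y - mean_y) ** 2 for y in ys) ** 0.5
        return cov / (sx * sy) if sx and sy else 0.0

    def phone_takes_over(phone_accel, wearable_accel, threshold=0.9):
        # Compare equal-length windows of recent accelerometer samples.
        if len(phone_accel) != len(wearable_accel) or len(phone_accel) < 2:
            return False
        return pearson(phone_accel, wearable_accel) >= threshold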