Browsing by Subject "Cache Coherence"
- Results Per Page
- Sort Options
Item Open Access A Formal Framework for Designing Verifiable Protocols(2017) Matthews, OpeoluwaProtocols play critical roles in computer systems today, including managing resources, facilitating communication, and coordinating actions of components. It is highly desirable to formally verify protocols, to provide a mathematical guarantee that they behave correctly. Ideally, one would pass a model of a protocol into a formal verification tool, push a button, and the tool uncovers bugs or certifies that the protocol behaves correctly. Unfortunately, as a result of the state explosion problem, automated formal verification tools struggle to verify the increasingly complex protocols that appear in computer systems today.
We observe that design decisions have a significant impact on the scalability of verifying a protocol in a formal verification tool. Hence, we present a formal framework that guides architects in designing protocols specifically to be verifiable with state of the art formal verification tools. If architects design protocols to fit our framework, the protocols inherit scalable automated verification. Key to our framework is a modular approach to constructing protocols from pre-verified subprotocols. We formulate properties that can be proven in automated tools to hold on these subprotocols, guaranteeing that any arbitrary composition of the subprotocols behaves correctly. The result is that we can design complex hierarchical (tree) protocols that are formally verified, using fully automated tools, for any number of nodes or any configuration of the tree. Our framework is applicable to a large class of protocols, including power management, cache coherence, and distributed lock management protocols.
To demonstrate the efficacy of our framework, we design and verify a realistic hierarchical (tree) coherence protocol, using a fully automated tool to prove that it behaves correctly for any configuration of the tree. We identify certain protocol optimizations prohibited by our framework or the state of the art verification tools and we evaluate our verified protocol against unverifiable protocols that feature these optimizations. We find that these optimizations have a negligible impact on performance. We hope that our framework can be used to design a wide variety of protocols that are verifiable, high-performing, and architecturally flexible.
Item Open Access Exploiting Parallelism in GPUs(2014) Hechtman, Blake AlanHeterogeneous processors with accelerators provide an opportunity to improve performance within a given power budget.
Many of these heterogeneous processors contain Graphics Processing Units (GPUs) that can perform graphics and embarrassingly parallel computation orders of magnitude faster than a CPU while using less energy. Beyond these obvious applications for GPUs, a larger variety of applications can benefit from a GPU's large computation and memory bandwidth. However, many of these applications are irregular and, as a result, require synchronization and scheduling that are commonly believed to perform poorly on GPUs. The basic building block of synchronization and scheduling is memory consistency, which is, therefore, the first place to look for improving performance on irregular applications. In this thesis, we approach the programmability of irregular applications on GPUs by thinking across traditional boundaries of the compute stack. We think about architecture, microarchitecture and runtime systems from the programmers perspective. To this end, we study architectural memory consistency on future GPUs with cache coherence. In addition, we design a GPU memory system
microarchitecture that can support fine-grain and coarse-grain synchronization without sacrificing throughput. Finally, we develop a task runtime that embraces the GPU microarchitecture to perform well
on fork/join parallelism desired by many programmers. Overall, this thesis contributes non-intuitive solutions to improve the performance and programmability of irregular applications from the programmer's perspective.
Item Open Access Scalably Verifiable Cache Coherence(2013) Zhang, MengThe correctness of a cache coherence protocol is crucial to the system since a subtle bug in the protocol may lead to disastrous consequences. However, the verification of a cache coherence protocol is never an easy task due to the complexity of the protocol. Moreover, as more and more cores are compressed into a single chip, there is an urge for the cache coherence protocol to have higher performance, lower power consumption, and less storage overhead. People perform various optimizations to meet these goals, which unfortunately, further exacerbate the verification problem. The current situation is that there are no efficient and universal methods for verifying a realistic cache coherence protocol for a many-core system.
We, as architects, believe that we can alleviate the verification problem by changing the traditional design paradigm. We suggest taking verifiability as a first-class design constraint, just as we do with other traditional metrics, such as performance, power consumption, and area overhead. To do this, we need to incorporate verification effort in the early design stage of a cache coherence protocol and make wise design decisions regarding the verifiability. Such a protocol will be amenable to verification and easier to be verified in a later stage. Specifically, we propose two methods in this thesis for designing scalably verifiable cache coherence protocols.
The first method is Fractal Coherence, targeting verifiable hierarchical protocols. Fractal Coherence leverages the fractal idea to design a cache coherence protocol. The self-similarity of the fractal enables the inductive verification of the protocol. Such a verification process is independent of the number of nodes and thus is scalable. We also design example protocols to show that Fractal Coherence protocols can attain comparable performance compared to a traditional snooping or directory protocol.
As a system scales hierarchically, Fractal Coherence can perfectly solve the verification problem of the implemented cache coherence protocol. However, Fractal Coherence cannot help if the system scales horizontally. Therefore, we propose the second method, PVCoherence, targeting verifiable flat protocols. PVCoherence is based on parametric verification, a widely used method for verifying the coherence of a flat protocol with infinite number of nodes. PVCoherence captures the fundamental requirements and limitations of parametric verification and proposes a set of guidelines for designing cache coherence protocols that are compatible with parametric verification. As long as designers follow these guidelines, their protocols can be easily verified.
We further show that Fractal Coherence and PVCoherence can also facilitate the verification of memory consistency, another extremely challenging problem. One piece of previous work proves that the verification of memory consistency can be decomposed into three steps. The most complex and non-scalable step is the verification of the cache coherence protocol. If we design the protocol following the design methodology of Fractal Coherence or PVCoherence, we can easily verify the cache coherence protocol and overcome the biggest obstacle in the verification of memory consistency.
As system expands and cache coherence protocols get more complex, the verification problem of the protocol becomes more prominent. We believe it is time to reconsider the traditional design flow in which verification is totally separated from the design stage. We show that by incorporating the verifiability in the early design stage and designing protocols to be scalably verifiable in the first place, we can greatly reduce the burden of verification. Meanwhile, we perform various experiments and show that we do not lose benefits in performance as well as in other metrics when we obtain the correctness guarantee.