Squeezing the most benefit from network parallelism in datacenters
Ghorbani Khaledi, Soudeh
Permalink
https://hdl.handle.net/2142/95589
Description
- Title
- Squeezing the most benefit from network parallelism in datacenters
- Author(s)
- Ghorbani Khaledi, Soudeh
- Issue Date
- 2016-11-28
- Director of Research (if dissertation) or Advisor (if thesis)
- Godfrey, Brighten
- Doctoral Committee Chair(s)
- Godfrey, Brighten
- Committee Member(s)
- Rexford, Jennifer
- Vaidya, Nitin
- Gupta, Indranil
- Department of Study
- Computer Science
- Discipline
- Computer Science
- Degree Granting Institution
- University of Illinois at Urbana-Champaign
- Degree Name
- Ph.D.
- Degree Level
- Dissertation
- Keyword(s)
- Parallelism
- Causality
- Efficiency
- Load balancing
- Traffic engineering
- Stability
- Datacenters
- Replication
- One-big-switch
- Correctness
- Transparency
- Abstract
- One big non-blocking switch is one of the most powerful and pervasive abstractions in datacenter networking. As Moore's law begins to wane, scaling processing units out via parallelism, rather than scaling them up, is becoming increasingly common. The one-big-switch abstraction, for example, is typically implemented by leveraging massive parallelism behind the scenes. In particular, in today's datacenters, which exhibit a high degree of multi-pathing, each logical path between a communicating pair in the one-big-switch abstraction is mapped to a set of paths that can carry traffic in parallel. Similarly, each function of the one-big-switch abstraction, such as a firewall, is mapped to a set of distributed hardware and software switches. Efficiently using this pool of network connectivity and preserving the functional correctness of network functions, in spite of the parallelism, are both challenging. Balancing load efficiently among multiple paths is hard because microbursts, responsible for the majority of packet loss in datacenters today, usually last for only a few microseconds; even the fastest traffic engineering schemes have control loops that are several orders of magnitude slower (a few milliseconds to a few seconds) and are therefore ineffective against microbursts. Implementing network functions correctly in the face of parallelism is hard because the distributed set of elements that implement a one-big-switch abstraction in parallel inevitably have inconsistent states that may cause them to behave differently from one physical switch.
The first part of this thesis presents DRILL, a datacenter fabric for Clos networks that performs micro load balancing to distribute load as evenly as possible on microsecond timescales. To achieve this, DRILL makes packet-level decisions at each switch, using local queue occupancies and randomized algorithms to distribute load (a simplified sketch of this per-packet queue selection follows this record). Despite making per-packet forwarding decisions, DRILL keeps the degree of packet reordering low by enforcing tight control on queue occupancies. DRILL adapts to topological asymmetry (e.g., failures) in Clos networks by decomposing the network into symmetric components. Using a detailed switch hardware model, we simulate DRILL and show that it outperforms recent edge-based load balancers, particularly in tail latency under heavy load: for example, under 80% load it reduces the 99.99th percentile of flow completion times of Presto and CONGA by 32% and 35%, respectively. Finally, we analyze DRILL's stability and throughput efficiency.
In the second part, we focus on the correctness of the one-big-switch abstraction's implementation. We first show that naively using parallelism to scale networking elements can cause incorrect behavior; for example, an IDS that operates correctly as a single network element can erroneously and permanently block hosts when it is replicated. We then provide a system, COCONUT, for seamless scale-out of network forwarding elements; that is, an SDN application programmer can program against what functionally appears to be a single forwarding element, which may nevertheless be replicated behind the scenes. To do this, we identify the key property for seamless scale-out, weak causality, and guarantee it through a practical and scalable implementation of vector clocks in the data plane (a generic vector-clock sketch also follows this record). We build a prototype of COCONUT and experimentally demonstrate its correct behavior. We also show that its abstraction enables a more efficient implementation of seamless scale-out than a naive baseline.
Finally, reasoning about network behavior requires a new model that enables us to distinguish between observable and unobservable events. In the last part, we therefore present such a model, based on the Input/Output Automaton (IOA) framework, and formalize network behavior. Using this framework, we prove that COCONUT enables seamless scale-out of networking elements, i.e., the user-perceived behavior of any COCONUT element implemented with a distributed set of concurrent replicas is indistinguishable from its singleton implementation.
- Graduation Semester
- 2016-12
- Type of Resource
- text
- Permalink
- http://hdl.handle.net/2142/95589
- Copyright and License Information
- Copyright © 2016 by Soudeh Ghorbani. All rights reserved.
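The abstract describes DRILL's micro load balancing as per-packet forwarding decisions driven by local queue occupancies and a randomized algorithm. The sketch below is a minimal, hypothetical illustration of that idea in the power-of-d-choices style: each packet samples a few random candidate queues, also reconsiders the queue chosen for the previous packet (this memory of a prior best choice is an assumption of the sketch, not a detail stated in the abstract), and joins the least-occupied one. Names such as DrillLikeSwitch and forward_packet are illustrative only and do not come from the dissertation.

```python
import random
from typing import List

class DrillLikeSwitch:
    """Toy model of per-packet micro load balancing.

    For each packet, sample `d` random candidate output queues, add the
    queue remembered from the previous decision, and enqueue the packet
    on whichever candidate has the smallest local occupancy. Queue
    lengths are plain integers standing in for hardware occupancy counters.
    """

    def __init__(self, queue_lengths: List[int], d: int = 2):
        self.queues = queue_lengths      # local occupancy per candidate port
        self.d = d                       # number of random samples per packet
        self.last_best = 0               # memory of the previously chosen queue

    def forward_packet(self) -> int:
        # Sample d random queues and include the remembered best queue.
        candidates = random.sample(range(len(self.queues)), self.d)
        candidates.append(self.last_best)
        # Pick the candidate with the smallest local occupancy.
        best = min(candidates, key=lambda q: self.queues[q])
        self.queues[best] += 1           # the packet joins that queue
        self.last_best = best
        return best

# Example: four equal-cost uplinks, every decision made locally, per packet.
switch = DrillLikeSwitch(queue_lengths=[0, 0, 0, 0], d=2)
chosen = [switch.forward_packet() for _ in range(10)]
print("chosen uplinks:", chosen, "occupancies:", switch.queues)
```

Because every decision uses only state local to the switch, this kind of scheme can react on packet (microsecond) timescales, which is the contrast the abstract draws with millisecond-or-slower traffic engineering control loops.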
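The abstract also attributes COCONUT's weak-causality guarantee to a practical, scalable implementation of vector clocks in the data plane. The abstract does not describe that data-plane encoding, so the following is only a generic vector-clock sketch of the bookkeeping such a guarantee relies on: each replica increments its own entry on a local event, merges clocks it observes, and compares clocks with the standard happened-before test. The class, method names, and replica ids (VectorClock, happened_before, R1, R2) are hypothetical.

```python
from typing import Dict

class VectorClock:
    """Minimal vector clock over replica identifiers.

    A replica increments its own entry on a local state change and merges
    (element-wise max) any clock attached to an event it observes.
    `happened_before` is the usual partial order used to reason about
    causal relationships between replica states.
    """

    def __init__(self, replica_id: str, replicas):
        self.replica_id = replica_id
        self.clock: Dict[str, int] = {r: 0 for r in replicas}

    def local_event(self) -> None:
        self.clock[self.replica_id] += 1

    def merge(self, other: Dict[str, int]) -> None:
        for r, t in other.items():
            self.clock[r] = max(self.clock.get(r, 0), t)

    @staticmethod
    def happened_before(a: Dict[str, int], b: Dict[str, int]) -> bool:
        # a -> b iff a <= b element-wise and a != b.
        keys = set(a) | set(b)
        return all(a.get(r, 0) <= b.get(r, 0) for r in keys) and a != b

# Example: replica R1 installs a rule update; R2 merges R1's clock and can
# tell that a view carrying the all-zero clock is causally stale.
r1 = VectorClock("R1", ["R1", "R2"])
r2 = VectorClock("R2", ["R1", "R2"])
r1.local_event()                      # R1 applies a local rule update
r2.merge(r1.clock)                    # R2 observes R1's update
print(VectorClock.happened_before({"R1": 0, "R2": 0}, r2.clock))  # True: stale view
```

This only illustrates the causality test itself; how COCONUT encodes and carries such metadata efficiently in switch hardware is the subject of the dissertation, not of this sketch.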
Owning Collections
- Dissertations and Theses - Computer Science: Dissertations and Theses from the Dept. of Computer Science
- Graduate Dissertations and Theses at Illinois (PRIMARY): Graduate Theses and Dissertations at Illinois