Withdraw
Loading…
QR factorization over tunable processor grids
Hutter, Edward
Loading…
Permalink
https://hdl.handle.net/2142/97860
Description
- Title
- QR factorization over tunable processor grids
- Author(s)
- Hutter, Edward
- Contributor(s)
- Solomonik, Edgar
- Issue Date
- 2017-05
- Keyword(s)
- communication cost
- parallel numerical algorithms
- least squares problems
- QR factorization
- Abstract
- The increasing complexity of modern computer architectures has greatly influenced algorithm design. Algorithm performance on these architectures is now determined by the movement of data. Therefore, modern algorithms should prioritize minimizing communication. In this work, we present a new parallel QR factorization algorithm solved over a tunable processor grid in a distributed memory environment. The processor grid can be tuned between one and three dimensions, resulting in tradeoffs in the asymptotic costs of synchronization, horizontal bandwidth, flop count, and memory footprint. This parallel algorithm is the first to efficiently extend the Cholesky-QR2 algorithm to matrices with an arbitrary number of rows and columns. Along its critical path of execution on P processors, our tunable algorithm improves upon the horizontal bandwidth cost of the existing Cholesky-QR2 algorithm by up to a factor of c^2 when solved over a c x d x c processor grid subject to P = c^2 d and E[1,P^1/3]. The costs attained by our algorithm are asymptotically equivalent to state-of-the-art QR factorization algorithms that have yet to be implemented. We argue that ours achieves better practicality and flexibility while still attaining minimal communication.
- Type of Resource
- text
- Language
- en
- Permalink
- http://hdl.handle.net/2142/97860
Owning Collections
Senior Theses - Electrical and Computer Engineering PRIMARY
The best of ECE undergraduate researchManage Files
Loading…
Edit Collection Membership
Loading…
Edit Metadata
Loading…
Edit Properties
Loading…
Embargoes
Loading…