A comparative study of the effects of parallelization on ARM and Intel based platforms
Fellows, Kurt
Permalink
https://hdl.handle.net/2142/50710
Description
- Title
- A comparative study of the effects of parallelization on ARM and Intel based platforms
- Author(s)
- Fellows, Kurt
- Issue Date
- 2014-09-16
- Director of Research (if dissertation) or Advisor (if thesis)
- Torrellas, Josep
- Mitra, Sayan
- Department of Study
- Electrical & Computer Eng
- Discipline
- Electrical & Computer Engr
- Degree Granting Institution
- University of Illinois at Urbana-Champaign
- Degree Name
- M.S.
- Degree Level
- Thesis
- Keyword(s)
- Single Instruction Multiple Data (SIMD)
- Threading Building Blocks
- OpenMP
- NEON
- Streaming SIMD Extensions (SSE)
- Advanced Vector Extensions (AVX)
- Abstract
- With the enormous growth in popularity of mobile devices over the past decade, there has been a strong push in industry for chip designers and manufacturers to develop powerful yet energy-efficient processors. Increasing the parallelism available in hardware has proven to be an effective way to maintain and even improve performance while staying within a manageable power budget. Specialized hardware such as graphics processing units, multicore systems and vector units has made it possible to improve performance while maintaining energy efficiency. Such hardware provides large benefits to applications built on computationally intensive algorithms. Video stabilization, object detection and 3D gaming, to name a few, are excellent candidates for making use of this hardware, and they are only a few of the many computationally intensive applications found on mobile devices today. This work examines the effects of optimizations that use some of the hardware mentioned above on two different platforms. The first is an ARM-based development board and the second is an Intel-based Ultrabook. Similar optimizations are applied to two computer vision applications at two levels. The first set of optimizations was made at the thread level and included utilizing vector units and restructuring control flow to use the cache more effectively. The second set was made at the processor level and involved making use of the multiple cores on a chip with OpenMP and Threading Building Blocks. We evaluated the platforms on three metrics: throughput, energy per frame, and throughput per energy, a metric similar to the energy-delay product. After testing various combinations of the optimizations, we ultimately found the Intel-based Ultrabook to be the better choice of platform.
On the more memory-bound vision application, the best configuration on the Ultrabook had almost 4x the throughput of the ARM development board with 2x the energy efficiency. The results for the more compute-bound application were closer, with the Ultrabook’s best configuration achieving less than 3x the throughput of the development board and only about 1.5x the energy efficiency.
- Graduation Semester
- 2014-08
- Copyright and License Information
- Copyright 2014 Kurt Fellows
Owning Collections
Graduate Dissertations and Theses at Illinois PRIMARY
Graduate Theses and Dissertations at Illinois
Dissertations and Theses - Electrical and Computer Engineering
Dissertations and Theses in Electrical and Computer Engineering