TensorRT inference performance study in MLModelScope
Tang, Jingning
Permalink
https://hdl.handle.net/2142/108566
Description
Title
TensorRT inference performance study in MLModelScope
Author(s)
Tang, Jingning
Issue Date
2020-06-23
Director of Research (if dissertation) or Advisor (if thesis)
Hwu, Wen-Mei
Department of Study
Electrical & Computer Eng
Discipline
Electrical & Computer Engr
Degree Granting Institution
University of Illinois at Urbana-Champaign
Degree Name
M.S.
Degree Level
Thesis
Keyword(s)
deep learning
inference
TensorRT
MLModelScope
Abstract
As deep learning is adopted across domains, the inference process is increasingly important for deployment on multiple computing platforms. Among the many deep learning frameworks that support freezing and deploying trained models, NVIDIA TensorRT is the leading framework developed exclusively for inference. It allows developers to optimize a model for high-performance inference. While it has been shown extensively that TensorRT can significantly boost inference performance, a quantitative study is lacking on how its assorted optimization strategies improve inference relative to other well-known deep learning frameworks such as TensorFlow. This thesis presents such a study, consisting of experiments that run TensorRT on MLModelScope, a deep learning inference platform that enables standardized inference and multi-level profiling.
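At its simplest level, the kind of inference performance measurement the abstract describes reduces to timing repeated inference runs after a warm-up phase. The sketch below is a minimal, framework-agnostic illustration in plain Python; the harness and the stand-in "model" are assumptions for illustration, not MLModelScope's or TensorRT's actual API.

```python
import statistics
import time

def benchmark(run_inference, warmup=5, iters=50):
    """Time repeated calls to a model's inference function.

    run_inference: zero-argument callable performing one inference.
    Returns (mean_ms, p99_ms) over `iters` timed runs, taken after
    `warmup` untimed runs (warm-up hides one-time costs such as lazy
    initialization or, on a GPU, CUDA context creation).
    """
    for _ in range(warmup):
        run_inference()
    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        run_inference()
        samples.append((time.perf_counter() - start) * 1000.0)
    samples.sort()
    mean_ms = statistics.mean(samples)
    p99_ms = samples[min(len(samples) - 1, int(0.99 * len(samples)))]
    return mean_ms, p99_ms

if __name__ == "__main__":
    # Stand-in workload: a fixed-size matrix multiply in pure Python.
    n = 64
    a = [[1.0] * n for _ in range(n)]
    def dummy_inference():
        return [[sum(a[i][k] * a[k][j] for k in range(n))
                 for j in range(n)] for i in range(n)]
    mean_ms, p99_ms = benchmark(dummy_inference, warmup=2, iters=10)
    print(f"mean {mean_ms:.2f} ms, p99 {p99_ms:.2f} ms")
```

Reporting a tail percentile alongside the mean matters for inference studies, since outlier latencies (first-run compilation, cache misses) are exactly what warm-up and repeated sampling are meant to expose.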