Inferring protein conformational ensemble using deep learning and evolutionary couplings
Xue, Zhengyuan
Permalink
https://hdl.handle.net/2142/115600
Description
Title
Inferring protein conformational ensemble using deep learning and evolutionary couplings
Author(s)
Xue, Zhengyuan
Issue Date
2022-05-02
Director of Research (if dissertation) or Advisor (if thesis)
Shukla, Diwakar
Department of Study
School of Molecular & Cell Bio
Discipline
Biophysics & Quant Biology
Degree Granting Institution
University of Illinois at Urbana-Champaign
Degree Name
M.S.
Degree Level
Thesis
Keyword(s)
Protein Conformational Dynamics, Deep Learning
Abstract
Protein conformational dynamics are crucial for understanding protein structure-function relationships, but they are hard to sample and analyze because of their high-dimensional nature. Evolutionary couplings (ECs) have been shown to contain information about the structure and function of proteins and can therefore be used to predict residue contacts or the functional effects of mutations. Here we show that ECs can also be used to infer protein dynamics. We present dynamicEC, a deep learning model based on ResNet that takes multiple kinds of features to predict dynamic contact residue pairs. After being trained on a dataset of over 220,000 experimental protein structures, our model achieved over 95% overall accuracy and an area under the receiver operating characteristic curve (AUROC) of over 0.89 on the test set. By using the inter-residue distances between the predicted residue pairs to construct Markov State Models and computing the GMRQ score, we further show that the residue pairs predicted by our model can be used as reaction coordinates for efficient sampling of the protein conformational landscape and for analysis of molecular dynamics (MD) simulations. Our work shows that the evolutionary correlation between residues provides not only information on static structures but also insights into protein dynamics. It offers a cheap and fast way to guide enhanced sampling that requires little prior knowledge (only sequence information). The predicted residue pairs can serve as reaction coordinates and be used in MD data analysis, such as dimensionality reduction and understanding the slow motions of proteins. Overall, our work provides a data-driven method for characterizing the conformational ensemble of a protein from sequence data using deep learning.
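The abstract describes using inter-residue distances between predicted residue pairs as features for Markov State Models. The following is a minimal illustrative sketch of that general idea, not the thesis code: the function names `pair_distances` and `transition_matrix`, the array layout, and the toy data are all assumptions made here for illustration. It computes per-frame distances for a list of residue pairs and estimates a row-normalized transition matrix from an already discretized trajectory. GMRQ scoring and the clustering step that would normally produce the discrete states are not shown.

```python
import numpy as np


def pair_distances(coords, pairs):
    """Distances between selected residue pairs in every frame.

    coords: array of shape (n_frames, n_residues, 3), one representative
            position per residue (an assumed layout, e.g. C-alpha atoms).
    pairs:  sequence of (i, j) residue index pairs (hypothetical input,
            standing in for pairs predicted by a model).
    Returns an array of shape (n_frames, n_pairs).
    """
    coords = np.asarray(coords, dtype=float)
    idx = np.asarray(pairs, dtype=int)
    diff = coords[:, idx[:, 0], :] - coords[:, idx[:, 1], :]
    return np.linalg.norm(diff, axis=-1)


def transition_matrix(states, n_states, lag=1):
    """Row-normalized transition matrix from a discrete state trajectory.

    Counts transitions state[t] -> state[t + lag] and divides each row by
    its total. Rows for states that are never left remain all zeros.
    """
    states = np.asarray(states, dtype=int)
    counts = np.zeros((n_states, n_states))
    for a, b in zip(states[:-lag], states[lag:]):
        counts[a, b] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    return np.divide(counts, row_sums,
                     out=np.zeros_like(counts), where=row_sums > 0)


# Toy trajectory: residue 0 fixed at the origin, residue 1 moving along x.
separations = [3.0, 4.0, 10.0, 11.0]
coords = np.array([[[0.0, 0.0, 0.0], [d, 0.0, 0.0]] for d in separations])
dists = pair_distances(coords, [(0, 1)])

# Crude two-state discretization with an arbitrary cutoff (an assumption;
# a real analysis would typically cluster the distance features instead).
states = (dists[:, 0] > 8.0).astype(int)
T = transition_matrix(states, n_states=2, lag=1)
```

In this sketch the distance array plays the role of the feature matrix, so adding more residue pairs only adds columns to `dists` and leaves the rest of the pipeline unchanged.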