Survival analysis for lung cancer patients
Leu, Anh
Permalink
https://hdl.handle.net/2142/110853
Description
- Title
- Survival analysis for lung cancer patients
- Author(s)
- Leu, Anh
- Issue Date
- 2021-04-27
- Director of Research (if dissertation) or Advisor (if thesis)
- Do, Minh
- Department of Study
- Electrical & Computer Eng
- Discipline
- Electrical & Computer Engr
- Degree Granting Institution
- University of Illinois at Urbana-Champaign
- Degree Name
- M.S.
- Degree Level
- Thesis
- Keyword(s)
- survival analysis
- lung cancer
- lung cancer prognosis
- Cox proportional hazards
- Kaplan-Meier estimator
- random survival forests
- machine learning
- Abstract
- Cancer is one of the leading causes of death. Lung cancer, in particular, is the leading cause of cancer death in both men and women, accounting for 23% of all cancer deaths in 2019 according to the Centers for Disease Control and Prevention. Lung cancer usually has a poor prognosis, with a five-year survival rate of only 21% according to the SEER Cancer Statistics Review, 1975-2017. With such a deadly disease, it is crucial to predict the survival likelihood of cancer patients. However, this is not an easy task because many factors affect disease progression. This thesis is based on the existing National Lung Screening Trial (NLST) dataset and provides an in-depth analysis of the features influencing lung cancer prognosis. We added nodule annotations to the NLST dataset and extracted radiomic features from each nodule. Using the newly acquired radiomic features, coupled with the existing clinical data from the original NLST dataset, we examined different prognostic models to predict the event of death by lung cancer from the first low-dose computed tomography (LDCT) scan. The model using both clinical and radiomic features performs better than the models using only the clinical information, signifying the importance of the additional radiomic features. The best model using clinical input alone reaches a concordance index of 0.589, whereas the best model using a combination of clinical and radiomic features reaches 0.657. We rigorously cross-examined the relationships among features and the models for each feature type, using data analysis and survival analysis models. For each feature type, we used one representative survival analysis model from semi-parametric methods (Cox proportional hazards model), one from non-parametric methods (Kaplan-Meier estimator), and one from machine learning approaches (random survival forests).
Using the results obtained from these different methods, we identified the combinations of feature type and model that give the top performance for various follow-up periods. The best model is random survival forests with a combination of clinical and radiomic features as input. Starting roughly 330 days after the first scan, the combination model maintains a 30-day mean cumulative/dynamic area under the receiver operating characteristic curve of approximately 0.8 for about one year, peaking at 0.839 at 810 days after the first scan.
- Graduation Semester
- 2021-05
- Type of Resource
- Thesis
- Copyright and License Information
- copyright Anh Leu 2021
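The abstract compares models by concordance index (0.589 with clinical features alone, 0.657 with clinical and radiomic features). As a rough illustration of what that metric measures, the following Python sketch computes Harrell's concordance index on a small made-up dataset. It is not the thesis code: the function name, the toy values, and the simplified handling of tied times are assumptions made only for illustration.

```python
from typing import Sequence


def concordance_index(
    times: Sequence[float],
    events: Sequence[bool],
    risk_scores: Sequence[float],
) -> float:
    """Harrell's C: share of comparable pairs that the risk scores order correctly.

    A pair (i, j) is comparable when subject i has an observed event and a
    strictly shorter time than subject j. Pairs with equal times are skipped
    (a simplification). Tied risk scores count as half concordant.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot be the "earlier event" in a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0  # higher risk had the earlier event
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: follow-up time in days, event flag (True = death
# observed, False = censored), and a model's predicted risk score.
toy_times = [5.0, 10.0, 15.0, 20.0]
toy_events = [True, True, False, True]
toy_risks = [0.9, 0.7, 0.4, 0.1]

c_index = concordance_index(toy_times, toy_events, toy_risks)
print(f"concordance index: {c_index:.3f}")
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which gives a scale for reading the 0.589 and 0.657 figures above.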
Owning Collections
- Graduate Dissertations and Theses at Illinois (primary)
- Dissertations and Theses - Electrical and Computer Engineering