Risk prediction survival model utilising both omics data and clinical data
Yunwei Zhang (0), Jean Yang (0), Samuel Mueller (1), Graham Mann (2)
(0) School of Mathematics and Statistics and Charles Perkins Centre, Faculty of Science, The University of Sydney
(1) School of Mathematics and Statistics, Faculty of Science, The University of Sydney
(2) John Curtin School of Medical Research, Australian National University; Melanoma Institute Australia
Find me on Tues Nov 24th, 1:40-3pm AEDT in Remo, table 61
Abstract
The use of omics data, such as gene expression data, for disease diagnosis (e.g. cancer stage prediction) has become well established in recent years. However, across the literature, survival information has mostly been put aside: it is common to define a study-specific patient outcome by truncating the raw time information at some pre-defined time point. In addition, patients’ clinical features are not included in most studies. We therefore aim to utilise both the survival time information and the clinical information to build risk prediction models and to identify potentially significant clinical variables.
We use a melanoma data set in our study. Rather than fitting binary classification models, we incorporate the survival times by fitting survival models: a random survival forest (RSF) and a penalised Cox model. Pre-validation is used to summarise the omics data into a predictor that is then combined with the clinical information. We compare the performance of the survival models and the clinical variables identified by each model.
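As a rough illustration of the pre-validation idea, the sketch below computes one omics-based score per patient using a model fitted without that patient's own outcome. It is a minimal sketch under stated assumptions, not the procedure used in this study: the function name `prevalidate`, the `fit` and `predict` callables, and the simple fold assignment are all hypothetical, and in practice `fit` would be a survival model trained on the omics data.

```python
def prevalidate(omics, outcome, fit, predict, n_folds=5):
    """Return one pre-validated score per sample.

    Hypothetical helper: `fit(X, y)` returns a model and
    `predict(model, x)` returns a score for a single sample.
    Each sample is scored by a model trained on the other folds only.
    """
    if n_folds < 2:
        raise ValueError("n_folds must be at least 2")
    n = len(omics)
    scores = [None] * n
    for fold in range(n_folds):
        # Assumed fold assignment: sample i belongs to fold i % n_folds.
        test_idx = [i for i in range(n) if i % n_folds == fold]
        train_idx = [i for i in range(n) if i % n_folds != fold]
        model = fit([omics[i] for i in train_idx],
                    [outcome[i] for i in train_idx])
        for i in test_idx:
            scores[i] = predict(model, omics[i])
    return scores


def append_score(clinical, scores):
    """Add the pre-validated omics score as one extra clinical column."""
    return [list(row) + [s] for row, s in zip(clinical, scores)]
```

The combined rows returned by `append_score` could then be passed to a survival model, so that the omics information enters as a single covariate alongside the clinical variables.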
Our preliminary results show that the survival models achieve a considerable concordance index, which we use as a measure of their goodness of fit, and that some clinical variables are significant in predicting patients’ outcomes. Comparing the RSF and the penalised Cox model, the RSF performs slightly better. In conclusion, clinical and survival time information should be used together with omics data to enhance patients’ risk prediction.
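For readers unfamiliar with the concordance index, the following is a minimal sketch of Harrell's C for right-censored data. It is an illustration only and not necessarily the implementation used here; the function name and the simplified handling of tied survival times (such pairs are skipped) are assumptions.

```python
from itertools import combinations


def concordance_index(times, events, risks):
    """Harrell's C: share of comparable pairs in which the patient
    with the higher predicted risk has the shorter survival time.

    times  : observed follow-up times
    events : 1 if the event was observed, 0 if censored
    risks  : predicted risk scores (higher means worse prognosis)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Let a be the patient with the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # Skip tied times (simplification) and pairs whose earlier
        # time is censored, since their order is then unknown.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0
        elif risks[a] == risks[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to random ranking and 1.0 to a perfect ranking of patients by risk.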