PhD, Masters and Honours projects available to commence in 2016
- Missing data biomarker models for cancer
Finding biological markers for cancer is a primary goal of bioinformatics.
One technique is mass spectrometry, which produces protein signals from serum.
The resulting protein profiles are subject to several sources of bias and variability which
must be adjusted for in the analysis of the data. The protein profiles also suffer from missing data.
A large proportion of the missing data is due to the true biological signal falling below the machine
detection limit. This research project will study what missing data means from a statistical viewpoint,
the different types of missingness that arise in genomics and proteomics studies, and will ultimately create
a joint observed/missing data Bayesian biomarker model for gastric cancer. The research involves
both theoretical statistics and statistical computation, and will include multilevel models,
Bayesian model fitting and estimation, and the evaluation of model performance
using cross-validation on simulated and observed data.
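To illustrate why values missing below a detection limit cannot simply be dropped, here is a minimal sketch of a left-censored likelihood. It is not the joint Bayesian model the project will develop. It assumes, purely for illustration, that signals are normally distributed with known standard deviation, and every name, the simulated data, and the detection limit are hypothetical.

```python
import math
import random

def normal_logpdf(x, mu, sigma):
    """Log density of a normal distribution at x."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2 * math.pi)

def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2))))

def censored_loglik(observed, n_missing, limit, mu, sigma):
    """Observed values contribute their density; each missing value
    contributes the probability of falling below the detection limit."""
    ll = sum(normal_logpdf(x, mu, sigma) for x in observed)
    ll += n_missing * math.log(normal_cdf(limit, mu, sigma))
    return ll

def fit_mu(observed, n_missing, limit, sigma=1.0):
    """Crude grid search for the mean (sigma assumed known)."""
    grid = [i / 100 for i in range(-100, 101)]
    return max(grid, key=lambda mu: censored_loglik(observed, n_missing, limit, mu, sigma))

# Simulated signals with true mean 0; values below the limit go missing.
random.seed(1)
limit = -0.5
signal = [random.gauss(0.0, 1.0) for _ in range(2000)]
observed = [x for x in signal if x >= limit]
n_missing = len(signal) - len(observed)

naive_mean = sum(observed) / len(observed)   # ignores the missing values
mu_hat = fit_mu(observed, n_missing, limit)  # accounts for censoring
print(f"missing: {n_missing}, naive mean: {naive_mean:.2f}, censored estimate: {mu_hat:.2f}")
```

In this simulation the naive mean of the observed values is biased upwards, while the censored-likelihood estimate stays close to the true mean of zero.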
- Predicting sarcopenia in seniors
Sarcopenia is the loss of muscle mass and physical function observed with increasing age.
The gold standard for diagnosis is dual-energy X-ray absorptiometry (DXA), but this is not readily
available to patients in primary or aged care. Linear regression models using easily measured patient
variables such as age and sex have been compared to models based on DXA, but the prediction models
obtained so far demonstrate only modest sensitivity (the true positive rate) and specificity
(the true negative rate) for sarcopenia. The aim of this research project is to develop
statistical and machine learning models with superior predictive performance that can
be used by clinicians to accurately diagnose sarcopenia in patients. Optimal regression
models for cohort data from the North West Adelaide Health Study Cohort will be
compared with machine learning regression and classification methods, including
random forests and support vector machines. The research will be conducted
in collaboration with Professor Renuka Visvanathan and Dr Solomon Yu of the
Aged and Extended Care Services, The Queen Elizabeth Hospital.
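As a small illustration of the two performance measures named above, the following sketch computes sensitivity and specificity from true and predicted labels. The labels are invented toy values, not data from the cohort study, and the function name is hypothetical.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels coded 1/0."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    return tp / (tp + fn), tn / (tn + fp)

# Toy labels: 1 = sarcopenia, 0 = no sarcopenia (invented for illustration).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(sens, spec)  # 0.75 0.75
```

Any classifier that outputs a binary diagnosis, whether a thresholded regression model or a random forest, can be compared on these two numbers.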
- Big data survival analysis
Survival analysis is the analysis of time-to-event data.
Such data are subject to different types of incomplete observation,
the most common being right censoring. Increasingly in big data studies,
survival analysis is being used to analyse high-dimensional datasets in which,
for example, many thousands of gene sequences are available for each individual.
The aim of these studies is to use the genetic or molecular signatures to predict
the survival outcome for each individual. However, traditional statistical survival
methods are not designed to handle these situations, and alternative approaches,
particularly machine learning methods, promise better predictive power.
This research project will develop genetic algorithms, support vector machines and
random forest survival models to exploit genetic and molecular data in
survival analysis. The modelling and computational work will involve simulation
studies to evaluate model performance and prediction error.
Major applications will include predicting survival following chemotherapy
in patients with large B-cell lymphoma and modelling outcomes in patients
undergoing joint replacement surgery at the Royal Adelaide Hospital.
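To make right censoring concrete, here is a minimal sketch of the Kaplan-Meier estimator, a standard traditional method for such data. It is shown only as background and is not one of the machine learning models the project will develop. The times and event indicators are invented toy values, and the function name is hypothetical.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival curve.

    times:  follow-up time for each individual
    events: 1 if the event was observed, 0 if right-censored
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all individuals sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            surv *= 1.0 - deaths / n_at_risk
            curve.append((t, surv))
        # Censored individuals leave the risk set without changing the curve.
        n_at_risk -= removed
    return curve

# Toy data: five individuals, two of them right-censored (event = 0).
times = [2, 3, 3, 5, 8]
events = [1, 1, 0, 1, 0]
for t, s in kaplan_meier(times, events):
    print(t, round(s, 3))
```

For the toy data the estimated survival probabilities are roughly 0.8, 0.6 and 0.3 at times 2, 3 and 5. Censored individuals still count towards the risk set until the time they drop out, which is what distinguishes this from simply discarding incomplete observations.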