codon model; hybrid parallelisation; Markov chain Monte Carlo; phylogenetics; maximum likelihood; high-performance computing; positive selection
Meyer Xavier, Chopard Bastien, Salamin Nicolas (2016), Accelerating Bayesian inference for evolutionary biology models, in Bioinformatics
Davydov Iakov I., Robinson-Rechavi Marc, Salamin Nicolas (2016), State aggregation for fast likelihood computations in molecular evolution, in Bioinformatics
Valle Mario, Schabauer Hannes, Pacher Christoph, Stockinger Heinz, Stamatakis Alexandros, Robinson-Rechavi Marc, Salamin Nicolas (2014), Optimization strategies for fast detection of positive selection on phylogenetic trees, in Bioinformatics, 30(8), 1129-1137.
The increase in available computing power is providing new opportunities in several areas of biology. Consequently, computational biology has grown in importance and is becoming a field of research in its own right. Phylogenetics has been at the forefront of the use of computational approaches, largely because of the difficulty of reconstructing phylogenetic trees from molecular data. There are, however, other complex computational challenges related to phylogenetic trees. One of them is the detection of positive, or Darwinian, selection along lineages and sites of protein sequences. There have been interesting recent advances in the development of models to detect episodic events of positive selection. However, these mathematical and statistical developments have resulted in models that are computationally difficult to implement efficiently, especially for the large-scale genomic data that are now available.

In this project, we would like to provide efficient computational solutions to estimate and test advanced models of protein evolution. Our goal is to combine the expertise of three different research groups to propose multiscale modelling approaches and hybrid implementations that have proven useful in the parallelisation of complex biological systems. Our project is composed of two parts:
- to develop novel algorithms to enable efficient parallelisation of the likelihood calculations of codon models
- to extend our approach to a Bayesian estimation of positive selection that can take full advantage of high-performance computing infrastructures

The project will also benefit from collaborations with groups currently developing computing libraries, and we will provide in return a platform for the easy implementation of innovative models of codon evolution that scales efficiently to tens or hundreds of thousands of genomes.