Numerical Approximations; Evolutionary History; Natural Selection; Population Genetics; Human Genetics; Machine Learning
Ferrer-Admetlla Anna, Leuenberger Christoph, Jensen Jeffrey D, Wegmann Daniel (2016), An Approximate Markov Model for the Wright-Fisher Diffusion and Its Application to Time Series Data., in Genetics, 203(2), 831-46.
Hofmanová Zuzana, Kreutzer Susanne, Hellenthal Garrett, Sell Christian, Diekmann Yoan, Díez-Del-Molino David, van Dorp Lucy, López Saioa, Kousathanas Athanasios, Link Vivian, Kirsanow Karola, Cassidy Lara M, Martiniano Rui, Strobel Melanie, Scheu Amelie, Kotsakis Kostas, Halstead Paul, Shennan Stephen J, Bradley Daniel G, Currat Mathias, Veeramah Krishna R, Wegmann Daniel, Thomas Mark G, Papageorgopoulou Christina, Burger Joachim (2016), Early farmers from across Europe directly descended from Neolithic Aegeans., in Proceedings of the National Academy of Sciences of the United States of America, 113(25), 6886-91.
Broushaki Farnaz, Thomas Mark G, Link Vivian, López Saioa, van Dorp Lucy, Kirsanow Karola, Hofmanová Zuzana, Diekmann Yoan, Cassidy Lara M, Díez-del-Molino David, Kousathanas Athanasios, Sell Christian, Robson Harry K, Martiniano Rui, Blöcher Jens, Scheu Amelie, Kreutzer Susanne, Bollongino Ruth, Bradley Daniel G, Shennan Stephen, Veeramah Krishna R, Mashkour Marjan, Wegmann Daniel, Hellenthal Garrett, Burger Joachim (2016), Early Neolithic genomes from the eastern Fertile Crescent, in Science (New York, N.Y.), 353(6298), 499-503.
Kousathanas Athanasios, Leuenberger Christoph, Helfer Jonas, Quinodoz Mathieu, Foll Matthieu, Wegmann Daniel (2016), Likelihood-Free Inference in High-Dimensional Models., in Genetics, 203(2), 893-904.
Foll Matthieu, Poh Yu-Ping, Renzette Nicholas, Ferrer-Admetlla Anna, Bank Claudia, Shim Hyunjin, Malaspinas Anna-Sapfo, Ewing Gregory, Liu Ping, Wegmann Daniel, Caffrey Daniel R, Zeldovich Konstantin B, Bolon Daniel N, Wang Jennifer P, Kowalik Timothy F, Schiffer Celia A, Finberg Robert W, Jensen Jeffrey D (2014), Influenza virus drug resistance: a time-sampled population genetics perspective., in PLoS genetics, 10(2), 1004185-1004185.
Duchen-Bocangle Pablo, Leuenberger Christoph, Szilágyi Sándor M, Harmon Luke, Eastman Jonathan, Schweizer Manuel, Wegmann Daniel, Inference of evolutionary jumps in large phylogenies using Lévy processes, in Systematic Biology
Kousathanas Athanasios, Leuenberger Christoph, Link Vivian, Sell Christian, Burger Joachim, Wegmann Daniel, Inferring Heterozygosity from Ancient and Low Coverage Genomes., in Genetics
Detecting signatures of past selective events provides insights into the evolutionary history of a species by revealing adaptive events. The identification of molecular targets of selection in humans, for instance, pinpoints biologically relevant differences between us and other apes. In addition, inferences regarding selection provide important functional information by elucidating the interaction between genotype and phenotype. Since positions in the genome that are under selection must be functionally important, detecting signatures of selection has also been used extensively to identify functional regions or protein residues. Finally, inferring the molecular locations at which selection is acting may help us to predict responses to selective pressures in organisms such as viruses, which would revolutionize the management of pandemics and the development of drugs.
Unfortunately, demographic history is a major confounding factor when inferring past selective events. Indeed, neutrality tests are very sensitive to violations of the underlying assumption of constant population size, with false positive rates found to be as high as 90% after a severe population bottleneck. Current approaches to deal with this problem rely on the assumption that selection acts on only a few loci, while demography affects all loci equally. A two-step procedure has been proposed in which a set of neutral loci is used to calibrate a demographic model against which putatively selected loci are compared. However, recent evidence suggests that selection may be common in the genomes of many organisms, and a priori knowledge of the neutrality of markers is often difficult to obtain. As a result, such an approach relies on very strong assumptions regarding the pervasiveness of adaptive mutations and may hence suffer from high false negative rates. There is currently no approach to estimate demography and selection jointly.
However, recent advances in computational approaches offer new hope for tackling such an inference. A particularly promising approach is Approximate Bayesian Computation (ABC), a technique that sidesteps analytical likelihood calculations by using simulations. ABC has been used to infer a wide range of evolutionary scenarios such as population bottlenecks, population splits and migration, but also to distinguish between a classic selective sweep and recurrent selective events. Here we propose to develop new approaches to genuinely estimate demography and selection jointly. We will begin by developing an ABC framework to estimate demography and selection jointly from unlinked loci, and apply it to a variety of organisms with very different evolutionary histories. Since applying ABC to large-scale data sets is difficult, major new developments are needed to reach this goal. We propose to develop new ABC-MCMC algorithms with increased performance, to recycle simulations efficiently, and to extend recent approaches that hybridize ABC with traditional full-likelihood methods.
Next, we will develop new approaches to infer selection and demography genome-wide from a large set of linked loci. To achieve this, we will make extensive use of auto-regressive hidden Markov models (arHMM), an extension of the classic hidden Markov model that will allow us to take linkage between sites into account more accurately. We will first use this technique to extend an existing approach to estimate parameters of an island model along with locus-specific strengths of selection. Besides inferring the distribution of selective effects genome-wide, such a model will also be readily applicable to genome-wide association studies by treating cases and controls as separate, yet related populations. Finally, we will attempt to include models allowing population size changes (e.g. bottlenecks) in the proposed arHMM by approximating emission probabilities using simulations, similar to the ABC framework.
The proposed innovations will allow us to work towards answering some of the most controversial questions in evolutionary biology, namely the importance of adaptation in shaping genomic variation. We will approach these questions by inferring genome-wide selection coefficients of four organisms representing various selective and demographic histories: Drosophila melanogaster, humans and the human cytomegalovirus. These estimates will not only have broad implications for the development of new drugs, but will also greatly improve our understanding of the mode and tempo of adaptation.
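As an illustration of how ABC replaces likelihood calculations with simulations, the following is a minimal sketch of plain ABC rejection sampling, not the ABC-MCMC algorithms proposed in this project. It assumes a toy haploid Wright-Fisher model with a single selection coefficient, a uniform prior on that coefficient, and the final allele frequency as the only summary statistic. All function names, parameter values and the prior range are hypothetical choices made for the example.

```python
import random


def simulate_wf(n_e, s, p0, generations, rng):
    """Simulate the final allele frequency under a toy haploid
    Wright-Fisher model with selection coefficient s (assumed model)."""
    p = p0
    for _ in range(generations):
        # Deterministic change in frequency due to selection.
        p_sel = p * (1.0 + s) / (p * (1.0 + s) + (1.0 - p))
        # Genetic drift: binomial sampling of n_e individuals.
        count = sum(rng.random() < p_sel for _ in range(n_e))
        p = count / n_e
    return p


def abc_rejection(observed, n_sims, tolerance, rng,
                  n_e=100, p0=0.5, generations=10,
                  prior_low=-0.2, prior_high=0.2):
    """Return prior draws of s whose simulated final frequency lies
    within `tolerance` of the observed one (ABC rejection sampling)."""
    accepted = []
    for _ in range(n_sims):
        s = rng.uniform(prior_low, prior_high)  # draw from the prior
        simulated = simulate_wf(n_e, s, p0, generations, rng)
        if abs(simulated - observed) <= tolerance:
            accepted.append(s)  # keep draws that reproduce the data
    return accepted


if __name__ == "__main__":
    rng = random.Random(1)
    # Hypothetical observation: frequency rose from 0.5 to 0.8.
    posterior = abc_rejection(observed=0.8, n_sims=2000,
                              tolerance=0.05, rng=rng)
    if posterior:
        mean_s = sum(posterior) / len(posterior)
        print(f"accepted {len(posterior)} draws, mean s = {mean_s:.3f}")
```

The accepted draws approximate the posterior distribution of the selection coefficient given the observed frequency. In this toy setting the cost grows with the number of simulations and the population size, which hints at why more efficient algorithms are needed for large data sets.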