Lay summary

RNA viruses are characterized by short generation times and high mutation rates. Therefore, even over relatively short time spans, epidemiological processes leave a fingerprint in the genetic structure of viral sequences sampled from the host population. As a consequence, both epidemiological and evolutionary processes can be inferred from virus sequences. However, until now, epidemiological virus sequence data have been analyzed without assuming explicit epidemiological models, and therefore key epidemiological parameters could not be quantified.

The objective of this research proposal is to merge epidemiological and evolutionary models so that both epidemiological and evolutionary parameters can be estimated from viral sequence data. The methodology will be made available as part of the widely used BEAST and PAML software packages.

I will use the novel methodology for three applications:

First, I will investigate which treatment and intervention strategies had a significant impact on the epidemic spread of HIV in the past. The results have the potential to support the design of improved HIV intervention strategies in the future.

Second, I will address a key empirical paradox challenging our understanding of HIV evolutionary biology, namely the apparent incompatibility between the observed fast evolutionary rates obtained from within-host data and slow evolutionary rates obtained from between-host data.

Third, the methodology will be applied to a variety of viral epidemics in order to recover their specific epidemiological dynamics. I plan to study Influenza virus, Dengue virus, Hepatitis C virus and Rhinovirus, viruses for which enough data are available to perform phylogenetic analyses.