Impact of phenotype definition on genome-wide association signals: empirical evaluation in human immunodeficiency virus type 1 infection.

Type of publication Peer-reviewed
Publication form Original article (peer-reviewed)
Publication date 2011
Author Evangelou Evangelos, Fellay Jacques, Colombo Sara, Martinez-Picado Javier, Obel Niels, Goldstein David B, Telenti Amalio, Ioannidis John P A
Project Host evolutionary genomics of HIV-1 and other retroviruses

Journal American Journal of Epidemiology
Volume (Issue) 173(11)
Page(s) 1336-1342
DOI 10.1093/aje/kwr024


Discussion on improving the power of genome-wide association studies to identify candidate variants and genes is generally centered on issues of maximizing sample size; less attention is given to the role of phenotype definition and ascertainment. The authors used genome-wide data from patients infected with human immunodeficiency virus type 1 (HIV-1) to assess whether differences in type of population (622 seroconverters vs. 636 seroprevalent subjects) or the number of measurements available for defining the phenotype resulted in differences in the effect sizes of associations between single nucleotide polymorphisms and the phenotype, HIV-1 viral load at set point. The effect estimate for the top 100 single nucleotide polymorphisms was 0.092 (95% confidence interval: 0.074, 0.110) log10 viral load (log10 copies of HIV-1 per mL of blood) greater in seroconverters than in seroprevalent subjects. The difference was even larger when the authors focused on chromosome 6 variants (0.153 log10 viral load) or on variants that achieved genome-wide significance (0.232 log10 viral load). The estimates of the genetic effects tended to be slightly larger when more viral load measurements were available, particularly among seroconverters and for variants that achieved genome-wide significance. Differences in phenotype definition and ascertainment may affect the estimated magnitude of genetic effects and should be considered in optimizing power for discovering new associations.
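
The reported differences are on a log10 scale, so it may help to see what they imply on the original copies-per-mL scale and how precise the top-100 estimate is. The following is a minimal sketch, not the authors' analysis: it assumes the published 95% confidence interval is a symmetric normal-approximation (Wald) interval, and the helper names `se_from_ci` and `log10_to_fold` are hypothetical, introduced only for illustration. The input numbers are taken directly from the abstract.

```python
# Back-of-the-envelope reading of the effect differences quoted in the abstract.
# Assumption: the 95% CI is symmetric and based on a normal approximation.

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal distribution


def se_from_ci(lower, upper, z=Z_95):
    """Standard error implied by a symmetric normal-approximation CI."""
    return (upper - lower) / (2 * z)


def log10_to_fold(delta_log10):
    """Convert a difference in log10 units into a multiplicative fold change."""
    return 10 ** delta_log10


# Differences in effect size (seroconverters minus seroprevalent subjects),
# in log10 copies of HIV-1 per mL, as reported in the abstract.
reported = {
    "top 100 SNPs": 0.092,
    "chromosome 6 variants": 0.153,
    "genome-wide significant variants": 0.232,
}

# Only the top-100 estimate comes with a published confidence interval.
se_top100 = se_from_ci(0.074, 0.110)
folds = {label: log10_to_fold(delta) for label, delta in reported.items()}

print(f"implied SE for top 100 SNPs: {se_top100:.4f} log10 units")
for label, fold in folds.items():
    print(f"{label}: {reported[label]:.3f} log10 units, about {fold:.2f}-fold")
```

Under these assumptions, a difference in log10 units translates into a ratio of viral loads rather than an absolute difference, which is why even a shift of about 0.1 to 0.2 log10 units can matter when comparing effect sizes across study populations.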