aging; dementia; statistical analysis; magnetic resonance imaging; machine learning; heterogeneity
Yang Zhijian, Nasrallah Ilya M., Shou Haochang, Wen Junhao, Doshi Jimit, Habes Mohamad, Erus Guray, Abdulkadir Ahmed, Resnick Susan M., Albert Marilyn S., Maruff Paul, Fripp Jurgen, Morris John C., Wolk David A., Davatzikos Christos, Fan Yong, Bashyam Vishnu, Mamouiran Elizabeth, Melhem Randa, Pomponio Raymond, Sahoo Dushyant, Ashish Singh, Skampardoni Ioanna, Sreepada Lasya, et al. (2021), A deep learning framework identifies dimensional representations of Alzheimer’s Disease from brain structure, in Nature Communications, 12(1), 7065-7065.
The state and progression of the cognitive profile of elderly individuals are highly variable and clinically relevant. Sources of variability include the effects of healthy aging (including experience and education), acute brain damage, and the multiple effects of pathological neurodegeneration attributed to clinical syndromes. A further part of the variability can be attributed to individual biomarkers. However, a considerable amount of the variability in cognitive profiles and their prognosis remains unexplained. If a disease manifests in a discrete number of subtypes, the implicit characteristics encoded in the grouping may be an additional source of variability. To test this hypothesis, we propose to group individuals based on a detailed biomarker characterization and to assess the effect of the grouping variable on the accuracy with which the cognitive profile and its prognosis can be predicted. We propose to implement the grouping within a framework for the data-driven characterization of individuals based on their biomarker profiles. While biomarkers derived from standard structural MRI are readily available, biomarkers from invasive and expensive procedures are not routinely acquired. The limited set of biomarkers nevertheless contains relevant information. To extend the characterization to settings with missing data, we propose to use the hidden activation pattern of a deep neural multi-task encoder-decoder network, trained with missing data and soft constraints, as a signature feature representation that is robust to missing data. To characterize the individuals, we first extract multiple morphological markers from structural MRI, including regional brain volume, regional atrophy, and lesion load. The extracted biomarkers are then used to group individuals using a probabilistic staging model, a model of clustered trajectories, and a semi-supervised learning model.
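One common way to train an encoder-decoder network on incomplete biomarker profiles is to penalize reconstruction error only where a value was actually measured. The following is a minimal sketch of such a masked loss, not the project's implementation: the function name `masked_reconstruction_loss`, the use of NaN to mark a missing biomarker, and the choice of mean squared error are all assumptions made for illustration.

```python
import math


def masked_reconstruction_loss(x, x_hat):
    """Mean squared error over the observed (non-NaN) entries of x.

    x     : list of rows, one per individual; NaN marks a missing biomarker
            (hypothetical convention chosen for this sketch).
    x_hat : reconstruction of the same shape, e.g. a decoder output.
    """
    total, count = 0.0, 0
    for row, row_hat in zip(x, x_hat):
        for value, value_hat in zip(row, row_hat):
            if math.isnan(value):
                continue  # missing biomarker: contributes no penalty
            total += (value - value_hat) ** 2
            count += 1
    # With no observed entries there is nothing to penalize.
    return total / count if count else 0.0


# Toy example: two individuals, three biomarkers, two values missing.
nan = float("nan")
measured = [[1.0, nan, 3.0], [2.0, 2.0, nan]]
reconstructed = [[1.5, 9.0, 3.0], [2.0, 1.0, 7.0]]
loss = masked_reconstruction_loss(measured, reconstructed)
```

Because missing entries are skipped, the reconstruction is free to fill them in with any value, which is what allows the hidden representation to be computed for individuals with incomplete data.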
We then assess how biomarker and cognitive variables vary across groups and whether the grouping variables contribute to a better prognosis. Our contribution is to combine our recently developed state-of-the-art computerized brain morphometry algorithms and alternative promising MRI-based markers with three existing sophisticated grouping methods, and to apply and validate this combination on a large data set. We evaluate the correlations on a large (N>5000) heterogeneous data set of elderly individuals and on a data set with an age range of 22 to 84 years (N=1836). Using longitudinal data (N>200), we will assess the stability of the grouping and its contribution to the quality of the prognosis. To produce consistent groupings in the presence of missing data, we propose the use of a deep encoder-decoder artificial neural network that extracts signature features which are robust to missing data and preserve the cross-subject variance, so that the same grouping is obtained. The level of consistency between the predictions with and without a set of variables is an indirect measure of the redundancy of those variables. We validate the use case of discovering biological and cognitive correlates of subgroups in an independent sample, after the grouping model has been transferred from a large heterogeneous data set to a target sample with missing modalities. We expect that the biologically motivated, data-driven stratification framework will enable the discovery of variant manifestations of brain pathologies and, subsequently, increased accuracy in estimating disease progression. When applied to a specific research question, the grouping can lead to the discovery of correlates (for instance, a protective factor) that are present in a certain subgroup. In the long term, refined tools and methods from this project could help to identify non-trivial characteristics that determine the efficacy of an intervention.
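The consistency between a grouping obtained with all variables and one obtained without a subset of them can be quantified in several ways. The sketch below uses the adjusted Rand index, which is an assumption made for illustration (the text does not name a specific measure); the function name and the toy labels are likewise hypothetical. A value of 1 means the two groupings are identical up to relabeling, and values near 0 mean agreement no better than chance.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Chance-corrected agreement between two groupings of the same individuals."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both groupings must cover the same individuals")
    n = len(labels_a)
    if n < 2:
        return 1.0  # fewer than two individuals: no pairs to disagree on
    # Count pairs of individuals placed together in both, in A, and in B.
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    together_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = together_a * together_b / comb(n, 2)
    maximum = 0.5 * (together_a + together_b)
    if maximum == expected:
        return 1.0  # degenerate case, e.g. every individual in one group
    return (together_both - expected) / (maximum - expected)


# Toy example: subgroup labels from the full and a reduced biomarker set.
full_set = [0, 0, 1, 1, 2, 2]
reduced_set = [1, 1, 0, 0, 2, 2]  # same partition, different label names
consistency = adjusted_rand_index(full_set, reduced_set)
```

Under this reading, a consistency close to 1 after removing a set of variables suggests that those variables carry little information that the remaining ones do not already provide.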