Crystal structure representations for machine learning models of formation energies

Type of publication Peer-reviewed
Publication form Original article (peer-reviewed)
Author Felix Faber, Alexander Lindmaa, O. Anatole von Lilienfeld, Rickard Armiento
Project From atomistic exploration of chemical compound space towards bio-molecular design: Quantum mechanical rational compound design (QM-RCD)

Original article (peer-reviewed)

Journal International Journal of Quantum Chemistry
Page(s) 1094
Title of proceedings International Journal of Quantum Chemistry
DOI 10.1002/qua.24917


We introduce and evaluate a set of feature vector representations of crystal structures for machine learning (ML) models of formation energies of solids. ML models of atomization energies of organic molecules have been successful using a Coulomb matrix representation of the molecule. We consider three ways to generalize such representations to periodic systems: (i) a matrix where each element is related to the Ewald sum of the electrostatic interaction between two different atoms in the unit cell repeated over the lattice; (ii) an extended Coulomb-like matrix that takes into account a number of neighboring unit cells; and (iii) an ansatz that mimics the periodicity and the basic features of the elements in the Ewald sum matrix using a sine function of the crystal coordinates of the atoms. The representations are compared for a Laplacian kernel with Manhattan norm, trained to reproduce formation energies using a dataset of 3938 crystal structures obtained from the Materials Project. For training sets consisting of 3000 crystals, the generalization error in predicting formation energies of new structures corresponds to (i) 0.49, (ii) 0.64, and (iii)  for the respective representations.
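
The Laplacian kernel with Manhattan norm named in the abstract can be illustrated with a minimal sketch. It assumes each crystal has already been reduced to a fixed-length feature vector (for example, a flattened representation matrix), and it uses the common form k(x, x') = exp(-||x - x'||_1 / sigma). The function names, the width parameter `sigma`, and the regression weights `alpha` are illustrative assumptions, not taken from the article.

```python
import math


def manhattan_distance(x, y):
    """L1 (Manhattan) distance between two equal-length feature vectors."""
    if len(x) != len(y):
        raise ValueError("feature vectors must have the same length")
    return sum(abs(a - b) for a, b in zip(x, y))


def laplacian_kernel(x, y, sigma):
    """Laplacian kernel exp(-||x - y||_1 / sigma); sigma is an assumed width."""
    return math.exp(-manhattan_distance(x, y) / sigma)


def kernel_matrix(features, sigma):
    """Pairwise kernel matrix over a list of training feature vectors."""
    return [[laplacian_kernel(a, b, sigma) for b in features] for a in features]


def predict(x_new, train_features, alpha, sigma):
    """Kernel-model prediction: weighted sum of kernels to the training points.

    alpha holds one regression weight per training structure (hypothetical
    here; in kernel ridge regression it would come from a linear solve).
    """
    return sum(
        a * laplacian_kernel(x_new, x, sigma)
        for a, x in zip(alpha, train_features)
    )
```

In a kernel ridge regression setting, the weights would typically be obtained by solving (K + lambda I) alpha = y, where K is the output of `kernel_matrix`, y holds the training formation energies, and lambda is a regularization strength; that solve is left out of the sketch.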