- Natural sciences
  - Inorganic chemistry
  - Organic chemistry
  - Theoretical and computational chemistry
  - Other chemical sciences
Statistical-learning approaches are emerging as powerful alternatives to expensive computational methods for solving the Schrödinger equation to determine molecular properties. Despite the recent success of methods such as neural networks, these models are suitable only for interpolation and fail to scale to larger systems: a model trained on small-to-medium-size molecules can only be applied to systems of similar size. Modeling long-range intermolecular interactions with machine learning (ML) requires sampling the vast diversity of chemical environments that occur on an extended length scale, which leads to a combinatorial explosion in the amount of training data required. To circumvent these obstacles, I propose to incorporate our physical knowledge of long-range interactions into the modeling process; this is philosophically different from the commonly used "black-box" ML modeling. A detailed analysis of the proposed model reveals that it can achieve the accuracy of high-level quantum chemistry at the cost of molecular mechanics. This not only allows one to compute interaction energies of large molecules (e.g., drug-target binding) and to run long-time molecular dynamics simulations of (macro)molecules, but also enables accurate and efficient computational screening of large databases to select the most promising molecules for follow-up experiments. Besides its transformative utility, this pioneering strategy is extendable to many other problems in chemistry.