What Molecular Structure Cannot Tell Us: A Taxonomy of Explainability Gaps in GNN-Based Drug Toxicity Prediction
Why drug structure alone can't predict all side effects
Graph neural networks (GNNs), which learn from a drug's molecular structure, can predict only about 45% of known side effects, even for well-studied drugs like aspirin. The missing 55% falls into four predictable categories: effects that no molecular structure can encode, data gaps from incomplete testing, mismatches between what's measured and what's toxic, and errors in how the neural network represents chemistry.
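To make the split between predicted and missed side effects concrete, here is a minimal Python sketch of how such a tally could be computed. It is an illustration under stated assumptions, not the study's actual pipeline: the function `coverage_report`, the `Gap` enum, and the placeholder effect names are all hypothetical, and the toy numbers are invented for the example rather than taken from the research.

```python
from collections import Counter
from enum import Enum


class Gap(Enum):
    """The four gap categories named in the summary (enum names are hypothetical)."""
    NOT_STRUCTURE_ENCODED = "effect no molecular structure can encode"
    INCOMPLETE_TESTING = "data gap from incomplete testing"
    MEASUREMENT_MISMATCH = "mismatch between what is measured and what is toxic"
    REPRESENTATION_ERROR = "error in how the network represents chemistry"


def coverage_report(known, predicted, gap_of):
    """Return (recall, counts of gap categories among the missed effects).

    known     -- iterable of documented side effects for one drug
    predicted -- iterable of side effects the model flagged
    gap_of    -- dict mapping each missed effect to a Gap (assumed to be
                 assigned by a human reviewer; the model does not produce it)
    """
    known = set(known)
    if not known:
        raise ValueError("known side effects must be non-empty")
    hits = known & set(predicted)
    missed = known - hits
    recall = len(hits) / len(known)
    gaps = Counter(gap_of[effect] for effect in missed)
    return recall, gaps


# Toy data: placeholder names and made-up labels, not real drug records.
known_effects = ["effect_a", "effect_b", "effect_c", "effect_d", "effect_e"]
model_output = ["effect_a", "effect_b", "effect_x"]  # effect_x is a false positive
reviewer_labels = {
    "effect_c": Gap.NOT_STRUCTURE_ENCODED,
    "effect_d": Gap.INCOMPLETE_TESTING,
    "effect_e": Gap.INCOMPLETE_TESTING,
}

recall, gaps = coverage_report(known_effects, model_output, reviewer_labels)
print(f"recall: {recall:.0%}")
for gap, count in gaps.most_common():
    print(f"{count} missed: {gap.value}")
```

The sketch keeps the category assignment outside the model on purpose: deciding why an effect was missed is treated here as a separate review step, so the tally describes the model's blind spots rather than asking the model to diagnose itself.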
Drug regulators and safety teams currently rely on computational models to catch rare side effects before they harm patients. This research shows that those models have a hard ceiling: knowing a drug's molecular structure isn't enough. Knowing where that ceiling lies tells regulators when they need additional testing, human expertise, or real-world monitoring instead of trusting predictions that might miss real dangers.