Paquet Eric, Soleymani Farzan, Viktor Herna Lydia, Michalowski Wojtek
National Research Council, 1200 Montreal Road, Ottawa, ON, K1A 0R6, Canada.
School of Electrical Engineering and Computer Science, University of Ottawa, ON, K1N 6N5, Canada.
Comput Struct Biotechnol J. 2024 Apr 17;23:1641-1653. doi: 10.1016/j.csbj.2024.04.009. eCollection 2024 Dec.
Protein generation has numerous applications in designing therapeutic antibodies and creating new drugs. Still, it is a demanding task due to the inherent complexities of protein structures and the limitations of current generative models. Proteins possess intricate geometry, and sampling their conformational space is challenging due to its high dimensionality. This paper introduces novel Markovian and non-Markovian generative diffusion models based on fractional stochastic differential equations and the Lévy distribution, allowing for a more effective exploration of the conformational space. The approach is applied to a dataset of proteins and evaluated in terms of Fréchet distance, fidelity, and diversity, outperforming the state-of-the-art by 25.4%, 35.8%, and 11.8%, respectively.
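To illustrate why heavy-tailed noise can explore a high-dimensional space more aggressively than Gaussian noise, the following is a minimal sketch only, not the authors' model. It assumes that the "Lévy distribution" in the abstract can be illustrated with a symmetric α-stable law, which the abstract does not specify, and it samples that law with the Chambers-Mallows-Stuck method. The names `symmetric_alpha_stable` and `noise_coordinates` are hypothetical helpers introduced for this example.

```python
import math
import random


def symmetric_alpha_stable(alpha, rng):
    """Draw one symmetric alpha-stable sample (Chambers-Mallows-Stuck method).

    alpha = 2 gives a Gaussian with variance 2; smaller alpha gives heavier tails.
    """
    if not 0.0 < alpha <= 2.0:
        raise ValueError("alpha must lie in (0, 2]")
    u = rng.uniform(-math.pi / 2, math.pi / 2)  # uniform angle
    w = rng.expovariate(1.0)                    # unit-mean exponential
    while w == 0.0:                             # guard against division by zero
        w = rng.expovariate(1.0)
    return (math.sin(alpha * u) / math.cos(u) ** (1.0 / alpha)
            * (math.cos((1.0 - alpha) * u) / w) ** ((1.0 - alpha) / alpha))


def noise_coordinates(coords, alpha, scale, rng):
    """Assumed forward noising step: add independent alpha-stable jumps
    to a flat list of coordinates (hypothetical, for illustration only)."""
    return [x + scale * symmetric_alpha_stable(alpha, rng) for x in coords]


# Compare the largest jump under Gaussian-like (alpha = 2) and
# heavy-tailed (alpha = 1.5) noise for the same number of draws.
rng = random.Random(0)
gaussian_like = [symmetric_alpha_stable(2.0, rng) for _ in range(20000)]
heavy_tailed = [symmetric_alpha_stable(1.5, rng) for _ in range(20000)]
print(max(abs(x) for x in gaussian_like), max(abs(x) for x in heavy_tailed))
```

With α < 2 the occasional large jump lets a sampler reach distant regions in a single step, which is one plausible reading of the "more effective exploration of the conformational space" claimed above.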