Buehler Markus J
Laboratory for Atomistic and Molecular Mechanics (LAMM), Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139, USA.
Center for Computational Science and Engineering, Schwarzman College of Computing, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139, USA.
Patterns (N Y). 2023 Feb 14;4(3):100692. doi: 10.1016/j.patter.2023.100692. eCollection 2023 Mar 10.
Drawing inspiration from nature to design materials is a fruitful approach that humans have used for millennia. In this paper we report a method for discovering how patterns in disparate domains can be reversibly related using a computationally rigorous approach, the AttentionCrossTranslation model. The algorithm discovers cycle- and self-consistent relationships and offers a bidirectional translation of information across disparate knowledge domains. The approach is validated with a set of known translation problems and then used to discover a mapping between musical data (based on the corpus of note sequences in J.S. Bach's Goldberg Variations, created in 1741) and protein sequence data (information sampled more recently). Using protein folding algorithms, 3D structures of the predicted protein sequences are generated, and their stability is validated using explicit solvent molecular dynamics. Musical scores generated from protein sequences are sonified and rendered into audible sound.