Li Xiaomei, Wang Nengchao
Computer Science and Technology Institute, Huazhong University of Science and Technology, Wuhan 430074, China.
Genomics Proteomics Bioinformatics. 2004 Nov;2(4):245-52. doi: 10.1016/s1672-0229(04)02031-5.
Using a triangular lattice model to study the designability of protein folding, we overcame the parity problem of the previous cubic lattice model and enumerated all sequences and compact structures on a simple two-dimensional triangular lattice of size 4+5+6+5+4. Sequences were built from two types of amino acids, hydrophobic and polar, which gave 2^23 + 2^12 distinct sequences after excluding reverse-symmetric duplicates. The number of distinct compact structures, that is, self-avoiding paths of length 24 on this triangular lattice, was 219,093 after excluding reflection symmetry. Based on this model, we applied a fast search algorithm built on a cluster tree. The algorithm reduced the amount of computation by computing the objective energy of non-leaf nodes. Parallel experiments showed that the fast tree search algorithm yielded an exponential speed-up for the model of size 4+5+6+5+4. A designability analysis was performed to interpret the search results.
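The lattice geometry and the parity issue can be illustrated with a minimal sketch. The abstract does not define the parity problem. The sketch assumes it refers to the well-known property of square and cubic lattices that they are bipartite, so two residues can be lattice neighbours only when their separation along the chain is odd. A triangular lattice contains triangles and has no such restriction. The axial-coordinate layout and the function names below are illustrative choices, not taken from the paper.

```python
from collections import deque

# Neighbour offsets in axial coordinates (q, r); the layout is an assumption.
TRI_STEPS = [(1, 0), (-1, 0), (0, 1), (0, -1), (1, -1), (-1, 1)]
SQ_STEPS = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def triangular_sites():
    """Sites of the 4+5+6+5+4 patch: row r (-2..2) holds 6 - |r| sites."""
    return {(q, r)
            for r in range(-2, 3)
            for q in range(max(0, -r), min(5, 5 - r) + 1)}


def adjacency(sites, steps):
    """Map each site to the list of its neighbours that lie inside the patch."""
    return {s: [(s[0] + dq, s[1] + dr) for dq, dr in steps
                if (s[0] + dq, s[1] + dr) in sites]
            for s in sites}


def is_bipartite(adj):
    """Breadth-first two-colouring; fails as soon as an odd cycle is found."""
    colour = {}
    for start in adj:
        if start in colour:
            continue
        colour[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in colour:
                    colour[v] = 1 - colour[u]
                    queue.append(v)
                elif colour[v] == colour[u]:
                    return False
    return True


tri = adjacency(triangular_sites(), TRI_STEPS)
square = adjacency({(x, y) for x in range(6) for y in range(4)}, SQ_STEPS)
n_edges = sum(len(nbrs) for nbrs in tri.values()) // 2

print(len(tri), "sites,", n_edges, "nearest-neighbour pairs")
print("triangular patch bipartite:", is_bipartite(tri))
print("square patch bipartite:", is_bipartite(square))
```

The patch has 24 sites, matching the chain length of 24 quoted in the abstract. The 6x4 square grid is included only as a same-sized comparison.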
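The notion of designability can be sketched on a much smaller toy system. The example below does not reproduce the paper's computation or its tree search. It uses a 7-site triangular patch (rows of 2, 3, 2), an assumed HP energy of -1 per non-bonded hydrophobic contact, and identifies a structure with its contact map. All of these are illustrative assumptions. The designability of a structure is taken here to be the number of sequences for which it is the unique lowest-energy structure.

```python
from itertools import product

STEPS = [(1, 0), (-1, 0), (0, 1), (0, -1), (1, -1), (-1, 1)]


def hexagon_sites():
    """7-site toy patch with rows of 2, 3, 2 sites (axial coordinates)."""
    return {(q, r)
            for r in (-1, 0, 1)
            for q in range(max(0, -r), min(2, 2 - r) + 1)}


def compact_paths(sites):
    """All directed self-avoiding paths that visit every site exactly once."""
    paths = []

    def extend(path, seen):
        if len(path) == len(sites):
            paths.append(tuple(path))
            return
        q, r = path[-1]
        for dq, dr in STEPS:
            nxt = (q + dq, r + dr)
            if nxt in sites and nxt not in seen:
                seen.add(nxt)
                path.append(nxt)
                extend(path, seen)
                path.pop()
                seen.remove(nxt)

    for start in sorted(sites):
        extend([start], {start})
    return paths


def contact_map(path):
    """Pairs (i, j) with j > i + 1 whose residues sit on adjacent sites."""
    pos = {site: i for i, site in enumerate(path)}
    contacts = set()
    for (q, r), i in pos.items():
        for dq, dr in STEPS:
            j = pos.get((q + dq, r + dr))
            if j is not None and j > i + 1:
                contacts.add((i, j))
    return frozenset(contacts)


def designability(structures, n):
    """Count, per structure, the sequences having it as unique ground state."""
    counts = {s: 0 for s in structures}
    for seq in product("HP", repeat=n):
        # Assumed energy: -1 for every non-bonded H-H contact.
        scored = [(-sum(seq[i] == seq[j] == "H" for i, j in s), s)
                  for s in structures]
        best = min(e for e, _ in scored)
        winners = [s for e, s in scored if e == best]
        if len(winners) == 1:
            counts[winners[0]] += 1
    return counts


sites = hexagon_sites()
paths = compact_paths(sites)
structures = list({contact_map(p) for p in paths})
counts = designability(structures, len(sites))

print(len(paths), "directed compact paths,", len(structures), "contact maps")
print("highest designability:", max(counts.values()))
```

Because every sequence is scored against every structure, this brute-force loop scales as the number of sequences times the number of structures. That cost is what a tree-based search such as the one described in the abstract is meant to reduce.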