Boosting the prediction and understanding of DNA-binding domains from sequence.

Affiliation

Department of Bioengineering, University of Illinois at Chicago, Chicago, IL 60612, USA.

Publication information

Nucleic Acids Res. 2010 Jun;38(10):3149-58. doi: 10.1093/nar/gkq061. Epub 2010 Feb 15.

Abstract

DNA-binding proteins perform vital functions related to transcription, repair and replication. We have developed a new sequence-based machine learning protocol to identify DNA-binding proteins. We compare our method with an extensive benchmark of previously published structure-based machine learning methods as well as a standard sequence alignment technique, BLAST. Furthermore, we elucidate important feature interactions found in a learned model and analyze how specific rules capture general mechanisms that extend across DNA-binding motifs. This analysis is carried out using the malibu machine learning workbench available at http://proteomics.bioengr.uic.edu/malibu and the corresponding data sets and features are available at http://proteomics.bioengr.uic.edu/dna.
