Evolution of a computer program for classifying protein segments as transmembrane domains using genetic programming.

Author information

Koza J R

Affiliation

Computer Science Department, Stanford University, CA 94305-2140, USA.

Publication information

Proc Int Conf Intell Syst Mol Biol. 1994;2:244-52.

PMID: 7584397

Abstract

The recently developed genetic programming paradigm is used to evolve a computer program that classifies a given protein segment as either a transmembrane domain or a non-transmembrane area of the protein. Genetic programming starts with a primordial ooze of randomly generated computer programs composed of the available programmatic ingredients and then genetically breeds the population of programs using the Darwinian principle of survival of the fittest and an analog of the naturally occurring genetic operation of crossover (sexual recombination). Automatic function definition enables genetic programming to create subroutines dynamically during the run. Genetic programming is given a training set of differently sized protein segments and their correct classifications (but no biochemical knowledge, such as hydrophobicity values). Correlation is used as the fitness measure to drive the evolutionary process. The best genetically evolved program achieves an out-of-sample correlation of 0.968 and an out-of-sample error rate of 1.6%. This error rate is better than the error rates of four other algorithms reported at the First International Conference on Intelligent Systems for Molecular Biology. Our genetically evolved program is an instance of an algorithm discovered by an automated learning paradigm that is superior to those written by human investigators.
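The abstract names correlation as the fitness measure but does not give its formula. As a minimal sketch, assuming the measure is the standard correlation coefficient for two-class prediction (the Matthews correlation coefficient, computed from true/false positive and negative counts), the fitness and the error rate of a candidate classifier could be computed as follows. The function names and the sample labels are hypothetical and are not taken from the paper.

```python
import math


def confusion_counts(predicted, actual):
    """Count true/false positives and negatives for boolean labels.

    True means "transmembrane domain", False means "non-transmembrane".
    """
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)
    tn = sum(1 for p, a in pairs if not p and not a)
    fp = sum(1 for p, a in pairs if p and not a)
    fn = sum(1 for p, a in pairs if not p and a)
    return tp, tn, fp, fn


def correlation(predicted, actual):
    """Assumed fitness measure: Matthews correlation coefficient in [-1, 1].

    Returns 0.0 when the denominator is zero, e.g. when the classifier
    gives the same answer for every segment (a convention, not a result
    from the paper).
    """
    tp, tn, fp, fn = confusion_counts(predicted, actual)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


def error_rate(predicted, actual):
    """Fraction of segments that are misclassified."""
    tp, tn, fp, fn = confusion_counts(predicted, actual)
    return (fp + fn) / (tp + tn + fp + fn)


# Hypothetical labels for six protein segments (not data from the paper).
actual = [True, True, False, False, True, False]
predicted = [True, False, False, False, True, False]

print(correlation(predicted, actual))
print(error_rate(predicted, actual))
```

Under this assumption, a perfect classifier scores 1 and a classifier that is always wrong scores -1. A classifier that labels every segment the same way scores 0 even when one class dominates, which is one reason a correlation can be a more informative fitness signal than raw accuracy on unbalanced data.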