Integrating overlapping structures and background information of words significantly improves biological sequence comparison.

Affiliation

College of Life Sciences, Zhejiang Sci-Tech University, Hangzhou, People's Republic of China.

Publication information

PLoS One. 2011;6(11):e26779. doi: 10.1371/journal.pone.0026779. Epub 2011 Nov 10.

Abstract

Word-based models have achieved promising results in sequence comparison. However, it remains an open problem how to exploit two important statistical properties of words in biological sequences, their overlapping structures and their background information, to improve sequence comparison. This paper proposes a new statistical method that integrates the overlapping structures and the background information of words in biological sequences. To assess the effectiveness of this integration for sequence comparison, two sets of evaluation experiments were carried out. The first, performed via receiver operating characteristic (ROC) curve analysis, applies the proposed method to discriminating functionally related regulatory sequences from unrelated sequences, and to discriminating introns from exons. The second evaluates the proposed method with the F-measure on clustering Hepatitis E virus genotypes. The results demonstrate that the proposed method, by integrating the overlapping structures and the background information of words, significantly improves biological sequence comparison and outperforms the existing models.
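The abstract does not give the formulas of the proposed statistic, so the following is only a minimal sketch of the general idea behind word-based comparison with a background correction, not the authors' method. It counts overlapping words of length k, subtracts the count expected under a simple independent-letter background estimated from each sequence, and compares the resulting vectors by cosine similarity. The function names, the DNA alphabet, the order-0 background model, and the cosine measure are all assumptions made for illustration; in particular, the sketch does not model how word overlaps affect the variance of the counts.

```python
from collections import Counter
from itertools import product
from math import sqrt

ALPHABET = "ACGT"  # assumed DNA alphabet; words containing other symbols are ignored


def word_counts(seq, k):
    """Count every overlapping word of length k (the window slides by one)."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def background_probs(seq):
    """Estimate independent (order-0) letter frequencies from the sequence."""
    letters = Counter(seq)
    total = sum(letters.values())
    return {a: letters.get(a, 0) / total for a in ALPHABET}


def centered_counts(seq, k):
    """Observed count minus the count expected under the background, per word."""
    counts = word_counts(seq, k)
    probs = background_probs(seq)
    n_windows = len(seq) - k + 1
    vec = {}
    for letters in product(ALPHABET, repeat=k):
        word = "".join(letters)
        p_word = 1.0
        for a in letters:
            p_word *= probs[a]
        vec[word] = counts.get(word, 0) - n_windows * p_word
    return vec


def cosine_similarity(u, v):
    """Cosine of the angle between two word vectors with the same keys."""
    dot = sum(u[w] * v[w] for w in u)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


if __name__ == "__main__":
    a = centered_counts("ACGTACGTACGTTTGA", 2)
    b = centered_counts("ACGTACGAACGTTAGA", 2)
    print(cosine_similarity(a, b))
```

Centering by the expected count means a word contributes to the similarity only to the extent that it is over- or under-represented relative to the letter composition, which is one common way to bring background information into a word-count comparison.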

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bb15/3213098/3857eaefa172/pone.0026779.g001.jpg
