PosiGene: automated and easy-to-use pipeline for genome-wide detection of positively selected genes.

Authors

Sahm Arne, Bens Martin, Platzer Matthias, Szafranski Karol

Affiliation

Leibniz Institute on Aging, Fritz Lipmann Institute, 07745 Jena, Germany.

Publication

Nucleic Acids Res. 2017 Jun 20;45(11):e100. doi: 10.1093/nar/gkx179.

Abstract

Many comparative genomics studies aim to find the genetic basis of species-specific phenotypic traits. A prevailing strategy is to search genome-wide for genes that evolved under positive selection, based on the ratio of non-synonymous to synonymous substitutions. However, incongruent results, largely due to high false-positive rates, indicate the need to standardize quality criteria and software tools. The main challenges are ortholog and isoform assignment, the high sensitivity of the statistical models to alignment errors, and the need to parallelize large parts of the software. We developed the software tool PosiGene that (i) detects positively selected genes (PSGs) at genome scale, (ii) allows analysis of specific evolutionary branches, (iii) can be used in arbitrary species contexts and (iv) offers visualization of the results for further manual validation and biological interpretation. We exemplify PosiGene's performance using simulated and real data. With simulated data, we determined a false-positive rate of <1%. With real data, we found that 68.4% of the PSGs detected by PosiGene were shared by at least one previous study that used the same set of species. PosiGene is a user-friendly, reliable tool for reproducible genome-wide identification of PSGs and is freely available at https://github.com/gengit/PosiGene.
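To make the "non-synonymous to synonymous" idea concrete, the following minimal Python sketch classifies codon differences between two aligned coding sequences as synonymous (same amino acid) or non-synonymous (different amino acid). It is an illustration only and not PosiGene's method: the function name `count_codon_differences` and the example sequences are hypothetical, the standard genetic code is assumed, and raw counts like these are only an ingredient of the ratio. A real analysis normalizes by the number of synonymous and non-synonymous sites and relies on statistical codon-substitution models.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (an assumption;
# other genetic codes, e.g. mitochondrial, would need a different table).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def count_codon_differences(seq_a, seq_b):
    """Count synonymous and non-synonymous codon differences.

    Both inputs must be in-frame, aligned coding sequences of equal length.
    Codons containing gaps or ambiguous bases are skipped.
    Returns a tuple (synonymous, non_synonymous).
    """
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be aligned, equal length, in frame")

    synonymous = 0
    non_synonymous = 0
    for i in range(0, len(seq_a), 3):
        codon_a = seq_a[i:i + 3].upper()
        codon_b = seq_b[i:i + 3].upper()
        if codon_a == codon_b:
            continue  # identical codons carry no substitution
        if codon_a not in CODON_TABLE or codon_b not in CODON_TABLE:
            continue  # gap or ambiguous base: not classifiable
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            synonymous += 1
        else:
            non_synonymous += 1
    return synonymous, non_synonymous


# Hypothetical toy alignment: CTG -> CTA keeps leucine (synonymous),
# AAA -> AGA changes lysine to arginine (non-synonymous).
syn, nonsyn = count_codon_differences("ATGCTGAAA", "ATGCTAAGA")
print(syn, nonsyn)
```

An excess of non-synonymous over synonymous change, once properly normalized and tested, is the signal that positive-selection scans look for. The abstract's point about alignment errors follows directly: a misaligned codon column would be counted here as one or more spurious non-synonymous differences.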

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c2c1/5499814/22ff02bc0c1e/gkx179fig1.jpg
