AlexandrusPS: A User-Friendly Pipeline for the Automated Detection of Orthologous Gene Clusters and Subsequent Positive Selection Analysis.

Affiliations

Institute of Molecular Biology (IMB), Quantitative Proteomics, Mainz, Germany.

Institute of Human Genetics, University Medical Center of the Johannes Gutenberg University Mainz, Department of Human Genetics, Mainz, Germany.

Publication information

Genome Biol Evol. 2023 Oct 6;15(10). doi: 10.1093/gbe/evad187.

Abstract

Detecting adaptive selection with a systems approach that considers all protein-coding genes allows for the identification of mechanisms and pathways that enabled adaptation to different environments. Currently available programs for estimating positive selection signals fall into two groups. They are either easy to apply but can analyze only one gene family at a time, restricting system analysis; or they can handle larger cohorts of gene families but require considerable prerequisite data such as orthology associations, codon alignments, phylogenetic trees, and proper configuration files. All these steps require extensive computational expertise, restricting this endeavor to specialists. Here, we introduce AlexandrusPS, a high-throughput pipeline that overcomes technical challenges when conducting transcriptome-wide positive selection analyses on large sets of nucleotide and protein sequences. The pipeline streamlines 1) executing an accurate orthology prediction as a precondition for positive selection analysis, 2) preparing and organizing configuration files for CodeML, 3) performing positive selection analysis using CodeML, and 4) generating an output that is easy to interpret, including all maximum likelihood and log-likelihood test results. The only input needed from the user is the CDS and peptide FASTA files of the proteins of interest. The pipeline is provided as a Docker image, so it requires no program or module installation and can be applied in any computing environment. AlexandrusPS and its documentation are available via GitHub (https://github.com/alejocn5/AlexandrusPS).
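The likelihood test results mentioned in the abstract can be illustrated with a minimal sketch of a likelihood ratio test between two nested CodeML site models. This is not taken from the AlexandrusPS code base: the function names, the example log-likelihood values, and the choice of 2 degrees of freedom (as is commonly used for comparisons such as M7 vs. M8) are assumptions made for illustration only.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Chi-square survival function P(X > x), closed forms for df = 1 or 2."""
    if x <= 0:
        return 1.0
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("this sketch only supports df = 1 or df = 2")


def likelihood_ratio_test(lnl_null: float, lnl_alt: float, df: int = 2):
    """Return (test statistic, p-value) for two nested models.

    lnl_null: log-likelihood of the model without positive selection
    lnl_alt:  log-likelihood of the model that allows positive selection
    df:       difference in free parameters (assumed, not read from CodeML)
    """
    # 2 * delta lnL; clamp at 0 in case optimisation noise makes it negative.
    stat = max(2.0 * (lnl_alt - lnl_null), 0.0)
    return stat, chi2_sf(stat, df)


if __name__ == "__main__":
    # Hypothetical log-likelihoods for one orthologous gene cluster.
    stat, p_value = likelihood_ratio_test(lnl_null=-1234.56, lnl_alt=-1229.10)
    print(f"2*dlnL = {stat:.2f}, p = {p_value:.4g}")
```

In a transcriptome-wide run, a test like this would be repeated for every orthologous gene cluster, so the resulting p-values would normally need a multiple-testing correction before interpretation.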


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/529b/10612477/28e92975339a/evad187f1.jpg
