Minus the Error: Testing for Positive Selection in the Presence of Residual Alignment Errors.

Authors

Selberg Avery, Clark Nathan L, Sackton Timothy B, Muse Spencer V, Lucaci Alexander G, Weaver Steven, Nekrutenko Anton, Chikina Maria, Kosakovsky Pond Sergei L

Affiliations

Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA, USA.

Department of Biology, Temple University, Philadelphia, PA, USA.

Publication information

bioRxiv. 2025 Mar 21:2024.11.13.620707. doi: 10.1101/2024.11.13.620707.

DOI: 10.1101/2024.11.13.620707
PMID: 39605407
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11601313/
Abstract

Positive selection is an evolutionary process that increases the frequency of advantageous mutations because they confer a fitness benefit. Inferring the past action of positive selection on protein-coding sequences is fundamental for deciphering phenotypic diversity and the emergence of novel traits. With the advent of genome-wide comparative genomic datasets, researchers can analyze selection not only at the level of individual genes but also globally, delivering systems-level insights into evolutionary dynamics. However, genome-scale datasets are generated with automated pipelines and imperfect curation that does not eliminate all sequencing, annotation, and alignment errors. Positive selection inference methods are highly sensitive to such errors. We present BUSTED-E: a method designed to detect positive selection for amino acid diversification while concurrently identifying some alignment errors. This method builds on the flexible branch-site random effects model (BUSTED) for fitting distributions of dN/dS, with a critical modification: it incorporates an "error-sink" component to represent an abiological evolutionary regime. Using several genome-scale biological datasets that were extensively filtered using state-of-the-art automated alignment tools, we show that BUSTED-E identifies pervasive residual alignment errors, produces more realistic estimates of positive selection, reduces bias, and improves biological interpretation. The BUSTED-E model promises to be a more stringent filter to identify positive selection in genome-wide contexts, thus enabling further characterization and validation of the most biologically relevant cases.
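
The "error-sink" idea in the abstract can be illustrated with a small mixture-model sketch. This is not the authors' implementation and does not reproduce HyPhy's code: it assumes a simplified site-level mixture (the actual model is branch-site), and the class labels, mixture weights, and per-class likelihoods below are hypothetical numbers chosen only for illustration. It shows how Bayes' rule assigns a site to an extra component reserved for abiological signal rather than to the positive-selection class.

```python
import math


def class_posteriors(class_likelihoods, weights):
    """Return P(class | site data) for each mixture component via Bayes' rule."""
    if len(class_likelihoods) != len(weights):
        raise ValueError("need exactly one likelihood per mixture weight")
    if not math.isclose(sum(weights), 1.0, abs_tol=1e-9):
        raise ValueError("mixture weights must sum to 1")
    # Joint probability of the data and each class: weight * conditional likelihood.
    joint = [w * lik for w, lik in zip(weights, class_likelihoods)]
    total = sum(joint)  # mixture likelihood of the site
    if total <= 0.0:
        raise ValueError("site has zero likelihood under every class")
    return [j / total for j in joint]


# Hypothetical dN/dS classes: three "biological" regimes plus an error sink.
labels = ["purifying", "neutral", "positive", "error_sink"]
# Hypothetical mixture weights; the error sink is kept small by assumption.
weights = [0.70, 0.25, 0.04, 0.01]
# Hypothetical likelihoods of one badly aligned site under each class.
site_likelihoods = [1e-12, 5e-11, 2e-9, 4e-7]

posteriors = dict(zip(labels, class_posteriors(site_likelihoods, weights)))
for label, prob in posteriors.items():
    print(f"{label:>10}: {prob:.4f}")
```

In this toy setup most of the posterior mass for the site lands in the error-sink class, so the site contributes little support to the positive-selection class. Without the extra component, the same data could only be explained by inflating the weight or rate of the positive-selection class.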

