Local conservation scores without a priori assumptions on neutral substitution rates.

Authors

Dingel Janis, Hanus Pavol, Leonardi Niccolò, Hagenauer Joachim, Zech Jürgen, Mueller Jakob C

Affiliations

Institute for Communications Engineering, Technische Universität München, Munich, Germany.

Publication information

BMC Bioinformatics. 2008 Apr 11;9:190. doi: 10.1186/1471-2105-9-190.

DOI:10.1186/1471-2105-9-190
PMID:18405366
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2375903/
Abstract

BACKGROUND

Comparative genomics aims to detect signals of evolutionary conservation as an indicator of functional constraint. Surprisingly, results of the ENCODE project revealed that about half of the experimentally verified functional elements found in non-coding DNA were classified as unconstrained by computational predictions. Following this observation, it has been hypothesized that this may be partly explained by biased estimates of the neutral evolutionary rates used by existing sequence conservation metrics. All methods we are aware of rely on a comparison with the neutral rate, and conservation is estimated by measuring the deviation of a particular genomic region from this rate. Consequently, it is reasonable to assume that inaccurate neutral rate estimates may lead to biased conservation and constraint estimates.

RESULTS

We propose a conservation signal produced by local maximum likelihood estimation of evolutionary parameters using an optimized sliding window, and present a Kullback-Leibler projection that allows multiple different estimated parameters to be transformed into a conservation measure. This measure does not rely on assumptions about neutral substitution rates and imposes few a priori assumptions on the properties of the conserved regions. We show the accuracy of our approach (KuLCons) on synthetic data and compare it to the scores generated by state-of-the-art methods (phastCons, GERP, SCONE) in an ENCODE region. We find that KuLCons most often agrees with the conservation/constraint signatures detected by GERP and SCONE, whereas its patterns differ qualitatively from those of phastCons. In contrast to standard methods, KuLCons can be extended to more complex evolutionary models, e.g. ones that take insertion and deletion events into account; the corresponding results show that scores obtained under this model can diverge significantly from scores obtained under the simpler model.
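
To make the two ingredients named above more concrete, here is a minimal toy sketch in Python. It is not the KuLCons algorithm: the abstract does not specify the evolutionary model, the window optimization, or the form of the Kullback-Leibler projection. The sketch assumes a gap-free pairwise alignment, the Jukes-Cantor (JC69) substitution model, a fixed window length, and, purely for illustration, scores each window by the Kullback-Leibler divergence between the fitted substitution distribution and the uniform distribution reached at saturation. All function names are hypothetical.

```python
import math


def jc69_distance(p_diff):
    """Closed-form ML estimate of the JC69 distance from a mismatch fraction."""
    if p_diff >= 0.75:
        return math.inf  # saturated window: the ML estimate diverges
    return -0.75 * math.log(1.0 - 4.0 * p_diff / 3.0)


def kl_from_saturation(p_diff):
    """KL divergence (in nats) between the fitted JC69 substitution
    distribution and the uniform distribution over the four bases.

    Illustrative assumption: large values mean the window is far from
    randomized (conserved); 0 means fully saturated. No neutral rate is used.
    """
    p_diff = min(p_diff, 0.75)  # the JC69 fit cannot exceed saturation
    probs = [1.0 - p_diff] + [p_diff / 3.0] * 3
    return sum(q * math.log(q / 0.25) for q in probs if q > 0.0)


def window_scores(seq_a, seq_b, window=5):
    """Slide a fixed window over a gap-free pairwise alignment.

    Returns one (ML distance, KL score) pair per window start position.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    scores = []
    for start in range(len(seq_a) - window + 1):
        a = seq_a[start:start + window]
        b = seq_b[start:start + window]
        p_diff = sum(x != y for x, y in zip(a, b)) / window
        scores.append((jc69_distance(p_diff), kl_from_saturation(p_diff)))
    return scores


if __name__ == "__main__":
    for dist, score in window_scores("ACGTACGT", "ACGTTGCA", window=4):
        print(f"distance={dist:.3f}  score={score:.3f}")
```

In this toy setting the only locally estimated parameter is a single distance, so the divergence is a function of one number. The point of a projection like the one described in the abstract is that it can collapse several estimated parameters into one score, which this sketch does not attempt.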

CONCLUSION

Our results suggest that discriminating among different degrees of conservation is possible without making assumptions about neutral rates. We find, however, that KuLCons cannot be expected to discover constraint regions considerably different from those found by GERP and SCONE. Consequently, we conclude that the reported discrepancies between experimentally verified functional elements and computationally identified constraint elements are unlikely to be explained by biased neutral rate estimates.
