Methods for constructing and evaluating consensus genomic interval sets.

Affiliations

Department of Genome Sciences, School of Medicine, University of Virginia, Charlottesville, VA 22908, USA.

Department of Computer Science, School of Engineering, University of Virginia, Charlottesville, VA 22908, USA.

Publication information

Nucleic Acids Res. 2024 Sep 23;52(17):10119-10131. doi: 10.1093/nar/gkae685.

Abstract

The amount of genomic region data continues to increase. Integrating across diverse genomic region sets requires consensus regions, which enable comparing regions across experiments, but also by necessity lose precision in region definitions. We require methods to assess this loss of precision and build optimal consensus region sets. Here, we introduce the concept of flexible intervals and propose three novel methods for building consensus region sets, or universes: a coverage cutoff method, a likelihood method, and a Hidden Markov Model. We then propose three novel measures for evaluating how well a proposed universe fits a collection of region sets: a base-level overlap score, a region boundary distance score, and a likelihood score. We apply our methods and evaluation approaches to several collections of region sets and show how these methods can be used to evaluate fit of universes and build optimal universes. We describe scenarios where the common approach of merging regions to create consensus leads to undesirable outcomes and provide principled alternatives that provide interoperability of interval data while minimizing loss of resolution.
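To make the contrast between merging and a coverage cutoff concrete, the following is a minimal sketch, not the authors' implementation. It assumes intervals are half-open `(start, end)` pairs on a single chromosome, and the function names `merge_within_set` and `coverage_cutoff_universe` are hypothetical. A universe is built by keeping every maximal stretch covered by at least `cutoff` region sets; with `cutoff=1` this reduces to the plain union (merge) of all regions.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Interval = Tuple[int, int]  # half-open [start, end), single chromosome (assumption)


def merge_within_set(regions: List[Interval]) -> List[Interval]:
    """Merge overlapping or touching intervals so a set counts once per base."""
    merged: List[Interval] = []
    for start, end in sorted(regions):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def coverage_cutoff_universe(
    region_sets: List[List[Interval]], cutoff: int
) -> List[Interval]:
    """Return maximal intervals covered by at least `cutoff` region sets."""
    # Net change in coverage depth at each coordinate.
    deltas: Dict[int, int] = defaultdict(int)
    for regions in region_sets:
        for start, end in merge_within_set(regions):
            deltas[start] += 1
            deltas[end] -= 1

    universe: List[Interval] = []
    depth = 0
    open_start = None
    for pos in sorted(deltas):
        depth += deltas[pos]
        if depth >= cutoff and open_start is None:
            open_start = pos  # coverage just reached the cutoff
        elif depth < cutoff and open_start is not None:
            universe.append((open_start, pos))  # coverage dropped below it
            open_start = None
    return universe


# Three toy region sets (made-up coordinates, for illustration only).
toy_sets = [[(0, 10), (20, 30)], [(5, 15)], [(8, 25)]]
union_universe = coverage_cutoff_universe(toy_sets, cutoff=1)
majority_universe = coverage_cutoff_universe(toy_sets, cutoff=2)
```

In this toy input the union collapses everything into a single region spanning 0 to 30, while a cutoff of 2 keeps two narrower regions. This is the kind of resolution loss from merging that the abstract refers to.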

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d602/11417377/a5a560f6a359/gkae685figgra1.jpg
