Accurate loop calling for 3D genomic data with cLoops.

Affiliations

CAS Key Laboratory of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Chinese Academy of Sciences Center for Excellence in Molecular Cell Science, Collaborative Innovation Center for Genetics and Developmental Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200031, China.

Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Center for Quantitative Biology (CQB), Peking University, Beijing 100871, China.

Publication information

Bioinformatics. 2020 Feb 1;36(3):666-675. doi: 10.1093/bioinformatics/btz651.

Abstract

MOTIVATION

Sequencing-based 3D genome mapping technologies can identify loops formed by interactions between regulatory elements hundreds of kilobases apart. Existing loop-calling tools are mostly restricted to a single data type, their accuracy depends on a contact matrix of predefined resolution or on called peaks, and they can have prohibitive hardware costs.

RESULTS

Here, we introduce cLoops ('see loops') to address these limitations. cLoops is based on the clustering algorithm cDBSCAN, which directly analyzes the paired-end tags (PETs) to find candidate loops, and it uses a permuted local background to estimate statistical significance. These two data-type-independent processes enable loops to be reliably identified for both sharp and broad peak data, including but not limited to ChIA-PET, Hi-C, HiChIP and Trac-looping data. Loops identified by cLoops showed much less distance-dependent bias and higher enrichment relative to local regions than those identified by existing tools. Altogether, cLoops improves the accuracy of detecting 3D genomic loops from sequencing data, is versatile, flexible and efficient, and has modest hardware requirements.
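To illustrate the idea of clustering PETs directly rather than binning them into a contact matrix, here is a minimal sketch. It is textbook DBSCAN applied to PETs treated as 2D points (left-tag position, right-tag position), not the cDBSCAN implementation itself, whose details the abstract does not give. The function names `dbscan_pets` and `candidate_loops`, the Manhattan distance, the parameter values and the toy coordinates are all assumptions made for illustration, and the permuted-background significance test is not covered.

```python
from collections import deque


def dbscan_pets(pets, eps, min_pts):
    """Label each PET (x, y) with a cluster id; -1 marks noise.

    Plain DBSCAN with Manhattan distance (an assumption for this sketch).
    The neighbor search is O(n^2), so this is only suitable for toy input.
    """
    def neighbors(i):
        xi, yi = pets[i]
        return [j for j, (xj, yj) in enumerate(pets)
                if abs(xi - xj) + abs(yi - yj) <= eps]

    labels = [None] * len(pets)
    cluster = -1
    for i in range(len(pets)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1          # provisionally noise
            continue
        cluster += 1
        labels[i] = cluster
        queue = deque(seeds)
        while queue:
            j = queue.popleft()
            if labels[j] == -1:
                labels[j] = cluster  # noise becomes a border point
            if labels[j] is not None:
                continue             # already assigned, do not expand
            labels[j] = cluster
            nj = neighbors(j)
            if len(nj) >= min_pts:   # core point: keep growing the cluster
                queue.extend(nj)
    return labels


def candidate_loops(pets, labels):
    """Summarize each cluster as (left anchor, right anchor, PET count)."""
    loops = {}
    for (x, y), lab in zip(pets, labels):
        if lab == -1:
            continue
        loops.setdefault(lab, []).append((x, y))
    return {
        lab: ((min(x for x, _ in pts), max(x for x, _ in pts)),
              (min(y for _, y in pts), max(y for _, y in pts)),
              len(pts))
        for lab, pts in loops.items()
    }


# Hypothetical intra-chromosomal PETs: four clustered tags and two isolated ones.
pets = [(1000, 50000), (1020, 50030), (990, 49980),
        (1010, 50010), (5000, 90000), (20000, 20500)]
labels = dbscan_pets(pets, eps=100, min_pts=3)
print(labels)
print(candidate_loops(pets, labels))
```

In this sketch each dense cluster of PETs becomes one candidate loop whose anchors are the ranges spanned by the left and right tags, so no resolution has to be fixed in advance. A real pipeline would still need to test each candidate against a background model before calling it a loop.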

AVAILABILITY AND IMPLEMENTATION

cLoops with documentation and example data are freely available at: https://github.com/YaqiangCao/cLoops.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
