coiTAD: Detection of Topologically Associating Domains Based on Clustering of Circular Influence Features from Hi-C Data.

Affiliations

Department of Computer Science, University of Colorado, Colorado Springs, CO 80918, USA.

Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, 13001 East 17th Place, Aurora, CO 80045, USA.

Publication information

Genes (Basel). 2024 Sep 30;15(10):1293. doi: 10.3390/genes15101293.

Abstract

BACKGROUND/OBJECTIVES

Topologically associating domains (TADs) are key structural units of the genome, playing a crucial role in gene regulation. TAD boundaries are enriched with specific biological markers and have been linked to genetic diseases, making consistent TAD detection essential. However, accurately identifying TADs remains challenging due to the lack of a definitive validation method. This study aims to develop a novel algorithm, termed coiTAD, which introduces an innovative approach for preprocessing Hi-C data to improve TAD prediction. This method employs a proposed "circle of influence" (COI) approach derived from Hi-C contact matrices.

METHODS

The coiTAD algorithm is based on the creation of novel features derived from the circle of influence in input contact matrices, which are subsequently clustered using the HDBSCAN clustering algorithm. The TADs are extracted from the clustered features based on intra-cluster interactions, thereby providing a more accurate method for identifying TADs.
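
The pipeline described above (features from the contact matrix, clustering, then domain extraction) can be illustrated with a minimal sketch. The abstract does not define the circle-of-influence features or the extraction rule, so everything here is an assumption: `coi_features` collects the contact values inside a circle of a chosen radius around each diagonal cell, and `labels_to_domains` treats each maximal run of consecutive bins with the same cluster label as a candidate domain. Both function names, the `radius` and `min_size` parameters, and the toy matrix are hypothetical and are not taken from the coiTAD source code. The clustering step itself is left out; the labels could come from an HDBSCAN implementation such as `sklearn.cluster.HDBSCAN`.

```python
from typing import List, Sequence, Tuple

Matrix = Sequence[Sequence[float]]


def coi_features(contacts: Matrix, radius: int) -> List[List[float]]:
    """Build one feature vector per bin (hypothetical feature definition).

    For bin i, collect the contact values of all cells (r, c) that satisfy
    (r - i)^2 + (c - i)^2 <= radius^2. Cells outside the matrix count as 0.
    """
    n = len(contacts)
    offsets = [
        (dr, dc)
        for dr in range(-radius, radius + 1)
        for dc in range(-radius, radius + 1)
        if dr * dr + dc * dc <= radius * radius
    ]
    features = []
    for i in range(n):
        row = []
        for dr, dc in offsets:
            r, c = i + dr, i + dc
            inside = 0 <= r < n and 0 <= c < n
            row.append(float(contacts[r][c]) if inside else 0.0)
        features.append(row)
    return features


def labels_to_domains(
    labels: Sequence[int], min_size: int = 2
) -> List[Tuple[int, int]]:
    """Convert per-bin cluster labels into (start, end) bin intervals.

    Assumed rule: a domain is a maximal run of consecutive bins that share
    one label. The end index is inclusive. Label -1, which HDBSCAN uses
    for noise points, never forms a domain.
    """
    domains = []
    start = 0
    for i in range(1, len(labels) + 1):
        if i == len(labels) or labels[i] != labels[start]:
            if labels[start] != -1 and i - start >= min_size:
                domains.append((start, i - 1))
            start = i
    return domains


# Toy 6-bin contact matrix with two visible blocks (made-up values).
toy = [
    [9, 8, 7, 0, 0, 0],
    [8, 9, 8, 0, 0, 0],
    [7, 8, 9, 1, 0, 0],
    [0, 0, 1, 9, 8, 7],
    [0, 0, 0, 8, 9, 8],
    [0, 0, 0, 7, 8, 9],
]
features = coi_features(toy, radius=1)

# In a real run the labels would come from clustering `features`,
# e.g. with HDBSCAN. They are written by hand here for illustration.
example_labels = [0, 0, 0, 1, 1, 1]
domains = labels_to_domains(example_labels)
```

Keeping the clusterer outside these two functions means the feature step and the extraction step can be checked on their own, and a different clustering method can be substituted without touching either.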

RESULTS

Rigorous tests were conducted using both simulated and real Hi-C datasets. The algorithm's validation included analysis of boundary proteins such as H3K4me1, RNAPII, and CTCF. coiTAD consistently matched other TAD prediction methods.
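
One common way to examine predicted boundaries against markers such as CTCF is to count how many boundaries have a marker peak nearby. The sketch below shows that idea only; the abstract does not say how coiTAD's validation was computed, so the function name `boundary_support`, the `window` parameter, and all coordinates are hypothetical.

```python
from bisect import bisect_left
from typing import Sequence


def boundary_support(
    boundaries: Sequence[int], peaks: Sequence[int], window: int
) -> float:
    """Fraction of boundaries with at least one peak within `window` bp."""
    if not boundaries:
        return 0.0
    sorted_peaks = sorted(peaks)
    supported = 0
    for b in boundaries:
        # Index of the first peak at or after the left edge of the window.
        k = bisect_left(sorted_peaks, b - window)
        if k < len(sorted_peaks) and sorted_peaks[k] <= b + window:
            supported += 1
    return supported / len(boundaries)


# Made-up genomic positions in base pairs, for illustration only.
toy_boundaries = [100_000, 500_000, 900_000]
toy_peaks = [95_000, 510_000, 700_000]
support = boundary_support(toy_boundaries, toy_peaks, window=20_000)
```

With these toy inputs, two of the three boundaries have a peak within 20 kb, so `support` is 2/3.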

CONCLUSIONS

The coiTAD algorithm represents a novel approach for detecting TADs. At its core, the circle-of-influence methodology introduces an innovative strategy for preparing Hi-C data, enabling the assessment of interaction strengths between genomic regions. This approach facilitates a nuanced analysis that effectively captures structural variations within chromatin. Ultimately, the coiTAD algorithm enhances our understanding of chromatin organization and offers a robust tool for genomic research. The source code for coiTAD is publicly available, and the URL can be found in the Data Availability Statement section.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cae4/11507547/36e12b2e80f4/genes-15-01293-g001.jpg
