Reconstruction of 3D genome architecture via a two-stage algorithm.

Author information

Segal Mark R, Bengtsson Henrik L

Affiliation

Division of Bioinformatics, Department of Epidemiology and Biostatistics, University of California, 550 16th Street, San Francisco, 94158, CA, USA.

Publication information

BMC Bioinformatics. 2015 Nov 9;16:373. doi: 10.1186/s12859-015-0799-2.

Abstract

BACKGROUND

The three-dimensional (3D) configuration of chromosomes within the eukaryotic nucleus is an important factor for several cellular functions, including gene expression regulation, and has also been linked with cancer-causing translocation events. While visualization of such architecture remains limited to low resolutions, the ability to infer structures at increasing resolutions has been enabled by recently devised chromosome conformation capture techniques. In particular, when coupled with next-generation sequencing, such methods yield an inventory of genome-wide chromatin contacts or interactions. Various algorithms have been advanced to operate on such contact data to produce reconstructed 3D configurations. Studies have shown that these reconstructions can provide added value over raw interaction data with respect to downstream biological insights. However, only limited, low-resolution reconstructions have been realized for mammals due to computational bottlenecks.

RESULTS

Here we propose a two-stage algorithm to partially overcome these computational barriers. The central idea is to first apply existing reconstruction techniques to each chromosome individually, using intra-chromosomal contacts, and then to position these chromosome-level reconstructions relative to one another using inter-chromosomal contacts. This two-stage strategy represents a natural approach in view of the within- versus between-chromosome distribution of contacts. It can increase resolution approximately 20-fold for mouse and human. After describing the algorithm we present 3D architectures for mouse embryonic stem cells and human lymphoblastoid cells. We evaluate the impact of several factors on reconstruction reproducibility and explore a variety of sampling schemes. We further analyze replicate data at differing resolutions obtained from recently devised in situ Hi-C assays. In all instances we demonstrate insensitivity of the whole-genome 3D reconstruction obtained by the two-stage algorithm to the sampling strategy used.
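The two-stage idea can be illustrated with a minimal sketch. This is not the authors' implementation, and it rests on several assumptions that the abstract does not specify: contacts are converted to target distances with an inverse power law (exponent `alpha`), classical multidimensional scaling stands in for the "existing reconstruction techniques" of stage one, and stage two only translates each chromosome to a centroid inferred from averaged inter-chromosomal contacts (no rotation or rescaling). All function names are hypothetical.

```python
import numpy as np

def contacts_to_distances(contacts, alpha=1.0):
    """Map contact counts to target distances via d = c**(-alpha) (an assumption)."""
    c = np.asarray(contacts, dtype=float)
    d = np.zeros_like(c)
    seen = c > 0
    d[seen] = c[seen] ** (-alpha)
    # Crude choice for unobserved pairs: use the largest observed distance.
    d[~seen] = d[seen].max() if seen.any() else 1.0
    np.fill_diagonal(d, 0.0)
    return d

def classical_mds(dist, dim=3):
    """Classical (Torgerson) MDS: embed a distance matrix in `dim` dimensions."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering
    vals, vecs = np.linalg.eigh(gram)
    order = np.argsort(vals)[::-1][:dim]
    top = np.clip(vals[order], 0.0, None)  # drop negative eigenvalues
    coords = vecs[:, order] * np.sqrt(top)
    if coords.shape[1] < dim:  # fewer points than dimensions: pad with zeros
        coords = np.hstack([coords, np.zeros((n, dim - coords.shape[1]))])
    return coords

def two_stage(contacts, chrom_of_bin, alpha=1.0):
    """Stage 1: per-chromosome MDS. Stage 2: translate chromosomes into place."""
    contacts = np.asarray(contacts, dtype=float)
    chrom_of_bin = np.asarray(chrom_of_bin)
    chroms = np.unique(chrom_of_bin)
    members = [np.where(chrom_of_bin == ch)[0] for ch in chroms]

    # Stage 1: reconstruct each chromosome from its intra-chromosomal block.
    local = [
        classical_mds(contacts_to_distances(contacts[np.ix_(idx, idx)], alpha))
        for idx in members
    ]

    # Stage 2: average the inter-chromosomal contacts per chromosome pair,
    # embed the chromosomes as points, and shift each structure to its point.
    k = len(chroms)
    pair_mean = np.zeros((k, k))
    for a in range(k):
        for b in range(a + 1, k):
            block = contacts[np.ix_(members[a], members[b])]
            pair_mean[a, b] = pair_mean[b, a] = block.mean()
    centers = classical_mds(contacts_to_distances(pair_mean, alpha))

    coords = np.zeros((len(chrom_of_bin), 3))
    for i, idx in enumerate(members):
        coords[idx] = local[i] - local[i].mean(axis=0) + centers[i]
    return coords
```

Calling `two_stage(contacts, chrom_of_bin)` with a symmetric genome-wide contact matrix and a per-bin chromosome label returns one 3D coordinate per bin. The point of the split is that stage one only ever works on one chromosome's block at a time, and stage two works on a matrix with one row per chromosome, so no step operates on the full genome-wide matrix at full resolution.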

CONCLUSIONS

Our two-stage algorithm has the potential to significantly increase the resolution of 3D genome reconstructions. The improvements are such that we can progress from 1 Mb resolution to 100 kb resolution, notable since this latter value has been identified as critical to inferring topological domains in analyses performed on the contact (rather than 3D) level.
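To give a rough sense of why resolution is tied to computational cost, the sketch below counts loci and pairwise entries at 1 Mb and 100 kb bins. The chromosome lengths are hypothetical round numbers for a toy three-chromosome genome, not mouse or human values, and the helper names are illustrative only. The calculation does not attempt to reproduce the resolution gains reported in the abstract.

```python
def n_bins(length_bp, resolution_bp):
    """Number of fixed-width bins needed to cover a chromosome (ceiling division)."""
    return -(-length_bp // resolution_bp)

def pairwise_entries(n):
    """Number of distinct locus pairs among n loci."""
    return n * (n - 1) // 2

def problem_sizes(chrom_lengths_bp, resolution_bp):
    """Return (total loci, whole-genome pairs, pairs in the largest chromosome)."""
    bins = [n_bins(length, resolution_bp) for length in chrom_lengths_bp]
    return sum(bins), pairwise_entries(sum(bins)), pairwise_entries(max(bins))

# Hypothetical toy genome: three chromosomes of 200, 150 and 100 Mb.
toy_genome = [200_000_000, 150_000_000, 100_000_000]

for resolution in (1_000_000, 100_000):
    loci, whole, largest = problem_sizes(toy_genome, resolution)
    print(f"{resolution:>9} bp bins: {loci} loci, "
          f"{whole} genome-wide pairs, {largest} pairs in largest chromosome")
```

For this toy genome, moving from 1 Mb to 100 kb bins raises the number of loci tenfold and the number of genome-wide pairs roughly a hundredfold, while the largest single-chromosome problem stays several times smaller than the genome-wide one at either resolution.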

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0f23/4638111/142fc0ce7489/12859_2015_799_Fig1_HTML.jpg
