Division of Bioinformatics, Department of Epidemiology and Biostatistics, UCSF, 16th Street, San Francisco, 94158, USA.
BMC Bioinformatics. 2018 May 30;19(1):196. doi: 10.1186/s12859-018-2214-2.
Three-dimensional (3D) genome spatial organization is critical for numerous cellular functions, including transcription, while certain conformation-driven structural alterations are frequently oncogenic. Genome conformation had been difficult to elucidate, but the advent of chromatin conformation capture assays, notably Hi-C, has transformed understanding of chromatin architecture and yielded numerous biological insights. Although most of these findings have flowed from analysis of the proximity data produced by these assays, added value in generating 3D reconstructions has been demonstrated, deriving, in part, from superposing genomic features on the reconstruction. However, the advantages of 3D structure-based analyses are clearly conditional on the accuracy of the attendant reconstructions, which is difficult to assess. Proponents of competing reconstruction algorithms have evaluated their accuracy by recourse to simulation of toy structures and/or limited fluorescence in situ hybridization (FISH) imaging that features a handful of low-resolution probes. Accordingly, new methods of reconstruction accuracy assessment are needed.
Here we utilize two recently devised assays to develop methodology for assessing 3D reconstruction accuracy. Multiplex FISH increases the number of probes by an order of magnitude, and hence the number of inter-probe distances by two orders, providing sufficient information for structure-level evaluation via mean-squared deviations (MSD). Crucially, underpinning multiplex FISH applications are large numbers of coordinate-system-aligned replicates that provide the basis for a referent distribution for MSD statistics. Using this system, we show that reconstructions based on Hi-C data for IMR90 cells are accurate for some chromosomes but not others. The second new assay, genome architecture mapping, utilizes large numbers of thin cryosections to obtain a measure of proximity. We exploit the planarity of the cryosections, which is not used in inferring proximity, to obtain measures of reconstruction accuracy, with referents provided via resampling. Application to mouse embryonic stem cells shows reconstruction accuracies that vary by chromosome.
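To make the MSD-based evaluation concrete, the following is a minimal sketch, not the authors' implementation. It assumes each structure is an array of probe coordinates with one row per probe, that the Hi-C reconstruction is first superposed on the multiplex FISH mean structure by a similarity (Procrustes) transform because reconstructions carry arbitrary orientation and scale, and that the referent distribution is built by comparing each replicate with the mean of the remaining replicates. The function names (`n_pairs`, `similarity_align`, `msd`, `loo_referent`, `referent_fraction`) and the leave-one-out scheme are illustrative assumptions.

```python
import numpy as np


def n_pairs(n_probes):
    """Number of distinct inter-probe distances among n_probes probes."""
    return n_probes * (n_probes - 1) // 2


def similarity_align(mobile, target):
    """Superpose `mobile` on `target` by rotation, translation and uniform scaling.

    Both arrays have shape (n_probes, 3). Reflections are excluded.
    """
    mobile_centroid = mobile.mean(axis=0)
    target_centroid = target.mean(axis=0)
    P = mobile - mobile_centroid
    Q = target - target_centroid
    # SVD of the 3x3 cross-covariance matrix (Kabsch algorithm).
    U, S, Vt = np.linalg.svd(P.T @ Q)
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    D = np.array([1.0, 1.0, d])  # flip the last axis to avoid a reflection
    R = Vt.T @ np.diag(D) @ U.T
    scale = (S * D).sum() / (P ** 2).sum()
    return scale * (P @ R.T) + target_centroid


def msd(a, b):
    """Mean-squared deviation between two matched coordinate sets."""
    return float(np.mean(np.sum((a - b) ** 2, axis=1)))


def loo_referent(replicates):
    """Referent MSD values: each replicate versus the mean of the others.

    `replicates` has shape (n_replicates, n_probes, 3) and is assumed to be
    already expressed in a common coordinate system.
    """
    k = replicates.shape[0]
    total = replicates.sum(axis=0)
    return np.array([msd(rep, (total - rep) / (k - 1)) for rep in replicates])


def referent_fraction(recon_msd, referent):
    """Fraction of referent MSD values at least as large as the reconstruction's."""
    return float(np.mean(referent >= recon_msd))
```

Under these assumptions, a reconstruction would be judged by aligning it to `replicates.mean(axis=0)` with `similarity_align`, computing `msd` against that mean, and locating the result within `loo_referent(replicates)`. A reconstruction whose MSD lies within the bulk of the referent distribution is about as close to the mean structure as an individual replicate. The helper `n_pairs` illustrates the probe-count argument: 3 probes give 3 distances, whereas 30 probes give 435.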
We have developed methods for assessing the accuracy of 3D genome reconstructions that exploit features of the recently developed multiplex FISH and genome architecture mapping assays. These approaches can help overcome the absence of gold standards for making such assessments, which are important in view of the considerable uncertainties surrounding 3D genome reconstruction.