
HICANCER: accurate and complete cancer genome phasing with Hi-C reads.

Affiliation

Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, 518120, China.

Publication information

Sci Rep. 2021 Mar 23;11(1):6609. doi: 10.1038/s41598-021-86104-6.

Abstract

Because cancer genomes are highly complex, it has so far been too difficult to generate a complete cancer genome map that contains the sequence of every DNA molecule. Phasing each chromosome of a cancer genome into two haplotypes according to germline mutations offers a suboptimal but practical way to understand the cancer genome. Phasing a cancer genome is itself challenging, however, because of limits in experimental and computational technologies. In recent years, Hi-C data has been widely used for phasing because of its long-range linkage information, and it provides an opportunity to solve the cancer genome phasing problem. Existing Hi-C based phasing methods cannot be applied to cancer genomes directly, because somatic mutations such as somatic SNPs, copy number variations, and structural variations greatly reduce the correctness and completeness of phasing. Here, we propose HICANCER, a new Hi-C based pipeline for phasing cancer genomes. HICANCER handles the different kinds of somatic mutations and variations, and takes advantage of allelic copy number imbalance and linkage disequilibrium to improve the correctness and completeness of phasing. In our experiments on the K562 and KBM-7 cell lines, HICANCER generated very high-quality chromosome-level haplotypes for cancer genomes using only Hi-C data.
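To make the idea of phasing from long-range linkage concrete, the following is a minimal sketch and not HICANCER's actual algorithm, which the abstract does not describe in detail. It assumes a hypothetical input format in which each pair of heterozygous SNPs carries two counts of Hi-C read pairs: those supporting the two SNP alleles being on the same haplotype (cis) and those supporting opposite haplotypes (trans). The function name `phase_snps` and the tuple layout are illustrative assumptions. Haplotype labels are propagated across the linkage graph by breadth-first search, and conflicting evidence is not reconciled.

```python
from collections import defaultdict, deque


def phase_snps(links):
    """Assign each SNP to haplotype 0 or 1 from pairwise linkage counts.

    links: iterable of (snp_a, snp_b, cis_count, trans_count) tuples.
    This tuple layout is a hypothetical input format for illustration.
    """
    graph = defaultdict(list)
    for snp_a, snp_b, cis_count, trans_count in links:
        weight = cis_count - trans_count
        if weight == 0:
            continue  # equal support for both configurations: uninformative
        graph[snp_a].append((snp_b, weight))
        graph[snp_b].append((snp_a, weight))

    phase = {}
    for start in sorted(graph):
        if start in phase:
            continue
        phase[start] = 0  # arbitrary seed label for each connected block
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbor, weight in graph[node]:
                if neighbor in phase:
                    continue  # first label wins; conflicts are ignored here
                # Positive weight: same haplotype; negative: opposite one.
                phase[neighbor] = phase[node] if weight > 0 else 1 - phase[node]
                queue.append(neighbor)
    return phase


example_links = [
    ("snp1", "snp2", 5, 1),  # mostly cis evidence
    ("snp2", "snp3", 0, 4),  # mostly trans evidence
    ("snp1", "snp3", 1, 3),  # mostly trans evidence
]
result = phase_snps(example_links)
print(result)
```

In this toy input, `snp1` and `snp2` end up on one haplotype and `snp3` on the other. A real cancer genome would break the assumptions behind such a simple traversal, since somatic SNPs, copy number variations, and structural variations distort the linkage counts, which is the difficulty the abstract says existing methods face.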
