Machine Learning Reveals the Diversity of Human 3D Chromatin Contact Patterns.

Affiliations

Biomedical Informatics Graduate Program, University of California San Francisco, San Francisco, CA, USA.

Bakar Computational Health Sciences Institute, University of California, San Francisco, CA, USA.

Publication information

Mol Biol Evol. 2024 Oct 4;41(10). doi: 10.1093/molbev/msae209.

Abstract

Understanding variation in chromatin contact patterns across diverse humans is critical for interpreting noncoding variants and their effects on gene expression and phenotypes. However, experimental determination of chromatin contact patterns across large samples is prohibitively expensive. To overcome this challenge, we develop and validate a machine learning method to quantify the variation in 3D chromatin contacts at 2-kilobase resolution from genome sequence alone. We apply this approach to thousands of human genomes from the 1000 Genomes Project and the inferred hominin ancestral genome. While patterns of 3D contact divergence genome-wide are qualitatively similar to patterns of sequence divergence, we find substantial differences in 3D divergence and sequence divergence in local 1-megabase genomic windows. In particular, we identify 392 windows with significantly greater 3D divergence than expected from sequence. Moreover, for 31% of genomic windows, a single individual has a rare divergent 3D contact map pattern. Using in silico mutagenesis, we find that most single-nucleotide sequence changes do not result in changes to 3D chromatin contacts. However, in windows with substantial 3D divergence, just one or a few variants can lead to divergent 3D chromatin contacts without the individuals carrying those variants having high sequence divergence. In summary, inferring 3D chromatin contact maps across human populations reveals variable contact patterns. We anticipate that these genetically diverse maps of 3D chromatin contact will provide a reference for future work on the function and evolution of 3D chromatin contact variation across human populations.
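To make the comparison between 3D divergence and sequence divergence concrete, here is a minimal Python sketch. The abstract does not define either measure, so the definitions below are assumptions made for illustration: 3D divergence is taken as one minus the Spearman rank correlation between the upper triangles of two predicted contact maps for the same window, and sequence divergence as the fraction of differing positions between two aligned sequences. All function names are hypothetical and are not taken from the paper.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


def upper_triangle(contact_map):
    """Flatten the entries above the diagonal of a square matrix."""
    n = len(contact_map)
    return [contact_map[i][j] for i in range(n) for j in range(i + 1, n)]


def divergence_3d(map_a, map_b):
    """Assumed measure: 1 - Spearman correlation of two contact maps."""
    ranks_a = average_ranks(upper_triangle(map_a))
    ranks_b = average_ranks(upper_triangle(map_b))
    return 1.0 - pearson(ranks_a, ranks_b)


def sequence_divergence(seq_a, seq_b):
    """Assumed measure: fraction of mismatched positions in aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b)) / len(seq_a)


# Toy 3x3 contact maps standing in for predicted maps of one genomic window.
individual = [[0, 1, 2], [1, 0, 3], [2, 3, 0]]
reference = [[0, 3, 2], [3, 0, 1], [2, 1, 0]]
print(divergence_3d(individual, reference))
print(sequence_divergence("ACGT", "ACGA"))
```

Under these assumed definitions, a window of the kind the abstract highlights would show a high `divergence_3d` between two individuals while their `sequence_divergence` stays low, which is the pattern expected when one or a few variants reshape the contact map.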

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7ff1/11523124/2fcf94ecd306/msae209f1.jpg
