InsuLock: A Weakly Supervised Learning Approach for Accurate Insulator Prediction, and Variant Impact Quantification.

Affiliations

Computer Science Department, University of California, Irvine, CA 92697, USA.

Mathematical, Computational & Systems Biology, University of California, Irvine, CA 92697, USA.

Publication information

Genes (Basel). 2022 Mar 30;13(4):621. doi: 10.3390/genes13040621.

DOI: 10.3390/genes13040621
PMID: 35456427
Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC9026820/
Abstract

Mapping chromatin insulator loops is crucial to investigating genome evolution, elucidating critical biological functions, and ultimately quantifying variant impact in diseases. However, chromatin conformation profiling assays are usually expensive, time-consuming, and may report fuzzy insulator annotations with low resolution. Therefore, we propose a weakly supervised deep learning method, InsuLock, to address these challenges. Specifically, InsuLock first utilizes a Siamese neural network to predict the existence of insulators within a given region (up to 2000 bp). Then, it uses an object detection module for precise insulator boundary localization via gradient-weighted class activation mapping (~40 bp resolution). Finally, it quantifies variant impacts by comparing the insulator score differences between the wild-type and mutant alleles. We applied InsuLock on various bulk and single-cell datasets for performance testing and benchmarking. We showed that it outperformed existing methods with an AUROC of ~0.96 and condensed insulator annotations to ~2.5% of their original size while still demonstrating higher conservation scores and better motif enrichments. Finally, we utilized InsuLock to make cell-type-specific variant impacts from brain scATAC-seq data and identified a schizophrenia GWAS variant disrupting an insulator loop proximal to a known risk gene, indicating a possible new mechanism of action for the disease.
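
The variant-impact step described in the abstract (comparing insulator scores between the wild-type and mutant alleles) can be illustrated with a minimal sketch. This is not the InsuLock implementation: the scoring function below is a hypothetical stand-in for a trained model, and the names `toy_insulator_score`, `variant_impact`, and the toy motif `CCCTC` are assumptions made only for illustration.

```python
import numpy as np

BASES = "ACGT"


def one_hot(seq):
    """Encode a DNA string as a (length, 4) one-hot matrix.

    Raises ValueError for characters outside A/C/G/T.
    """
    idx = [BASES.index(b) for b in seq.upper()]
    return np.eye(4)[idx]


# Hypothetical stand-in for a trained model: a toy motif matcher.
# A real predictor would be a neural network, not a fixed motif.
TOY_MOTIF = one_hot("CCCTC")


def toy_insulator_score(seq):
    """Return the best fraction of motif positions matched by any window."""
    x = one_hot(seq)
    k = len(TOY_MOTIF)
    window_scores = [
        np.sum(x[i:i + k] * TOY_MOTIF) for i in range(len(seq) - k + 1)
    ]
    return float(max(window_scores)) / k


def variant_impact(seq, pos, alt, score_fn=toy_insulator_score):
    """Score difference (mutant minus wild type) for a single-base change.

    `pos` is a 0-based index into `seq`; `alt` is the alternate base.
    """
    mutant = seq[:pos] + alt + seq[pos + 1:]
    return score_fn(mutant) - score_fn(seq)


wild_type = "AAGCCCTCTTGA"
# A substitution inside the toy motif lowers the score.
print(variant_impact(wild_type, pos=5, alt="A"))
# A substitution outside the toy motif leaves the score unchanged.
print(variant_impact(wild_type, pos=0, alt="T"))
```

A negative difference means the alternate allele weakens the predicted insulator signal under this toy scorer. Any model that maps a sequence to a score could be passed as `score_fn` in its place.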

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d8a5/9026820/c0a6c1545861/genes-13-00621-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d8a5/9026820/5cf52bd98523/genes-13-00621-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d8a5/9026820/a835b8270b70/genes-13-00621-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d8a5/9026820/b43d7ad0fbf8/genes-13-00621-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d8a5/9026820/fd2d95727dba/genes-13-00621-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d8a5/9026820/6fa3e2aeafca/genes-13-00621-g005.jpg

Similar articles

1
InsuLock: A Weakly Supervised Learning Approach for Accurate Insulator Prediction, and Variant Impact Quantification.
Genes (Basel). 2022 Mar 30;13(4):621. doi: 10.3390/genes13040621.
2
scEpiLock: A Weakly Supervised Learning Framework for cis-Regulatory Element Localization and Variant Impact Quantification for Single-Cell Epigenetic Data.
Biomolecules. 2022 Jun 23;12(7):874. doi: 10.3390/biom12070874.
3
A sequence-based deep learning approach to predict CTCF-mediated chromatin loop.
Brief Bioinform. 2021 Sep 2;22(5). doi: 10.1093/bib/bbab031.
4
DECODE: a Deep-learning framework for Condensing enhancers and refining boundaries with large-scale functional assays.
Bioinformatics. 2021 Jul 12;37(Suppl_1):i280-i288. doi: 10.1093/bioinformatics/btab283.
5
The CCCTC Binding Factor, CTRL2, Modulates Heterochromatin Deposition and the Establishment of Herpes Simplex Virus 1 Latency .
J Virol. 2019 Jun 14;93(13). doi: 10.1128/JVI.00415-19. Print 2019 Jul 1.
6
CLNN-loop: a deep learning model to predict CTCF-mediated chromatin loops in the different cell lines and CTCF-binding sites (CBS) pair types.
Bioinformatics. 2022 Sep 30;38(19):4497-4504. doi: 10.1093/bioinformatics/btac575.
7
[Combinatorial CRISPR inversions of CTCF sites in cluster reveal complex insulator function].
Yi Chuan. 2021 Aug 20;43(8):758-774. doi: 10.16288/j.yczz.21-131.
8
Deep Learning of Sequence Patterns for CCCTC-Binding Factor-Mediated Chromatin Loop Formation.
J Comput Biol. 2021 Feb;28(2):133-145. doi: 10.1089/cmb.2020.0225. Epub 2020 Nov 25.
9
CTCF Binding Sites in the Herpes Simplex Virus 1 Genome Display Site-Specific CTCF Occupation, Protein Recruitment, and Insulator Function.
J Virol. 2018 Mar 28;92(8). doi: 10.1128/JVI.00156-18. Print 2018 Apr 15.
10
PARP1 Stabilizes CTCF Binding and Chromatin Structure To Maintain Epstein-Barr Virus Latency Type.
J Virol. 2018 Aug 29;92(18). doi: 10.1128/JVI.00755-18. Print 2018 Sep 15.
