Patterns of transcription factor binding and epigenome at promoters allow interpretable predictability of multiple functions of non-coding and coding genes.

Authors

Chandra Omkar, Sharma Madhu, Pandey Neetesh, Jha Indra Prakash, Mishra Shreya, Kong Say Li, Kumar Vibhor

Affiliations

Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Ph-III, New Delhi, India.

Genome Institute of Singapore, Agency for Science Technology and Research, Singapore, Singapore.

Publication information

Comput Struct Biotechnol J. 2023 Jul 14;21:3590-3603. doi: 10.1016/j.csbj.2023.07.014. eCollection 2023.

DOI: 10.1016/j.csbj.2023.07.014
PMID: 37520281
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10371796/
Abstract

Understanding the biological roles of all genes through experimental methods alone is challenging. A computational approach with reliable interpretability is needed to infer the function of genes, particularly for non-coding RNAs. We analyzed genomic features that are present across both coding and non-coding genes, namely transcription factor (TF) and cofactor ChIP-seq (n = 823), histone modification ChIP-seq (n = 621), cap analysis of gene expression (CAGE) tags (n = 255), and DNase hypersensitivity profiles (n = 255), to predict ontology-based functions of genes. Our approach to gene function prediction was reliable (>90% balanced accuracy) for 486 gene-sets. PubMed abstract mining and CRISPR screens supported the inferred associations of genes with the biological functions for which our method had high accuracy. Further analysis revealed that TF-binding patterns at promoters have high predictive strength for multiple functions. TF-binding patterns at the promoter add an unexplored dimension of explainable regulatory aspects of genes and their functions. We therefore performed a comprehensive analysis of the functional specificity of TF-binding patterns at promoters and used them to cluster functions, revealing many latent groups of gene-sets involved in common major cellular processes. We also showed how our approach can infer the functions of non-coding genes from CRISPR screens of coding genes, and validated these inferences using a long non-coding RNA CRISPR screen. These results, obtained with gene-sets from CRISPR screens, demonstrate the generality of our approach. Overall, our approach opens an avenue for predicting the involvement of non-coding genes in various functions.
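
The reliability criterion in the abstract is stated as balanced accuracy, which is the mean of sensitivity and specificity. The following is a minimal Python sketch of that metric only; the function name `balanced_accuracy` and the toy labels are illustrative assumptions and are not taken from the paper's code or data. It illustrates why the metric suits gene-set membership, where genes inside a set are usually far outnumbered by genes outside it.

```python
def balanced_accuracy(y_true, y_pred):
    """Return the mean of sensitivity and specificity for binary labels (1/0)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # members correctly called
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # members missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-members correctly called
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # non-members wrongly called
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


# Hypothetical toy example: 2 genes belong to a gene-set, 8 do not.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
# A predictor that recovers only one of the two member genes.
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(plain_accuracy)                     # 0.9
print(balanced_accuracy(y_true, y_pred))  # 0.75
```

In this toy case plain accuracy is 0.9 even though half of the member genes are missed, while balanced accuracy is 0.75, so a threshold on balanced accuracy is the stricter test when the classes are imbalanced.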

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fb8a/10371796/2094d1d51f6a/ga1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fb8a/10371796/46804005708d/gr1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fb8a/10371796/1d5b3e1fd15a/gr2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fb8a/10371796/4bdddf18703b/gr3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fb8a/10371796/74e48cab80c0/gr4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fb8a/10371796/beac3f0fdbfd/gr5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fb8a/10371796/f973dcda012b/gr6.jpg