ExTRI: Extraction of transcription regulation interactions from literature.

Authors

Vazquez Miguel, Krallinger Martin, Leitner Florian, Kuiper Martin, Valencia Alfonso, Laegreid Astrid

Affiliations

Barcelona Supercomputing Center, Barcelona, Spain.

Publication information

Biochim Biophys Acta Gene Regul Mech. 2022 Jan;1865(1):194778. doi: 10.1016/j.bbagrm.2021.194778. Epub 2021 Dec 5.

DOI:10.1016/j.bbagrm.2021.194778
PMID:34875418
Abstract

The regulation of gene transcription by transcription factors is a fundamental biological process, yet the relations between transcription factors (TF) and their target genes (TG) are still only sparsely covered in databases. Text-mining tools can offer broad and complementary solutions to help locate and extract mentions of these biological relationships in articles. We have generated ExTRI, a knowledge graph of TF-TG relationships, by applying a high recall text-mining pipeline to MedLine abstracts identifying over 100,000 candidate sentences with TF-TG relations. Validation procedures indicated that about half of the candidate sentences contain true TF-TG relationships. Post-processing identified 53,000 high confidence sentences containing TF-TG relationships, with a cross-validation F1-score close to 75%. The resulting collection of TF-TG relationships covers 80% of the relations annotated in existing databases. It adds 11,000 other potential interactions, including relationships for ~100 TFs currently not in public TF-TG relation databases. The high confidence abstract sentences contribute 25,000 literature references not available from other resources and offer a wealth of direct pointers to functional aspects of the TF-TG interactions. Our compiled resource encompassing ExTRI together with publicly available resources delivers literature-derived TF-TG interactions for more than 900 of the 1500-1600 proteins considered to function as specific DNA binding TFs. The obtained result can be used by curators, for network analysis and modelling, for causal reasoning or knowledge graph mining approaches, or serve to benchmark text mining strategies.
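The abstract reports three kinds of figures: an F1-score for the sentence classifier, the share of database-annotated relations that the text-mined set recovers, and the number of relations found only by text mining. The sketch below shows how such quantities are typically computed. It is not the authors' pipeline: the function names and the placeholder TF-TG pairs are assumptions made up for illustration, and none of the numbers come from ExTRI.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall from raw counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def compare_resources(text_mined: set, curated: set):
    """Return (coverage of the curated set, relations found only by text mining)."""
    shared = text_mined & curated
    coverage = len(shared) / len(curated) if curated else 0.0
    novel = text_mined - curated
    return coverage, novel


# Placeholder (TF, TG) pairs; hypothetical names, not real ExTRI data.
curated = {
    ("TF_A", "GENE_1"), ("TF_B", "GENE_2"), ("TF_C", "GENE_3"),
    ("TF_D", "GENE_4"), ("TF_D", "GENE_5"),
}
text_mined = {
    ("TF_A", "GENE_1"), ("TF_B", "GENE_2"), ("TF_C", "GENE_3"),
    ("TF_D", "GENE_4"), ("TF_E", "GENE_6"),
}

coverage, novel = compare_resources(text_mined, curated)
toy_f1 = f1_score(tp=3, fp=1, fn=1)

print(f"coverage of curated relations: {coverage:.0%}")
print(f"relations only in the text-mined set: {sorted(novel)}")
print(f"toy F1: {toy_f1:.2f}")
```

Treating each relation as a (TF, TG) tuple in a set keeps the comparison to plain set intersection and difference. A real comparison would first need gene and protein names normalized to shared identifiers, which this sketch does not attempt.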


Similar articles

1. ExTRI: Extraction of transcription regulation interactions from literature.
   Biochim Biophys Acta Gene Regul Mech. 2022 Jan;1865(1):194778. doi: 10.1016/j.bbagrm.2021.194778. Epub 2021 Dec 5.
2. Finding biomarkers in non-model species: literature mining of transcription factors involved in bovine embryo development.
   BioData Min. 2012 Aug 29;5(1):12. doi: 10.1186/1756-0381-5-12.
3. miRiaD: A Text Mining Tool for Detecting Associations of microRNAs with Diseases.
   J Biomed Semantics. 2016 Apr 29;7(1):9. doi: 10.1186/s13326-015-0044-y.
4. Latent Semantic Indexing of PubMed abstracts for identification of transcription factor candidates from microarray derived gene sets.
   BMC Bioinformatics. 2011 Oct 18;12 Suppl 10(Suppl 10):S19. doi: 10.1186/1471-2105-12-S10-S19.
5. A systems biology approach to the global analysis of transcription factors in colorectal cancer.
   BMC Cancer. 2012 Aug 1;12:331. doi: 10.1186/1471-2407-12-331.
6. ModEx: A text mining system for extracting mode of regulation of transcription factor-gene regulatory interaction.
   J Biomed Inform. 2020 Feb;102:103353. doi: 10.1016/j.jbi.2019.103353. Epub 2019 Dec 16.
7. ORTI: An Open-Access Repository of Transcriptional Interactions for Interrogating Mammalian Gene Expression Data.
   PLoS One. 2016 Oct 10;11(10):e0164535. doi: 10.1371/journal.pone.0164535. eCollection 2016.
8. Multiple independent analyses reveal only transcription factors as an enriched functional class associated with microRNAs.
   BMC Syst Biol. 2012 Jul 23;6:90. doi: 10.1186/1752-0509-6-90.
9. Network motif-based identification of transcription factor-target gene relationships by integrating multi-source biological data.
   BMC Bioinformatics. 2008 Apr 21;9:203. doi: 10.1186/1471-2105-9-203.
10. Computer-assisted curation of a human regulatory core network from the biological literature.
    Bioinformatics. 2015 Apr 15;31(8):1258-66. doi: 10.1093/bioinformatics/btu795. Epub 2014 Nov 29.

Cited by

1. Multiomics Signature Reveals Network Regulatory Mechanisms in a CRC Continuum.
   Int J Mol Sci. 2025 Jul 23;26(15):7077. doi: 10.3390/ijms26157077.
2. BioGAN: Enhancing Transcriptomic Data Generation with Biological Knowledge.
   Bioengineering (Basel). 2025 Jun 16;12(6):658. doi: 10.3390/bioengineering12060658.
3. Dominant interfering CARD11 variants disrupt JNK signaling to promote GATA3 expression in T cells.
   J Exp Med. 2025 Jun 2;222(6). doi: 10.1084/jem.20240272. Epub 2025 Mar 20.
4. Specifying cellular context of transcription factor regulons for exploring context-specific gene regulation programs.
   NAR Genom Bioinform. 2025 Jan 7;7(1):lqae178. doi: 10.1093/nargab/lqae178. eCollection 2025 Mar.
5. Integration of chromosome locations and functional aspects of enhancers and topologically associating domains in knowledge graphs enables versatile queries about gene regulation.
   Nucleic Acids Res. 2024 Aug 27;52(15):e69. doi: 10.1093/nar/gkae566.
6. Specifying cellular context of transcription factor regulons for exploring context-specific gene regulation programs.
   bioRxiv. 2024 Jan 1:2023.12.31.573765. doi: 10.1101/2023.12.31.573765.
7. Overview of DrugProt task at BioCreative VII: data and methods for large-scale text mining and knowledge graph generation of heterogenous chemical-protein relations.
   Database (Oxford). 2023 Nov 28;2023. doi: 10.1093/database/baad080.
8. Expanding the coverage of regulons from high-confidence prior knowledge for accurate estimation of transcription factor activities.
   Nucleic Acids Res. 2023 Nov 10;51(20):10934-10949. doi: 10.1093/nar/gkad841.
9. SIGNOR 3.0, the SIGnaling network open resource 3.0: 2022 update.
   Nucleic Acids Res. 2023 Jan 6;51(D1):D631-D637. doi: 10.1093/nar/gkac883.
10. TFLink: an integrated gateway to access transcription factor-target gene interactions for multiple species.
    Database (Oxford). 2022 Sep 16;2022. doi: 10.1093/database/baac083.