Identification of histone modifications in biomedical text for supporting epigenomic research.

Author information

Kolárik Corinna, Klinger Roman, Hofmann-Apitius Martin

Affiliation

Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Schloss Birlinghoven, D-53754 Sankt Augustin, Germany.

Publication information

BMC Bioinformatics. 2009 Jan 30;10 Suppl 1(Suppl 1):S28. doi: 10.1186/1471-2105-10-S1-S28.

DOI:10.1186/1471-2105-10-S1-S28

PMID:19208128

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC2648793/

Abstract

BACKGROUND

Posttranslational modifications of histones influence chromatin structure and thereby take part in the regulation of gene expression. Certain histone modification patterns, distributed over the genome, are connected to cell and tissue differentiation and to the adaptation of organisms to their environment. Abnormal changes, in contrast, influence the development of disease states such as cancer. The mechanisms that regulate histone modification, and the functions of these modifications, are the subject of epigenomic investigation and are still not completely understood. Text provides a rich source of knowledge on epigenomics, and on histone modifications in particular: it contains information about experimental studies, the conditions used, and the results. To our knowledge, no approach for identifying histone modifications in text has been published so far.

RESULTS

We have developed an approach for identifying histone modifications in biomedical literature with Conditional Random Fields (CRF) and for resolving the recognized term variants by term standardization. For term identification, F1 measures of 0.84 (10-fold cross-validation on the training corpus) and 0.81 (independent test corpus) were obtained. The standardization correctly transformed 96% of the terms from the training corpus and 98% from the test corpus. Because no terminology exhaustively covers specific histone modification types, we developed a histone modification term hierarchy for use in a semantic text retrieval system.
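To make the standardization step concrete, the following is a minimal sketch of how term variants might be mapped to one canonical form and then expanded into broader classes for retrieval. The abstract does not describe the actual rules or the actual hierarchy, so everything here is an assumption for illustration: the canonical form (`H3K4me3` style), the vocabularies, the regular expressions, and the function names `normalize` and `broader_terms` are all hypothetical.

```python
import re

# Illustrative vocabularies only; the paper's real rules are not given in the abstract.
RESIDUES = {"lysine": "K", "lys": "K", "k": "K",
            "arginine": "R", "arg": "R", "r": "R",
            "serine": "S", "ser": "S", "s": "S",
            "threonine": "T", "thr": "T", "t": "T"}
MOD_TYPES = {"methyl": "me", "acetyl": "ac", "phosphoryl": "ph"}
MOD_NAMES = {"me": "methylation", "ac": "acetylation", "ph": "phosphorylation"}
COUNTS = {"mono": "1", "di": "2", "tri": "3"}

# Compact variants such as "H3K4me3" or "H3-K4 me3".
COMPACT = re.compile(
    r"\b(h2a|h2b|h1|h3|h4)[\s-]*([krst])[\s-]*(\d+)[\s-]*(me[123]?|ac|ph)\b")
# Components of verbose variants such as "trimethylation of histone H3 lysine 4".
HISTONE = re.compile(r"\b(h2a|h2b|h1|h3|h4)")
RESIDUE = re.compile(
    r"(lysine|lys|k|arginine|arg|r|serine|ser|s|threonine|thr|t)[\s-]*(\d+)")
MOD = re.compile(r"(mono|di|tri)?-?(methyl|acetyl|phosphoryl)")


def normalize(term):
    """Return a canonical form like 'H3K4me3', or None if the term is not recognized."""
    text = term.lower()
    m = COMPACT.search(text)
    if m:
        histone, residue, position, mod = m.groups()
        return f"{histone.upper()}{residue.upper()}{position}{mod}"
    h, r, mod = HISTONE.search(text), RESIDUE.search(text), MOD.search(text)
    if not (h and r and mod):
        return None
    # Only methylation carries a multiplicity (me1/me2/me3) in this sketch.
    count = COUNTS.get(mod.group(1), "") if mod.group(2) == "methyl" else ""
    return (f"{h.group(1).upper()}{RESIDUES[r.group(1)]}{r.group(2)}"
            f"{MOD_TYPES[mod.group(2)]}{count}")


def broader_terms(canonical):
    """List a canonical term and its broader classes, most specific first (hypothetical hierarchy)."""
    m = re.fullmatch(r"(H\d[AB]?)([KRST])(\d+)(me|ac|ph)(\d?)", canonical)
    if not m:
        return []
    histone, residue, position, mod, count = m.groups()
    terms = [canonical]
    if count:
        terms.append(f"{histone}{residue}{position}{mod}")
    terms += [f"{histone} {MOD_NAMES[mod]}",
              f"histone {MOD_NAMES[mod]}",
              "histone modification"]
    return terms
```

Under these assumptions, `normalize("trimethylation of histone H3 lysine 4")` and `normalize("H3-K4 me3")` yield the same canonical string, and a retrieval system could index each document under every entry of `broader_terms(...)` so that a query for a general class also finds mentions of specific modifications.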

CONCLUSION

The developed approach substantially improves the retrieval of articles describing histone modifications. Because text contains contextual information about the studies and experiments performed, identifying histone modifications is a basis for literature-based knowledge discovery and hypothesis generation that can accelerate epigenomic research.
