Transforming the study of organisms: Phenomic data models and knowledge bases.

Author information

Environmental and Molecular Toxicology, Oregon State University, Corvallis, Oregon, United States of America.

Ronin Institute for Independent Scholarship, Montclair, New Jersey, United States of America.

Publication information

PLoS Comput Biol. 2020 Nov 24;16(11):e1008376. doi: 10.1371/journal.pcbi.1008376. eCollection 2020 Nov.

Abstract

The rapidly decreasing cost of gene sequencing has resulted in a deluge of genomic data from across the tree of life; however, outside a few model organism databases, genomic data are limited in their scientific impact because they are not accompanied by computable phenomic data. The majority of phenomic data are contained in countless small, heterogeneous phenotypic data sets that are very difficult or impossible to integrate at scale because of variable formats, lack of digitization, and linguistic problems. One powerful solution is to represent phenotypic data using data models with precise, computable semantics, but adoption of semantic standards for representing phenotypic data has been slow, especially in biodiversity and ecology. Some phenotypic and trait data are available in a semantic language from knowledge bases, but these are often not interoperable. In this review, we will compare and contrast existing ontology and data models, focusing on nonhuman phenotypes and traits. We discuss barriers to integration of phenotypic data and make recommendations for developing an operationally useful, semantically interoperable phenotypic data ecosystem.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8766/7685442/4a3bf9d1288b/pcbi.1008376.g001.jpg
