Genotype and phenotype data standardization, utilization and integration in the big data era for agricultural sciences.

Affiliations

Molecular and Digital Breeding, New Cultivar Innovation, The New Zealand Institute for Plant and Food Research Limited, 120 Mt Albert Road, Auckland 1025, New Zealand.

Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA.

Publication information

Database (Oxford). 2023 Dec 11;2023. doi: 10.1093/database/baad088.

DOI: 10.1093/database/baad088
PMID: 38079567
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10712715/

Abstract

Large-scale genotype and phenotype data have been increasingly generated to identify genetic markers, understand gene function and evolution and facilitate genomic selection. These datasets hold immense value for both current and future studies, as they are vital for crop breeding, yield improvement and overall agricultural sustainability. However, integrating these datasets from heterogeneous sources presents significant challenges and hinders their effective utilization. We established the Genotype-Phenotype Working Group in November 2021 as a part of the AgBioData Consortium (https://www.agbiodata.org) to review current data types and resources that support archiving, analysis and visualization of genotype and phenotype data to understand the needs and challenges of the plant genomic research community. For 2021-22, we identified different types of datasets and examined metadata annotations related to experimental design/methods/sample collection, etc. Furthermore, we thoroughly reviewed publicly funded repositories for raw and processed data as well as secondary databases and knowledgebases that enable the integration of heterogeneous data in the context of the genome browser, pathway networks and tissue-specific gene expression. Based on our survey, we recommend a need for (i) additional infrastructural support for archiving many new data types, (ii) development of community standards for data annotation and formatting, (iii) resources for biocuration and (iv) analysis and visualization tools to connect genotype data with phenotype data to enhance knowledge synthesis and to foster translational research. Although this paper only covers the data and resources relevant to the plant research community, we expect that similar issues and needs are shared by researchers working on animals. Database URL: https://www.agbiodata.org.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c36f/10712715/c71bda7f70e2/baad088f1.jpg
