AgTC and AgETL: open-source tools to enhance data collection and management for plant science research.

Authors

Vargas-Rojas Luis, Ting To-Chia, Rainey Katherine M, Reynolds Matthew, Wang Diane R

Affiliations

Department of Agronomy, Purdue University, West Lafayette, IN, United States.

Wheat Physiology Group, International Maize and Wheat Improvement Center (CIMMYT), Texcoco, Mexico.

Publication information

Front Plant Sci. 2024 Feb 21;15:1265073. doi: 10.3389/fpls.2024.1265073. eCollection 2024.

DOI: 10.3389/fpls.2024.1265073
PMID: 38450403
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10915008/

Abstract

Advancements in phenotyping technology have enabled plant science researchers to gather large volumes of information from their experiments, especially those that evaluate multiple genotypes. To fully leverage these complex and often heterogeneous data sets (i.e., those that differ in format and structure), scientists must invest considerable time in data processing, and data management has emerged as a major barrier for downstream application. Here, we propose a pipeline to enhance data collection, processing, and management from plant science studies, comprising two newly developed open-source programs. The first, called AgTC, is a series of programming functions that generates comma-separated values file templates to collect data in a standard format using either a lab-based computer or a mobile device. The second series of functions, AgETL, executes steps for an extract, transform, and load (ETL) data integration process where data are extracted from heterogeneously formatted files, transformed to meet standard criteria, and loaded into a database. There, data are stored and can be accessed for data analysis-related processes, including dynamic data visualization through web-based tools. Both AgTC and AgETL are flexible for application across plant science experiments without programming knowledge on the part of the domain scientist, and their functions are executed on Jupyter Notebook, a browser-based interactive development environment. Additionally, all parameters are easily customized from central configuration files written in the human-readable YAML format. Using three experiments from research laboratories in university and non-government organization (NGO) settings as test cases, we demonstrate the utility of AgTC and AgETL to streamline critical steps from data collection to analysis in the plant sciences.
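
The workflow the abstract describes (a standard CSV template for collection, then extract, transform, and load into a database) can be illustrated with a minimal Python sketch. This is not the AgTC or AgETL code or API: every function name, column name, and configuration key below is hypothetical, an in-memory SQLite database stands in for whatever database the tools actually target, and a plain dictionary stands in for the YAML configuration files so that the example needs only the standard library.

```python
import csv
import io
import sqlite3

# Hypothetical configuration. AgTC and AgETL read parameters from YAML files;
# this dict and all of its keys are illustrative assumptions, not their schema.
CONFIG = {
    "table": "observations",
    "columns": ["plot_id", "genotype", "trait", "value"],
    # Map heterogeneous source headers onto the standard column names.
    "aliases": {"Plot": "plot_id", "plot": "plot_id", "Genotype": "genotype",
                "line": "genotype", "Trait": "trait", "Value": "value",
                "val": "value"},
}


def make_template(columns, plot_ids):
    """Return CSV text with a standard header and one empty row per plot."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(columns)
    for pid in plot_ids:
        writer.writerow([pid] + [""] * (len(columns) - 1))
    return buf.getvalue()


def extract(csv_text):
    """Extract: read raw rows from CSV text, keeping the source headers."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def transform(rows, config):
    """Transform: rename headers, trim whitespace, and cast values to float."""
    records = []
    for row in rows:
        renamed = {config["aliases"].get(key, key): (val or "").strip()
                   for key, val in row.items() if key is not None}
        record = {col: renamed.get(col, "") for col in config["columns"]}
        record["value"] = float(record["value"]) if record["value"] else None
        records.append(record)
    return records


def load(conn, records, config):
    """Load: create the table if needed and insert the standardized records."""
    table = config["table"]
    conn.execute(
        f"CREATE TABLE IF NOT EXISTS {table} "
        "(plot_id TEXT, genotype TEXT, trait TEXT, value REAL)"
    )
    conn.executemany(
        f"INSERT INTO {table} VALUES (?, ?, ?, ?)",
        [tuple(rec[col] for col in config["columns"]) for rec in records],
    )
    conn.commit()


# Two made-up source files whose headers differ in naming and spacing.
file_a = "Plot,Genotype,Trait,Value\n101,G1,height_cm,82.5\n"
file_b = "plot,line,trait,val\n102,G2,height_cm, 79.0\n"

conn = sqlite3.connect(":memory:")
for text in (file_a, file_b):
    load(conn, transform(extract(text), CONFIG), CONFIG)

rows = conn.execute(
    "SELECT plot_id, genotype, value FROM observations ORDER BY plot_id"
).fetchall()
print(rows)
```

Keeping the header aliases in one configuration object mirrors the abstract's point that parameters are customized centrally rather than in the code. In this sketch, supporting a new file layout means adding entries to `aliases`, not editing the functions.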
