Rathore Abhishek, Singh Vikas K, Pandey Sarita K, Rao Chukka Srinivasa, Thakur Vivek, Pandey Manish K, Anil Kumar V, Das Roma Rani
International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Hyderabad, India.
Adv Biochem Eng Biotechnol. 2018;164:277-292. doi: 10.1007/10_2017_56.
Agricultural disciplines are becoming data intensive, and the technologies that generate agricultural research data are becoming increasingly sophisticated and high throughput. High-throughput genotyping is generating petabytes of data, and high-throughput phenotyping platforms are generating data of similar magnitude. In modern integrated crop breeding, scientists routinely work together to integrate very large genomic and phenomic data sets. To manage such research data sets and use them appropriately in decision making, Data Management Analysis & Decision Support Tools (DMASTs) are a prerequisite. DMASTs are required for a range of operations, including designing appropriate breeding experiments, maintaining pedigrees, managing phenotypic data, storing and retrieving high-throughput genotypic data, and performing analytics such as trial analysis, spatial adjustment, identification of marker-trait associations (MTAs), prediction of Genomic Estimated Breeding Values (GEBVs), and calculation of various selection indices. By integrating data generated from various science streams, DMASTs are also a prerequisite for understanding trait dynamics, gene action, interactions, biology, genotype-by-environment interaction (GxE), and various other factors contributing to crop improvement programs. These tools have simplified scientists' work and empowered them in terms of data storage, retrieval, analytics, visualization, and sharing with other researchers and collaborators. This chapter focuses on the availability, uses, and gaps of present-day DMASTs.