COPO - Managing sample metadata for biodiversity: considerations from the Darwin Tree of Life project.

Author information

Shaw Felix, Minotto Alice, McTaggart Seanna, Providence Aaliyah, Harrison Peter, Paupério Joana, Rajan Jeena, Burgin Josephine, Cochrane Guy, Kilias Estelle, Lawniczak Mara K N, Davey Robert

Affiliations

Earlham Institute, Norwich, Norfolk, NR4 7UH, UK.

EMBL European Bioinformatics Institute, Hinxton, Cambridgeshire, CB10 1SD, UK.

Publication information

Wellcome Open Res. 2024 Jun 10;7:279. doi: 10.12688/wellcomeopenres.18499.2. eCollection 2022.

DOI:10.12688/wellcomeopenres.18499.2

PMID:39091415

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC11292180/

Abstract

Large-scale reference genome sequencing projects for all of biodiversity are underway, and common standards have been in place for some years to enable the understanding and sharing of sequence data. However, the metadata that describe the collection, processing and management of samples, and that link to the associated sequencing and genome data, are not yet adequately developed and standardised for these projects. At the time of writing, the Darwin Tree of Life (DToL) Project is over two years into its ten-year ambition to sequence all described eukaryotic species in Britain and Ireland. We have sought consensus from a wide range of scientists across taxonomic domains to determine the minimal set of metadata that we collectively deem critically important to accompany each sequenced specimen. These metadata are made available throughout the subsequent laboratory processes and, once collected, need to be adequately managed to fulfil the requirements of good data management practice. Due to the size and scale of management required, software tools are needed. These tools need to implement rigorous development pathways and change management procedures to ensure that effective research data management of key project and sample metadata is maintained. Tracking of sample properties through the sequencing process is handled by Laboratory Information Management Systems (LIMS), so publication of the sequenced data is achieved via technical integration of LIMS and data management tools. Discussions with community members on how metadata standards need to be managed within large-scale programmes are a priority in the planning process. Here we report on the standards we developed with respect to a robust and reusable mechanism of metadata collection, in the hope that other projects forthcoming or underway will adopt these practices for metadata.
