Bioinformatics and Computational Biosciences Branch, Office of Cyber Infrastructure and Computational Biology, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD, 20892, USA.
Metaorganism Immunity Section, Laboratory of Immune System Biology, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD, 20892, USA.
BMC Bioinformatics. 2020 Sep 3;21(1):378. doi: 10.1186/s12859-020-03694-0.
Improvements in genomics methods, coupled with readily accessible high-throughput sequencing, have contributed to our understanding of microbial species, metagenomes, infectious diseases and more. To maximize the impact of these genomics studies, it is important that data from biological samples be made publicly available with standardized metadata. The availability of data in public archives offers the hope that greater insights can be obtained through integration with multi-omics data, reproduction of published studies, or meta-analyses of large, diverse datasets. These datasets should include a description of the host, organism, and environmental source of the specimen, along with spatial-temporal information and other relevant metadata. Unfortunately, these attributes are often missing, and when present they are used inconsistently with respect to metadata standards and ontologies.
METAGENOTE (https://metagenote.niaid.nih.gov) is a web portal that facilitates the annotation of samples from genomic studies and streamlines the submission of sequencing files and metadata to the Sequence Read Archive (SRA) (Leinonen R, et al, Nucleic Acids Res, 39:D19-21, 2011) for public access. The platform offers a wide selection of packages for different types of biological and experimental studies, with a special emphasis on standardized metadata reporting. These packages follow the guidelines of the MIxS standards developed by the Genomic Standards Consortium (GSC) and adopted by the three partners of the International Nucleotide Sequence Database Collaboration (INSDC) (Cochrane G, et al, Nucleic Acids Res, 44:D48-50, 2016): the National Center for Biotechnology Information (NCBI), the European Bioinformatics Institute (EBI) and the DNA Data Bank of Japan (DDBJ). METAGENOTE then compiles, validates and manages the submission through an easy-to-use web interface, minimizing submission errors and eliminating the need to submit sequencing files via a separate file transfer mechanism.
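The kind of pre-submission metadata check described above can be illustrated with a minimal Python sketch. It is not METAGENOTE's implementation: the function name `validate_sample`, the list of required fields, and the specific format rules are assumptions chosen for illustration. The attribute names `collection_date`, `geo_loc_name` and `lat_lon` are borrowed from MIxS-style sample attributes, but which attributes are mandatory depends on the package selected.

```python
import re
from datetime import date

# Hypothetical subset of mandatory attributes, for illustration only;
# the real set depends on the chosen MIxS package.
REQUIRED_FIELDS = ("sample_name", "organism", "collection_date",
                   "geo_loc_name", "lat_lon")

# Assumed coordinate format: decimal degrees with hemisphere letters,
# e.g. "38.98 N 77.11 W".
LAT_LON = re.compile(r"^(\d+(?:\.\d+)?) ([NS]) (\d+(?:\.\d+)?) ([EW])$")


def validate_sample(sample: dict) -> list:
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []

    # 1. Every required attribute must be present and non-blank.
    for field in REQUIRED_FIELDS:
        if not str(sample.get(field, "")).strip():
            errors.append(f"missing required field: {field}")

    # 2. The collection date must be a real calendar date (YYYY-MM-DD).
    collected = sample.get("collection_date", "")
    if collected:
        try:
            date.fromisoformat(collected)
        except ValueError:
            errors.append("collection_date must be a valid YYYY-MM-DD date")

    # 3. Coordinates must match the assumed format and lie in range.
    lat_lon = sample.get("lat_lon", "")
    if lat_lon:
        match = LAT_LON.match(lat_lon)
        if match is None:
            errors.append("lat_lon must look like '38.98 N 77.11 W'")
        elif float(match.group(1)) > 90 or float(match.group(3)) > 180:
            errors.append("lat_lon is out of range")

    return errors
```

Collecting every problem in a list, rather than stopping at the first one, lets a submitter correct all issues in a single pass before the data reaches the archive.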
METAGENOTE is a public resource that focuses on simplifying the annotation and submission of data together with its corresponding metadata. Users of METAGENOTE will benefit from the easy-to-use annotation interface and, most importantly, will be encouraged to publish metadata following standards and ontologies that make public data reusable.