Department of Genetics, Harvard Medical School, Boston, MA, 02115, USA.
Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA.
Sci Data. 2024 Feb 10;11(1):182. doi: 10.1038/s41597-024-03031-7.
More than two hundred papers have reported genome-wide data from ancient humans. While the raw data for the vast majority are fully publicly available, testifying to the commitment of the paleogenomics community to open data, formats for both raw data and metadata differ. There is thus a need for uniform curation and a centralized, version-controlled compendium that researchers can download, analyze, and reference. Since 2019, we have been maintaining the Allen Ancient DNA Resource (AADR), which aims to provide an up-to-date, curated version of the world's published ancient human DNA data, represented at more than a million single nucleotide polymorphisms (SNPs) at which almost all ancient individuals have been assayed. The AADR has gone through six public releases at the time of writing and review of this manuscript, and crossed the threshold of 10,000 individuals with published genome-wide ancient DNA data at the end of 2022. This note is intended as a citable descriptor of the AADR.