Ma Lina, Zou Dong, Liu Lin, Shireen Huma, Abbasi Amir A, Bateman Alex, Xiao Jingfa, Zhao Wenming, Bao Yiming, Zhang Zhang
National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China; CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China.
National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China; CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China.
Genomics Proteomics Bioinformatics. 2023 Oct;21(5):1054-1058. doi: 10.1016/j.gpb.2022.12.004. Epub 2022 Dec 23.
Biological databases serve as fundamental infrastructure for the worldwide scientific community, dramatically aiding the transformation of big data into knowledge and driving significant innovations across a wide range of research fields. Given the rapid pace of data production, biological databases continue to grow in size and importance. To build a catalog of worldwide biological databases, we curate a total of 5825 biological databases from 8931 publications; these databases are distributed across 72 countries/regions and were developed by 1975 institutions (as of September 20, 2022). We further devise the z-index, a novel index that characterizes the scientific impact of a database, and rank all of these databases, as well as their hosting institutions and countries, by citation and z-index. On this basis, we present a series of statistics and trends for worldwide biological databases, offering a global perspective on their status and impact in the life and health sciences. An up-to-date catalog of worldwide biological databases, together with their curated meta-information and derived statistics, is publicly available at Database Commons (https://ngdc.cncb.ac.cn/databasecommons/).