Sun Lei, Luan Jinwen, Wang Jinbiao, Li Xiaoli, Zhang Wenqian, Ji Xiaohui, Liu Longhua, Wang Ru, Xu Bingxiang
School of Information Engineering, Yangzhou University, Yangzhou 225127, China; CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, and China National Center for Bioinformation, Beijing 100101, China.
School of Exercise and Health, Shanghai University of Sport, Shanghai 200438, China.
J Sport Health Sci. 2024 Sep 26;14:100992. doi: 10.1016/j.jshs.2024.100992.
Physical activity regulates gene expression in multiple tissues and cell types. With the development of next-generation sequencing, a large number of RNA-sequencing (RNA-seq)-based gene expression profiles related to physical activity have been shared in public resources; however, they are poorly curated and underutilized. To tackle this problem, we developed an atlas of such data through comprehensive collection, curation, and organization.
The data atlas, termed gene expression profiles of RNA-seq-based exercise responses (GEPREP), was built on a comprehensive collection of high-quality RNA-seq data on exercise responses. The metadata of each sample were manually curated. Data were uniformly processed, and batch effects were corrected. All the information was organized in an easy-to-use website for free search, visualization, and download.
GEPREP now includes 69 pre- and post-exercise RNA-seq datasets, comprising 26 human datasets (1120 samples) and 43 mouse datasets (1006 samples). Specifically, there were 977 (87.2%) human samples of skeletal muscle and 143 (12.8%) human samples of blood. The mouse samples span 9 tissues, with skeletal muscle (359, 35.7%) and brain (280, 27.8%) accounting for the main fractions. Metadata (including subject, exercise interventions, sampling sites, and post-processing methods) are also included. The metadata and gene expression profiles are freely accessible at http://www.geprep.org.cn/.
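As a quick consistency check on the counts reported above, the following minimal sketch recomputes the tissue fractions from the sample numbers. The variable and function names are illustrative only and are not part of GEPREP or its website.

```python
# Sample counts as reported in the abstract (names are illustrative).
human_total = 1120
mouse_total = 1006
human_counts = {"skeletal muscle": 977, "blood": 143}
# Only the two largest of the 9 mouse tissues are reported individually.
mouse_main_counts = {"skeletal muscle": 359, "brain": 280}


def share(count, total):
    """Return the percentage of `total` represented by `count`, to one decimal."""
    return round(100 * count / total, 1)


human_shares = {t: share(n, human_total) for t, n in human_counts.items()}
mouse_shares = {t: share(n, mouse_total) for t, n in mouse_main_counts.items()}

print(human_shares)
print(mouse_shares)
```

The two human tissues account for all 1120 human samples, whereas the two mouse tissues listed cover only part of the 1006 mouse samples; the remainder is spread across the other seven tissues.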
GEPREP is a comprehensive data atlas of RNA-seq-based gene expression profiles of exercise responses. With its reliable annotations and user-friendly interface, it has the potential to deepen our understanding of exercise physiology.