Mishra Madhulika, Barck Lucas, Moreno Pablo, Heger Guillaume, Song Yuyao, Thornton Janet M, Papatheodorou Irene
European Molecular Biology Laboratory, European Bioinformatics Institute, The Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK.
GSK, Gunnels Wood Road, Stevenage, Hertfordshire SG1 2NY, UK.
NAR Genom Bioinform. 2023 Mar 3;5(1):lqad014. doi: 10.1093/nargab/lqad014. eCollection 2023 Mar.
Bulk transcriptomes are an essential data resource for understanding basic and disease biology. However, integrating information from different experiments remains challenging because of the batch effects generated by technical and biological variation in the transcriptome. Numerous batch-correction methods have been developed to deal with these effects. However, a user-friendly workflow for selecting the most appropriate batch-correction method for a given set of experiments is still missing. We present SelectBCM, a tool that prioritizes the most appropriate batch-correction method for a given set of bulk transcriptomic experiments, improving biological clustering and differential gene expression analysis. We demonstrate the applicability of SelectBCM on real data for two common diseases, rheumatoid arthritis and osteoarthritis, and on one example that characterizes a biological state, a meta-analysis of the macrophage activation state. The R package is available at https://github.com/ebi-gene-expression-group/selectBCM.