Yu Ying, Hou Wanwan, Chen Qingwang, Guo Xiaorou, Sang Leqing, Xue Hao, Wang Duo, Li Jinming, Fang Xiang, Zhang Rui, Dong Lianhua, Shi Leming, Zheng Yuanting
State Key Laboratory of Genetic Engineering, Human Phenome Institute and School of Life Sciences, Fudan University, Shanghai, China.
National Center for Clinical Laboratories, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing Hospital/National Center of Gerontology, Beijing, China.
Nat Protoc. 2025 Feb 18. doi: 10.1038/s41596-024-01111-x.
RNA reference materials and their corresponding reference datasets act as the 'ground truth' for the normalization of experimental values and are indispensable tools for reliably measuring intrinsically small differences in RNA-sequencing data, such as those between molecular subtypes of diseases in clinical samples. However, the variability in 'absolute' expression profiles measured across different batches, methods or platforms limits the use of conventional RNA reference datasets. We recently proposed a ratio-based method for constructing reference datasets. The ratio for a gene is defined as the ratio of its normalized expression levels between two sample groups, and it yields more reliable values than the 'absolute' values obtained across diverse transcriptomic technologies and batches. Our gene ratios have been used for the successful generation of omics-wide reference datasets. Here, we describe a step-by-step process for establishing RNA reference materials and reference datasets, covering three stages: (1) reference materials, including material preparation, homogeneity testing and stability testing; (2) ratio-based reference datasets, including characterization, uncertainty estimation and orthogonal validation; and (3) applications, including definition of performance metrics, performing proficiency tests and diagnosing and correcting batch effects. This approach was used to establish the Quartet RNA reference materials and reference datasets (chinese-quartet.org), which have been approved as the first suite of nationally certified RNA reference materials by China's State Administration for Market Regulation. The protocol can be used to establish and apply reference materials to improve RNA-sequencing data quality in diverse clinical settings. The procedure can be completed in 2 d and requires expertise in molecular biology and bioinformatics.
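The ratio-based idea described in the abstract can be illustrated with a minimal Python sketch. It is not the protocol's own implementation: the function name `log2_ratios`, the sample labels, the gene names and the expression values are all hypothetical toy inputs, and the choice of a log2 scale with replicate averaging is an assumption. The sketch only shows why a per-gene ratio between two sample groups measured in the same batch can stay stable when a batch-specific scaling factor distorts the 'absolute' values.

```python
import math
from statistics import mean


def log2_ratios(expr, group_a, group_b, pseudocount=0.0):
    """Per-gene log2 ratio of group_a over group_b within one batch.

    expr maps gene -> {sample: normalized expression}. Replicates are
    averaged on the log2 scale. A positive pseudocount is needed if any
    expression value is zero (assumption: all toy values here are > 0).
    """
    ratios = {}
    for gene, values in expr.items():
        log_a = mean(math.log2(values[s] + pseudocount) for s in group_a)
        log_b = mean(math.log2(values[s] + pseudocount) for s in group_b)
        ratios[gene] = log_a - log_b
    return ratios


# Toy values only (not data from the protocol). Batch 2 is batch 1 with a
# gene-specific scaling factor applied (x4 for GENE1, x0.5 for GENE2),
# mimicking a multiplicative batch effect on 'absolute' expression.
batch1 = {
    "GENE1": {"A1": 8.0, "A2": 8.0, "B1": 2.0, "B2": 2.0},
    "GENE2": {"A1": 1.0, "A2": 1.0, "B1": 4.0, "B2": 4.0},
}
batch2 = {
    "GENE1": {"A1": 32.0, "A2": 32.0, "B1": 8.0, "B2": 8.0},
    "GENE2": {"A1": 0.5, "A2": 0.5, "B1": 2.0, "B2": 2.0},
}

group_a, group_b = ["A1", "A2"], ["B1", "B2"]
ratios_batch1 = log2_ratios(batch1, group_a, group_b)
ratios_batch2 = log2_ratios(batch2, group_a, group_b)

print(ratios_batch1)  # {'GENE1': 2.0, 'GENE2': -2.0}
print(ratios_batch2)  # {'GENE1': 2.0, 'GENE2': -2.0}
```

In this toy case the 'absolute' values differ fourfold or twofold between batches, yet the within-batch ratios agree exactly because the scaling factor cancels. Real batch effects are not purely multiplicative, so the cancellation is only approximate in practice.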