Gonçalves André N A, Lever Melissa, Russo Pedro S T, Gomes-Correia Bruno, Urbanski Alysson H, Pollara Gabriele, Noursadeghi Mahdad, Maracaja-Coutinho Vinicius, Nakaya Helder I
Department of Clinical and Toxicological Analyses, School of Pharmaceutical Sciences, University of São Paulo, São Paulo, Brazil.
Advanced Center for Chronic Diseases-ACCDiS, Facultad de Ciencias Químicas y Farmacéuticas, Universidad de Chile, Santiago, Chile.
Front Genet. 2019 Oct 24;10:971. doi: 10.3389/fgene.2019.00971. eCollection 2019.
Transcriptome analyses have increased our understanding of the molecular mechanisms underlying human diseases. Most approaches aim to identify significant genes by comparing their expression values between healthy subjects and a group of patients with a certain disease. Because studies normally contain few samples, heterogeneity among individuals caused by environmental factors or undetected illnesses can affect gene expression analyses. We present a systematic analysis of sample heterogeneity in a variety of gene expression studies relating to inflammatory and infectious diseases and show that novel immunological insights may arise once heterogeneity is addressed. The perturbation score of each sample is quantified using nonperturbed subjects (i.e., healthy subjects) as a reference group. Such a score allows us to detect outlying samples and subgroups of diseased patients, and even to assess the molecular perturbation of single cells infected with viruses. We also show how removal of outlying samples can improve the "signal" of the disease and affect the detection of differentially expressed genes. The method is made available as the mdp Bioconductor R package and as a user-friendly webtool, webMDP, available at http://mdp.sysbio.tools.
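The idea of scoring each sample against a healthy reference group can be illustrated with a minimal Python sketch. It is not the mdp package's implementation: the abstract does not give the formula, so the z-score normalization against the control mean and standard deviation, the cutoff `z_threshold=2.0`, the averaging across genes, the function name `perturbation_scores`, and the toy expression values are all assumptions made for illustration.

```python
from statistics import mean, stdev


def perturbation_scores(expression, control_ids, z_threshold=2.0):
    """Score every sample by how far its genes deviate from the controls.

    expression  : dict mapping gene -> {sample_id: expression value}
    control_ids : sample ids of the reference (healthy) group
    z_threshold : assumed cutoff; smaller deviations are treated as noise
    """
    samples = list(next(iter(expression.values())).keys())
    totals = {s: 0.0 for s in samples}
    genes_used = 0

    for gene, values in expression.items():
        reference = [values[s] for s in control_ids]
        mu, sd = mean(reference), stdev(reference)
        if sd == 0:
            continue  # gene is constant in the reference group; skip it
        genes_used += 1
        for s in samples:
            z = abs((values[s] - mu) / sd)
            if z >= z_threshold:
                totals[s] += z  # only count deviations above the cutoff

    # Average over the genes that contributed, so scores are comparable
    # between data sets with different numbers of genes.
    return {s: (totals[s] / genes_used if genes_used else 0.0) for s in samples}


# Hypothetical toy data: three healthy samples (h1-h3) and two patients.
expression = {
    "GENE_A": {"h1": 5.0, "h2": 5.2, "h3": 4.8, "p1": 5.1, "p2": 9.0},
    "GENE_B": {"h1": 2.0, "h2": 2.1, "h3": 1.9, "p1": 2.0, "p2": 4.5},
}
scores = perturbation_scores(expression, control_ids=["h1", "h2", "h3"])
for sample_id, score in sorted(scores.items()):
    print(sample_id, round(score, 2))
```

In this toy example, patient `p2` receives a high score while `p1` is indistinguishable from the healthy samples, which is the kind of within-group heterogeneity the abstract describes. Real data would call for many more genes and reference samples than shown here.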