Department of Microbiology and Immunology, Geisel School of Medicine, Dartmouth College, Hanover, New Hampshire, USA.
The Jackson Laboratory for Genomic Medicine, Farmington, Connecticut, USA.
mSystems. 2024 Nov 19;9(11):e0103024. doi: 10.1128/msystems.01030-24. Epub 2024 Oct 18.
E.PathDash facilitates re-analysis of gene expression data from pathogens clinically relevant to chronic respiratory diseases, covering 48 studies, 548 samples, and 404 unique treatment comparisons. The application enables users to assess broad biological stress responses at the KEGG pathway or gene ontology level and also provides data for individual genes. E.PathDash reduces the time required to gain access to data from multiple hours per dataset to seconds. Users can download high-quality images such as volcano plots and boxplots, as well as differential gene expression results and raw count data, which makes the application's output fully interoperable with other tools. Importantly, users can rapidly toggle between experimental comparisons and between different studies of the same phenomenon, enabling them to judge the extent to which observed responses are reproducible. As a proof of principle, we invited two cystic fibrosis scientists to use the application to explore scientific questions relevant to their specific research areas. Reassuringly, pathway activation analysis recapitulated results reported in the original publications, and it also yielded new insights into pathogen responses to changes in their environments, validating the utility of the application. All software and data are freely accessible, and the application is available at scangeo.dartmouth.edu/EPathDash.
Chronic respiratory illnesses impose a high disease burden on our communities, and people with respiratory diseases are susceptible to robust bacterial infections from pathogens that contribute to morbidity and mortality. Public gene expression datasets generated from these and other pathogens are abundantly available and are an important resource for synthesizing existing pathogenic research, which can lead to interventions that improve patient outcomes. However, it can take anywhere from many hours to weeks to render publicly available datasets usable; significant time and skills are needed to clean and standardize the data and to apply reproducible and robust bioinformatic pipelines to it. Through collaboration with two microbiologists, we have shown that E.PathDash addresses this problem, enabling them to elucidate pathogen responses to more than 400 experimental conditions and to generate mechanistic hypotheses for cell-level behavior in response to disease-relevant exposures, all in a fraction of the time.