Grigoryev Dmitry N, Cheranova Dilyara I, Chaudhary Suman, Heruth Daniel P, Zhang Li Qin, Ye Shui Q
Laboratory of Translational Studies and Personalized Medicine, Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region, Russian Federation.
Division of Experimental and Translational Genetics, Department of Pediatrics, Children's Mercy Hospitals and Clinics, Kansas City, MO, USA.
BMC Pulm Med. 2015 Aug 19;15:95. doi: 10.1186/s12890-015-0088-x.
Gene microarray data on Acute Respiratory Distress Syndrome (ARDS) accumulated to date in the Gene Expression Omnibus (GEO) represent a rich source for identifying new, unsuspected targets and mechanisms of ARDS. The recently developed expression-based genome-wide association study (eGWAS) approach to GEO data was successfully applied to gene expression in adipose tissue, a comparatively simple tissue in which adipocytes make up 75 % of the cells. Although lung tissue is more heterogeneous and has no prevalent cell type driving its gene expression patterns, we hypothesized that eGWAS of ARDS samples would generate biologically meaningful results.
The eGWAS was conducted as described previously (Proc Natl Acad Sci U S A 109:7049-7054, 2012), and genes were ranked by the p values of a chi-square test.
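The ranking step can be illustrated with a minimal sketch. It assumes, as one plausible reading of the cited method, that each gene is summarized by the number of experiments in which it was or was not called differentially expressed, and that this is compared against the pooled counts of all other genes in a 2 × 2 table. The gene names, the counts, and the function names `chi_square_2x2` and `rank_genes` are hypothetical and are not taken from the study.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (statistic, p_value). With 1 degree of freedom the upper-tail
    probability equals erfc(sqrt(statistic / 2)).
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        # Degenerate table (an empty row or column): no evidence of association.
        return 0.0, 1.0
    stat = n * (a * d - b * c) ** 2 / denom
    return stat, math.erfc(math.sqrt(stat / 2.0))


def rank_genes(calls):
    """Rank genes by chi-square p value, smallest first.

    calls maps gene -> (n_positive, n_negative), the hypothetical numbers of
    experiments in which the gene was / was not differentially expressed.
    """
    total_pos = sum(pos for pos, _ in calls.values())
    total_neg = sum(neg for _, neg in calls.values())
    results = []
    for gene, (pos, neg) in calls.items():
        # Row 1: this gene; row 2: all other genes pooled.
        stat, p = chi_square_2x2(pos, neg, total_pos - pos, total_neg - neg)
        results.append((gene, stat, p))
    return sorted(results, key=lambda r: r[2])


# Hypothetical toy input: three genes scored across 31 experiments each.
toy_calls = {"GENE_A": (28, 3), "GENE_B": (5, 26), "GENE_C": (6, 25)}
ranking = rank_genes(toy_calls)
for gene, stat, p in ranking:
    print(f"{gene}: chi2 = {stat:.2f}, p = {p:.3g}")
```

The test is two-sided, so it flags genes called unusually often or unusually rarely; a real analysis would also track the direction of the expression change.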
The GEO search retrieved 487 ARDS-related entries. After filtering on multiple qualitative and quantitative criteria, 219 samples were selected (n sham/ARDS): mouse n = 67/92, rat n = 13/13, human cells n = 11/11, and canine n = 6/6. The ARDS models were distributed as follows: mechanical ventilation (MV)/cyclic stretch n = 11; endotoxin (LPS) treatment n = 8; MV + LPS n = 3; distant organ injury-induced ARDS n = 3; chemically induced ARDS n = 2; Staphylococcus aureus-induced ARDS n = 2; and one experiment each for radiation- and shock-induced ARDS. The eGWAS of this dataset identified 42 significant genes (Bonferroni threshold P < 1.55 × 10⁻⁶). Of these genes, 66.6 % had previously been associated with lung injury, including well-known ARDS genes such as IL1R2 (P = 4.42 × 10⁻¹⁹), IL1β (P = 3.38 × 10⁻¹⁷), PAI1 (P = 9.59 × 10⁻¹⁴), IL6 (P = 3.57 × 10⁻¹²), SOCS3 (P = 1.05 × 10⁻¹⁰), and THBS1 (P = 2.01 × 10⁻⁹). The remaining genes were new ARDS candidates. Expression of the most prominently upregulated genes, CLEC4E (P = 4.46 × 10⁻¹⁴) and CD300LF (P = 2.31 × 10⁻¹⁶), was confirmed by real-time PCR. The former was also validated by in silico pathway analysis and the latter by Western blot analysis.
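The counts reported above can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers from the abstract, with one assumption: a family-wise significance level of 0.05, which the abstract does not state. Under that assumption, the number of tests implied by the Bonferroni threshold is an inference rather than a reported figure.

```python
# (sham, ARDS) sample counts per species, as reported in the abstract.
samples = {
    "mouse": (67, 92),
    "rat": (13, 13),
    "human cells": (11, 11),
    "canine": (6, 6),
}
total_sham = sum(sham for sham, _ in samples.values())
total_ards = sum(ards for _, ards in samples.values())
total_samples = total_sham + total_ards

# Counts per ARDS model, as reported in the abstract.
models = {
    "MV / cyclic stretch": 11,
    "LPS": 8,
    "MV + LPS": 3,
    "distant organ injury": 3,
    "chemical": 2,
    "Staphylococcus aureus": 2,
    "radiation": 1,
    "shock": 1,
}
total_models = sum(models.values())

# ASSUMPTION: family-wise alpha = 0.05 (not stated in the abstract).
alpha = 0.05
bonferroni_threshold = 1.55e-6
implied_tests = round(alpha / bonferroni_threshold)

# Split of the 42 significant genes into known and new candidates.
significant_genes = 42
previously_reported = round(0.666 * significant_genes)
new_candidates = significant_genes - previously_reported

print(total_sham, total_ards, total_samples)  # 97 122 219
print(total_models)                           # 31
print(implied_tests)                          # 32258 under the alpha assumption
print(previously_reported, new_candidates)    # 28 14
```

The 122 ARDS samples match the "more than 120" samples cited in the conclusion, and the 14 remaining genes match the number of new candidates reported there.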
This first application of eGWAS to ARDS, drawing on more than 120 publicly available microarray samples of ARDS, not only demonstrated the applicability of eGWAS to complex lung tissue but also identified 14 new candidate genes associated with ARDS. Detailed studies of these new candidates might lead to the identification of unsuspected, evolutionarily conserved mechanisms triggered by ARDS.