Luo Dongmei, Zhang Chengdong, Fu Liwan, Zhang Yuening, Hu Yue-Qing
State Key Laboratory of Genetic Engineering, Institute of Biostatistics, School of Life Sciences, Fudan University, Shanghai, China.
Department of Information and Computing Science, School of Mathematics and Physics, Anhui University of Technology, Ma'anshan, Anhui Province, China.
PeerJ. 2021 Jan 6;9:e10576. doi: 10.7717/peerj.10576. eCollection 2021.
Knowledge of similarities among diseases can contribute to uncovering common genetic mechanisms. Several similarity measures based on ranked gene lists have been proposed in the literature. Noting that these measures may suffer from the need to determine a cutoff or from a heavy computational load, we propose a novel similarity score among diseases based on gene ranks. Simulation studies under various scenarios demonstrate that the proposed score performs better than existing rank-based similarity measures. Application of the score to gene expression data of 18 cancer types from The Cancer Genome Atlas shows that it is superior in clarifying the genetic relationships among diseases and tends to cluster histologically or anatomically related cancers together, which is analogous to the findings of pan-cancer studies. Moreover, the score has a simpler form, is faster to compute, and is more robust to higher levels of noise than existing methods, and it provides a basis for future studies on genetic relationships among diseases. In addition, a measure is developed to gauge the magnitude of association of an individual gene with diseases. Using this measure, genes and biological processes significantly associated with colorectal cancer are detected.