GOF/LOF knowledge inference with tensor decomposition in support of high order link discovery for gene, mutation and disease.

Author information

Zhou Kai Yin, Wang Yu Xing, Zhang Sheng, Gachloo Mina, Kim Jin Dong, Luo Qi, Cohen Kevin Bretonnel, Xia Jing Bo

Affiliations

College of Informatics, Huazhong Agricultural University, 430070, Wuhan, China.

Hubei Key Lab of Agricultural Bioinformatics, Huazhong Agricultural University, 430070, Wuhan, China.

Publication information

Math Biosci Eng. 2019 Feb 20;16(3):1376-1391. doi: 10.3934/mbe.2019067.

Abstract

The function type of a drug's target genes plays an important role in discovering new uses for drugs, and the "Antagonist-GOF" and "Agonist-LOF" hypotheses provide a solid foundation for drug repurposing. In this research, an active gene annotation corpus was used as training data to predict whether each human gene shows a gain-of-function (GOF), loss-of-function (LOF), or unknown character after variation events. Unlike the (entity, predicate, entity) triples of a traditional three-way tensor, a four-way and a five-way tensor, the GMFD- and GMAFD-tensors, were designed to represent higher-order links among all or some of these entities: genes (G), mutations (M), functions (F), diseases (D) and annotation labels (A). A tensor decomposition algorithm, CP decomposition, was applied to the higher-order tensors to unveil the correlations among entities. Meanwhile, a state-of-the-art baseline tensor decomposition algorithm, RESCAL, was run on the three-way tensor for comparison. The results showed that CP decomposition on the higher-order tensors performed better than RESCAL on the traditional three-way tensor in recovering masked data and making predictions. In addition, the four-way tensor proved to be the best format for this problem. Finally, a case study reproducing two disease-gene-drug links (Myelodysplastic Syndromes-IL2RA-Aldesleukin, Lymphoma-IL2RA-Aldesleukin) demonstrated the feasibility of the prediction model for drug repurposing.
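
To make the idea of CP decomposition on a four-way tensor concrete, the following is a minimal sketch using plain alternating least squares (ALS) in NumPy. It is not the authors' implementation: the tensor sizes, the rank, the random toy data, and all function names (`khatri_rao`, `unfold`, `cp_reconstruct`, `cp_als`) are assumptions made for illustration, and the sketch omits the masking, regularization and evaluation steps a real experiment would need.

```python
import numpy as np

def khatri_rao(mats):
    """Column-wise Khatri-Rao product of matrices that share a column count."""
    rank = mats[0].shape[1]
    out = mats[0]
    for m in mats[1:]:
        # Row index of the result runs over (row of out, row of m) in C order.
        out = (out[:, None, :] * m[None, :, :]).reshape(-1, rank)
    return out

def unfold(tensor, mode):
    """Mode-n unfolding: put `mode` first, flatten the remaining modes in C order."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def cp_reconstruct(factors):
    """Rebuild the full tensor as a sum of rank-one terms from the factor matrices."""
    shape = tuple(f.shape[0] for f in factors)
    return khatri_rao(factors).sum(axis=1).reshape(shape)

def cp_als(tensor, rank, n_iter=200, seed=0):
    """Plain CP-ALS: update one factor matrix at a time by least squares."""
    rng = np.random.default_rng(seed)
    factors = [rng.standard_normal((dim, rank)) for dim in tensor.shape]
    for _ in range(n_iter):
        for mode in range(tensor.ndim):
            others = [f for k, f in enumerate(factors) if k != mode]
            # Gram matrix of the Khatri-Rao product = Hadamard product of Grams.
            gram = np.ones((rank, rank))
            for f in others:
                gram *= f.T @ f
            factors[mode] = (
                unfold(tensor, mode) @ khatri_rao(others) @ np.linalg.pinv(gram)
            )
    return factors

# Hypothetical toy GMFD-style tensor: 5 genes x 4 mutations x 3 function
# types (e.g. GOF / LOF / unknown) x 4 diseases, built from random rank-2
# factors. The data are synthetic and carry no biological meaning.
rng = np.random.default_rng(42)
true_factors = [rng.random((dim, 2)) for dim in (5, 4, 3, 4)]
tensor = cp_reconstruct(true_factors)

factors = cp_als(tensor, rank=2)
approx = cp_reconstruct(factors)
rel_err = np.linalg.norm(approx - tensor) / np.linalg.norm(tensor)
print(f"relative reconstruction error: {rel_err:.2e}")

# Reconstructed score of one (gene, mutation, function, disease) cell.
print("score for cell (0, 1, 2, 3):", approx[0, 1, 2, 3])
```

In a link-prediction setting, a reconstructed score like the last one would be read for cells that were held out (masked) before fitting. How cells are masked and scored in the paper is not described in the abstract, so that step is left out here.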
