Zhou Meng, Han Lu, Zhang Jiahui, Hao Dapeng, Cai Yuanpei, Wang Zhenzhen, Zhou Hui, Sun Jie
College of Life Science, Jilin University, Changchun 130012, P. R. China.
Mol Biosyst. 2014 Dec;10(12):3264-71. doi: 10.1039/c4mb00339j.
The complex traits of an organism are associated with a complex interplay between genetic factors (GFs) and environmental factors (EFs). However, compared with protein-coding genes and microRNAs (miRNAs), there is a paucity of computational methods and bioinformatic resource platforms for understanding the associations between long non-coding RNAs (lncRNAs) and EFs. In this study, we developed a novel computational method to identify potential associations between lncRNAs and EFs, and released LncEnvironmentDB, a user-friendly web-based database that aims to provide a comprehensive resource platform for lncRNAs and EFs. Topological analysis of the EF-related networks revealed small-world, scale-free and modular structure. By analyzing their related diseases, we also found that lncRNAs and EFs sharing significantly enriched interacting miRNAs are functionally more related, implying that the predicted lncRNA signature of an EF can reflect its functional characteristics to some degree. Finally, we developed a computational model based on random walk with restart (RWREFD) to predict potential disease-related EFs by integrating lncRNA-EF associations and EF-disease associations. The performance of RWREFD was evaluated on experimentally verified EF-disease associations using leave-one-out cross-validation; it achieved an AUC value of 0.71, higher than that of a randomization test, indicating that the RWREFD method predicts reliably and accurately. To the best of our knowledge, LncEnvironmentDB is the first attempt to predict lncRNA-EF associations and to house both the experimental and the predicted ones. LncEnvironmentDB is freely available on the web at http://bioinfo.hrbmu.edu.cn/lncefdb/.
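The abstract does not spell out the RWREFD update rule, so what follows is only a minimal sketch of a generic random walk with restart, not the authors' implementation. It assumes a small undirected network given as an adjacency matrix, a restart probability of 0.7, and the hypothetical function name `random_walk_with_restart`; the toy graph and node indices are illustrative and not taken from LncEnvironmentDB. The walker starts from seed nodes (for example, EFs already linked to a disease) and the steady-state probabilities rank the remaining nodes as candidates.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.7, tol=1e-10, max_iter=1000):
    """Generic random walk with restart on a weighted adjacency matrix.

    Iterates p_next = (1 - restart) * W @ p + restart * p0, where W is the
    column-normalized adjacency matrix and p0 is uniform over the seed nodes.
    All parameter names and defaults here are assumptions for illustration.
    """
    n = len(adjacency)
    # Column-normalize so each column holds transition probabilities.
    col_sums = [sum(adjacency[i][j] for i in range(n)) for j in range(n)]
    W = [[adjacency[i][j] / col_sums[j] if col_sums[j] else 0.0
          for j in range(n)] for i in range(n)]

    # Initial distribution: equal probability on every seed node.
    p0 = [0.0] * n
    for s in seeds:
        p0[s] = 1.0 / len(seeds)

    p = p0[:]
    for _ in range(max_iter):
        nxt = [(1 - restart) * sum(W[i][j] * p[j] for j in range(n)) + restart * p0[i]
               for i in range(n)]
        # Stop when the L1 change between iterations falls below the tolerance.
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


# Toy path graph 0 - 1 - 2 - 3 (hypothetical nodes), seeded at node 0.
toy_adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
scores = random_walk_with_restart(toy_adjacency, seeds=[0])
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
print(ranking)
```

In this toy case the scores fall off with distance from the seed. In a leave-one-out setting, one known association would be withheld from the seeds and its rank among the candidates recorded.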