College of Computer Science and Technology, Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun 130012, China.
Zhuhai Laboratory of Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, Zhuhai College of Jilin University, Zhuhai 519041, China.
Int J Mol Sci. 2019 Mar 14;20(6):1284. doi: 10.3390/ijms20061284.
Long non-coding RNAs (lncRNAs) are non-coding RNAs longer than 200 nucleotides, and they have attracted tremendous attention in recent decades. Many studies have confirmed that lncRNAs play an important role in post-transcriptional gene regulation; for example, lncRNAs affect the stability and translation of splicing factor proteins. Mutations and malfunctions of lncRNAs are closely related to human disorders. Because lncRNAs interact with a variety of proteins, predicting lncRNA-protein interactions is an important way to explore the functions of lncRNAs in depth and to enrich their annotations. Experimental approaches for identifying lncRNA-protein interactions are expensive and time-consuming. Computational approaches for predicting lncRNA-protein interactions can be grouped into two broad categories. The first is based on sequence, structural information, and physicochemical properties. The second is based on network methods, which fuse heterogeneous data to construct lncRNA-related heterogeneous networks. Network-based methods can capture the implicit feature information in the topological structure of biological heterogeneous networks containing lncRNAs, which sequence-based methods often ignore. In this paper, we summarize and discuss the materials, interaction score calculation algorithms, and advantages and disadvantages of state-of-the-art network-based algorithms for lncRNA-protein interaction prediction, to help researchers select a suitable method and obtain more dependable results. All of the related network data have also been collected and processed for the convenience of users and are available at https://github.com/HAN-Siyu/APINet/.
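To make the network-based idea concrete, the following is a minimal sketch of one common way to score candidate interactions on a heterogeneous network: a random walk with restart started from a query lncRNA. It is an illustration only, not the method of any specific algorithm covered in the review. The function names, the toy similarity matrices, and the restart probability of 0.5 are all assumptions made for this example.

```python
import numpy as np


def build_heterogeneous_adjacency(lnc_sim, prot_sim, interactions):
    """Fuse the three data sources into one symmetric adjacency matrix.

    Node order is: all lncRNAs first, then all proteins.
    """
    top = np.hstack([lnc_sim, interactions])
    bottom = np.hstack([interactions.T, prot_sim])
    return np.vstack([top, bottom])


def random_walk_with_restart(adjacency, seed, restart=0.5, tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - r) * W p + r * p0 until the L1 change is below tol."""
    col_sums = adjacency.sum(axis=0)
    col_sums[col_sums == 0] = 1.0          # leave isolated nodes as zero columns
    transition = adjacency / col_sums      # column-normalized transition matrix
    p0 = seed / seed.sum()
    p = p0.copy()
    for _ in range(max_iter):
        p_next = (1 - restart) * (transition @ p) + restart * p0
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next
    return p


def score_proteins(lnc_sim, prot_sim, interactions, lnc_index, restart=0.5):
    """Return one relevance score per protein for the chosen lncRNA."""
    n_lnc = lnc_sim.shape[0]
    adjacency = build_heterogeneous_adjacency(lnc_sim, prot_sim, interactions)
    seed = np.zeros(adjacency.shape[0])
    seed[lnc_index] = 1.0
    stationary = random_walk_with_restart(adjacency, seed, restart)
    return stationary[n_lnc:]


# Toy data (hypothetical): 3 lncRNAs, 2 proteins.
lnc_sim = np.array([[0.0, 0.1, 0.9],
                    [0.1, 0.0, 0.1],
                    [0.9, 0.1, 0.0]])
prot_sim = np.array([[0.0, 0.2],
                     [0.2, 0.0]])
# Known interactions: lncRNA 0 with protein 0, lncRNA 1 with protein 1.
interactions = np.array([[1.0, 0.0],
                         [0.0, 1.0],
                         [0.0, 0.0]])

# lncRNA 2 has no known partners; its scores come only from network topology.
scores = score_proteins(lnc_sim, prot_sim, interactions, lnc_index=2)
print(scores)
```

In this toy network, lncRNA 2 has no known interactions, yet protein 0 receives the higher score because lncRNA 2 is strongly similar to lncRNA 0, which interacts with protein 0. That is the kind of topological signal a purely sequence-based predictor would not use.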