Fu Chen, Yang Shiping, Yang Xiaodi, Lian Xianyi, Huang Yan, Dong Xiaobao, Zhang Ziding
State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing, China.
Department of Genetics, School of Basic Medical Sciences, Tianjin Medical University, Tianjin, China
mSystems. 2020 Nov 3;5(6):e00960-20. doi: 10.1128/mSystems.00960-20.
Human immunodeficiency virus type 1 (HIV-1) depends on a class of host proteins called host dependency factors (HDFs) to facilitate its infection. Experimental efforts have identified a number of HDFs so far, but the gene inventory of HIV-1 HDFs remains incomplete. Here, we implemented an existing network-based gene discovery strategy to predict HIV-1 HDFs. First, an encoding scheme based on a publicly available human tissue-specific gene functional network (GIANT; http://giant.princeton.edu/) was designed to convert each human gene into a 25,825-dimensional feature vector. Then, a random forest-based predictive model was trained on a data set containing 868 known HDFs and 1,736 non-HDFs. In 5-fold cross-validation, an independent test, and a comparison with one existing method, the proposed prediction method consistently showed accurate and competitive performance. The strength of our method can be ascribed to the GIANT encoding scheme, which contains rich information regarding gene interactions. By merging known HDFs and genome-wide HDF prediction results, we conducted network analysis to capture the common patterns of HDFs in the context of the GIANT network. Interestingly, HDFs show significantly lower betweenness than HIV-1-interacting human proteins (i.e., HIV targets). The functional roles of HDFs were also examined by mapping all the HDF candidates onto human protein complexes. In particular, we observed frequent co-occurrence of HDFs and HIV targets at the protein complex level. Collectively, we hope the proposed prediction method can not only accelerate HDF identification and antiviral drug target discovery but also provide some mechanistic insights into human-virus relationships.
Identification of HIV-1 HDFs remains a crucial step toward understanding the complicated relationships between humans and HIV-1. To complement the experimental identification of HDFs, we have implemented an existing network-based gene discovery strategy to predict HDFs from the human genome. The core idea of the proposed method is that the rich information deposited in host gene functional networks can be used effectively to infer potential HDFs. We hope the proposed prediction method can further guide hypothesis-driven experimental efforts to interrogate human-HIV-1 relationships and provide new hints for the development of antiviral drugs to combat HIV-1 infection.
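The two data-preparation steps named in the abstract, network-based encoding and 5-fold cross-validation, can be illustrated with a minimal sketch. It rests on assumptions that the abstract does not confirm: that each gene is represented by its row of edge weights to every gene in the network (so the vector length equals the number of network genes), and that folds are stratified by class. The function names `encode_genes` and `stratified_folds`, the gene names, and the edge weights are all hypothetical. The toy labels keep the 1:2 ratio of HDFs to non-HDFs found in the 868/1,736 data set.

```python
import random


def encode_genes(edges, genes):
    """Represent each gene by its edge weights to every gene in the network.

    Assumption: one feature per network gene; absent edges get weight 0.0.
    """
    index = {gene: i for i, gene in enumerate(genes)}
    vectors = {gene: [0.0] * len(genes) for gene in genes}
    for a, b, weight in edges:
        # The network is treated as undirected, so fill both rows.
        vectors[a][index[b]] = weight
        vectors[b][index[a]] = weight
    return vectors


def stratified_folds(labels, k=5, seed=0):
    """Assign each gene to one of k folds while preserving the class ratio."""
    rng = random.Random(seed)
    folds = {}
    for cls in sorted(set(labels.values())):
        members = sorted(g for g, y in labels.items() if y == cls)
        rng.shuffle(members)
        # Deal the shuffled members of this class round-robin across folds.
        for i, gene in enumerate(members):
            folds[gene] = i % k
    return folds


# Toy functional network with hypothetical genes and weights.
genes = ["G1", "G2", "G3", "G4"]
edges = [("G1", "G2", 0.9), ("G2", "G3", 0.4), ("G3", "G4", 0.7)]
vectors = encode_genes(edges, genes)

# Toy labels: 5 positives (HDFs) and 10 negatives (non-HDFs), a 1:2 ratio.
labels = {f"P{i}": 1 for i in range(5)}
labels.update({f"N{i}": 0 for i in range(10)})
folds = stratified_folds(labels, k=5)
```

In a full pipeline, the vectors and fold assignments would be passed to a random forest implementation; that step is omitted here so the sketch stays free of third-party dependencies.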