Gene Manipulation Laboratory, Department of Biotechnology and Medical Engineering, National Institute of Technology, Rourkela, 769008, India.
Department of Computer Science and Engineering, Instituto Superior Técnico, Universidade de Lisboa, Av. Rovisco Pais, 1049-001 Lisbon, Portugal; INESC-ID, SW Algorithms and Tools for Constraint Solving Group, R. Alves Redol 9, 1000-029 Lisbon, Portugal.
Comput Methods Programs Biomed. 2020 Aug;192:105473. doi: 10.1016/j.cmpb.2020.105473. Epub 2020 Mar 20.
Male germline stem (GS) cells maintain spermatogenesis throughout adult life. Under appropriate in vitro culture conditions, GS cells can be reprogrammed into germline pluripotent stem (GPS) cells, losing their spermatogenic potential. In recent years, large volumes of transcript data for GS and GPS cells have become available. However, the mechanism by which GS cells are reprogrammed into GPS cells remains elusive. This study was designed to develop a Boolean logical model of the gene regulatory network (GRN) that might be involved in the reprogramming of GS cells into GPS cells.
Gene expression profiles of GS and GPS cells (GSE IDs: GSE11274 and GSE74151) were analyzed using R Bioconductor to identify differentially expressed genes (DEGs), which were functionally annotated with the DAVID server. Potential pluripotent genes among the DEGs were then predicted using a combination of machine learning [Support Vector Machine (SVM)] and BLAST search. Protein isoforms were identified by pattern matching against the UniProt database with in-house scripts written in C++. Both linear and non-linear interaction maps were generated using the STRING server. CellNet was used to study the relationship between the GRNs of GS and GPS cells. Finally, a GRN comprising all the genes obtained from the integrated methods and from the literature was constructed, and the reprogramming of GS to GPS cells was modelled qualitatively using a discrete, asynchronous, multivalued logical formalism in the GINsim modeling and simulation tool.
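The discrete, asynchronous, multivalued formalism mentioned above can be illustrated with a minimal Python sketch. It does not use GINsim and is not the published model: the component names (`Input`, `Mediator`, `Inhibitor`, `Target`), their maximum levels, and the logical rules are all hypothetical and chosen only for illustration. Under asynchronous updating, each transition changes a single component by a single level, and a stable state is a state with no outgoing transition.

```python
from itertools import product

# Hypothetical toy network, NOT the published GS-to-GPS model.
# "Input" and "Mediator" are ternary (levels 0, 1, 2); the others are Boolean.
MAX_LEVEL = {"Input": 2, "Mediator": 2, "Inhibitor": 1, "Target": 1}


def target_levels(state):
    """Return the level each component is driven toward in the given state."""
    return {
        "Input": state["Input"],          # input node: keeps its current level
        "Inhibitor": state["Inhibitor"],  # input node: keeps its current level
        "Mediator": state["Input"],       # Mediator follows the Input level
        # Target is on only at the highest Mediator level and without Inhibitor
        "Target": 1 if state["Mediator"] == 2 and state["Inhibitor"] == 0 else 0,
    }


def async_successors(state):
    """Asynchronous update: each successor changes one component by one level."""
    targets = target_levels(state)
    successors = []
    for name, current in state.items():
        goal = targets[name]
        if goal != current:
            step = 1 if goal > current else -1
            successors.append({**state, name: current + step})
    return successors


def stable_states():
    """Enumerate all states and keep those with no outgoing transition."""
    names = list(MAX_LEVEL)
    ranges = [range(MAX_LEVEL[name] + 1) for name in names]
    found = []
    for values in product(*ranges):
        state = dict(zip(names, values))
        if not async_successors(state):
            found.append(state)
    return found


if __name__ == "__main__":
    start = {"Input": 2, "Mediator": 0, "Inhibitor": 0, "Target": 0}
    print(async_successors(start))
    print(len(stable_states()))
```

Exhaustive enumeration is only practical for very small networks; dedicated tools rely on more efficient methods for larger models.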
Through the use of machine learning and logical modeling, the present study identified 3585 DEGs and 221 novel pluripotent genes, including Tet1, Cdh1, Tfap2c, Etv4, Etv5, Prdm14, and Prdm10, in GPS cells. Pathway analysis revealed that the core pluripotency network and the PI3K-Akt, WNT, GDNF and BMP4 signalling pathways were important for the reprogramming of GS cells to GPS cells. CellNet analysis of the GRNs of GS and GPS cells revealed that the gene expression profile of GS cells resembled that of gonads, whereas that of GPS cells resembled that of embryonic stem cells (ESCs). A logical regulatory model was developed, which showed that TGFβ negatively regulates the reprogramming of GS to GPS cells, as confirmed by perturbation studies.
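An in silico perturbation study of a logical model typically clamps one component to a fixed value (for example, 0 for a knockout) and compares the resulting stable state with that of the unperturbed model. The sketch below shows the idea on a hypothetical three-node Boolean toy. The node names (`activator`, `inhibitor`, `reprogrammed`) and the rule are assumptions made for illustration and do not reproduce the published model; `inhibitor` merely stands in for a negative regulator of the kind reported for TGFβ.

```python
# Hypothetical three-node Boolean toy; names and rules are illustrative only.
RULES = {
    "activator": lambda s: s["activator"],  # input: holds its value
    "inhibitor": lambda s: s["inhibitor"],  # input: holds its value
    # The outcome node is on only with the activator and without the inhibitor
    "reprogrammed": lambda s: int(s["activator"] and not s["inhibitor"]),
}


def run_to_stable_state(initial, clamp=None, max_steps=100):
    """Update one component at a time until nothing changes.

    `clamp` maps component names to fixed values (the perturbation);
    clamped components are never updated by their rule.
    """
    clamp = clamp or {}
    state = {**initial, **clamp}
    for _ in range(max_steps):
        changed = False
        for name, rule in RULES.items():
            if name in clamp:
                continue
            new_value = rule(state)
            if new_value != state[name]:
                state = {**state, name: new_value}
                changed = True
                break  # one component per step
        if not changed:
            return state
    raise RuntimeError("no stable state reached within max_steps")


initial = {"activator": 1, "inhibitor": 1, "reprogrammed": 0}
unperturbed = run_to_stable_state(initial)
knockout = run_to_stable_state(initial, clamp={"inhibitor": 0})

if __name__ == "__main__":
    print(unperturbed["reprogrammed"], knockout["reprogrammed"])
```

In this toy, clamping the inhibitor to 0 switches the outcome node on, which is the qualitative signature expected of a negative regulator.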
The study identified novel pluripotent genes involved in the reprogramming of GS cells into GPS cells. A multivalued logical model of cellular reprogramming is proposed, which suggests that the reprogramming of GS cells to GPS cells involves the LIF, GDNF, BMP4, and TGFβ signalling pathways along with some novel pluripotency genes.