SVM classifier to predict genes important for self-renewal and pluripotency of mouse embryonic stem cells.

Author information

Xu Huilei, Lemischka Ihor R, Ma'ayan Avi

Affiliation

Department of Pharmacology and Systems Therapeutics, Mount Sinai School of Medicine, 1 Gustave L. Levy Place, New York, New York 10029, USA.

Publication information

BMC Syst Biol. 2010 Dec 21;4:173. doi: 10.1186/1752-0509-4-173.

Abstract

BACKGROUND

Mouse embryonic stem cells (mESCs) are derived from the inner cell mass of a developing blastocyst and can be cultured indefinitely in vitro. Their defining features are the ability to self-renew and to differentiate into all adult cell types. Genes that maintain mESC self-renewal and pluripotency are of interest to stem cell biologists. Although significant progress has been made toward identifying and characterizing such genes, the list is still incomplete and controversial. For example, the overlap among candidate self-renewal and pluripotency genes across different RNAi screens is surprisingly small. Meanwhile, machine learning approaches have been used to analyze multi-dimensional experimental data and to integrate results from many studies, yet they have not been applied specifically to predicting and classifying self-renewal and pluripotency genes.

RESULTS

In this study we developed a supervised machine learning framework, based on support vector machines (SVMs), that classifies genes as mESC stemness membership genes (MSMG), that is, genes important for self-renewal and pluripotency. The training data were derived from mESC-related mRNA microarray studies that measured gene expression at various stages of early differentiation, and from ChIP-seq studies in mESCs that profiled genome-wide binding of key transcription factors, such as Nanog, Oct4, and Sox2, to the regulatory regions of other genes. We evaluated the accuracy and generality of the classifier by comparing it with other classification methods under leave-one-out cross-validation. Finally, we used two sets of candidate genes from genome-wide RNA interference screens to test the generality and potential application of the classifier.
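The evaluation scheme described above can be illustrated with a minimal sketch. It is not the authors' implementation: the two features, all numeric values, the candidate names (`cand_A`, `cand_B`, `cand_C`), and the nearest-centroid baseline are assumptions made up for illustration, and the linear SVM is trained with plain sub-gradient descent rather than a production solver. The sketch compares two classifiers under leave-one-out cross-validation and then ranks hypothetical screen candidates by SVM decision score.

```python
# Illustrative sketch only: synthetic data and hypothetical feature names.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=500):
    """Linear soft-margin SVM fitted by batch sub-gradient descent on
    (lam/2)*||w||^2 + mean hinge loss. Returns a scoring function."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]          # gradient of the L2 penalty
        grad_b = 0.0
        for xi, yi in zip(X, y):
            if yi * (dot(w, xi) + b) < 1:        # sample violates the margin
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return lambda x: dot(w, x) + b

def train_nearest_centroid(X, y):
    """Baseline classifier (an assumption, used only for comparison)."""
    def centroid(label):
        rows = [xi for xi, yi in zip(X, y) if yi == label]
        return [sum(col) / len(rows) for col in zip(*rows)]
    pos, neg = centroid(1), centroid(-1)
    def score(x):
        d_pos = sum((a - c) ** 2 for a, c in zip(x, pos))
        d_neg = sum((a - c) ** 2 for a, c in zip(x, neg))
        return d_neg - d_pos                     # > 0 means closer to positives
    return score

def leave_one_out_accuracy(train, X, y):
    """Hold out each gene once, train on the rest, test on the held-out gene."""
    correct = 0
    for i in range(len(X)):
        score = train(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        predicted = 1 if score(X[i]) > 0 else -1
        correct += predicted == y[i]
    return correct / len(X)

# Each row: [expression drop upon differentiation (log2),
#            fraction of Nanog/Oct4/Sox2 bound near the gene].
# All values are invented for this example.
X = [[2.0, 1.0], [1.8, 0.67], [2.2, 1.0], [1.5, 0.67], [2.5, 1.0], [1.9, 1.0],
     [0.1, 0.0], [-0.2, 0.33], [0.3, 0.0], [0.0, 0.0], [-0.5, 0.33], [0.2, 0.0]]
y = [1] * 6 + [-1] * 6                           # +1 = stemness gene, -1 = other

svm_acc = leave_one_out_accuracy(train_linear_svm, X, y)
centroid_acc = leave_one_out_accuracy(train_nearest_centroid, X, y)
print("LOOCV accuracy, linear SVM:", svm_acc)
print("LOOCV accuracy, nearest centroid:", centroid_acc)

# Rank hypothetical RNAi-screen candidates by SVM decision score.
candidates = {"cand_A": [1.7, 1.0], "cand_B": [0.1, 0.33], "cand_C": [1.0, 0.33]}
svm_score = train_linear_svm(X, y)
ranked = sorted(candidates, key=lambda g: svm_score(candidates[g]), reverse=True)
print("Candidates, highest score first:", ranked)
```

Ranking by the signed decision score rather than by the hard class label is what allows a classifier of this kind to prioritize candidates: genes far on the positive side of the boundary come first. In practice a library implementation with kernel support would usually replace the hand-written trainer.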

CONCLUSIONS

Our results show that an SVM approach can be useful for prioritizing genes for functional validation experiments and can complement the analysis of high-throughput profiling data in stem cell research.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7beb/3019180/d1edb35d1677/1752-0509-4-173-1.jpg
