Expression-based prediction of human essential genes and candidate lncRNAs in cancer cells.

Affiliations

Department of Genetics and Biochemistry, Clemson University, Clemson, SC 29634, USA.

Department of Biological Sciences, Clemson University, Clemson, SC 29634, USA.

Publication information

Bioinformatics. 2021 Apr 20;37(3):396-403. doi: 10.1093/bioinformatics/btaa717.

DOI: 10.1093/bioinformatics/btaa717
PMID: 32790840

Abstract

MOTIVATION

Essential genes are required for reproductive success at either the cellular or the organismal level. Identifying essential genes is important for understanding core biological processes and for finding effective therapeutic drug targets. However, experimental identification of essential genes is costly, time-consuming and labor-intensive. Although several machine learning models have been developed to predict essential genes, these models are not readily applicable to lncRNAs. Moreover, the currently available models cannot be used to predict essential genes in a specific cancer type.

RESULTS

In this study, we developed a new machine learning approach, XGEP (eXpression-based Gene Essentiality Prediction), to predict essential genes and candidate lncRNAs in cancer cells. The novelty of XGEP lies in its use of relevant features derived from the TCGA transcriptome dataset through collaborative embedding. When evaluated on the pan-cancer dataset, XGEP accurately predicted human essential genes and achieved significantly higher performance than previous models. Notably, several candidate lncRNAs selected by XGEP are reported to promote cell proliferation and inhibit cell apoptosis. Moreover, XGEP also demonstrated superior performance in identifying essential genes on cancer-type-specific datasets. The comprehensive lists of candidate essential genes in specific cancer types may be used to guide experimental characterization and facilitate the discovery of drug targets for cancer therapy.
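The general idea of expression-based essentiality prediction can be illustrated with a small, self-contained sketch. This is not the XGEP implementation: the abstract does not specify the embedding algorithm or the classifier, so the sketch assumes plain matrix factorization as a stand-in for collaborative embedding and a nearest-centroid rule as a stand-in classifier. The expression values, gene names (`geneA`, `lncX`, and so on), labels, and function names are all hypothetical.

```python
import random

# Toy expression matrix: rows are genes, columns are tumor samples.
# All values are invented for illustration; they are NOT TCGA data.
EXPRESSION = {
    "geneA": [1.0, 0.9, 1.1, 1.0, 0.9, 1.0],
    "geneB": [0.9, 1.0, 1.0, 1.1, 1.0, 0.9],
    "geneC": [1.1, 1.0, 0.9, 1.0, 1.1, 1.0],
    "geneD": [0.2, 0.1, 0.2, 0.3, 0.2, 0.2],
    "geneE": [0.1, 0.2, 0.3, 0.2, 0.1, 0.2],
    "geneF": [0.3, 0.2, 0.2, 0.1, 0.2, 0.3],
    "lncX": [1.0, 1.1, 0.9, 0.9, 1.0, 1.1],  # unlabeled candidate
    "lncY": [0.2, 0.2, 0.1, 0.2, 0.3, 0.1],  # unlabeled candidate
}
# Hypothetical labels: 1 = essential, 0 = non-essential.
LABELS = {"geneA": 1, "geneB": 1, "geneC": 1,
          "geneD": 0, "geneE": 0, "geneF": 0}


def embed_genes(matrix, dim=2, lr=0.05, epochs=200, seed=0):
    """Factorize matrix ~ gene_vecs x sample_vecs^T with per-entry SGD.

    Assumption: low-rank factorization stands in for the paper's
    collaborative embedding, whose details the abstract does not give.
    """
    rng = random.Random(seed)
    n_genes, n_samples = len(matrix), len(matrix[0])
    gene_vecs = [[rng.uniform(0.0, 0.1) for _ in range(dim)]
                 for _ in range(n_genes)]
    sample_vecs = [[rng.uniform(0.0, 0.1) for _ in range(dim)]
                   for _ in range(n_samples)]
    for _ in range(epochs):
        for g in range(n_genes):
            for s in range(n_samples):
                p, q = gene_vecs[g], sample_vecs[s]
                err = matrix[g][s] - sum(a * b for a, b in zip(p, q))
                for k in range(dim):
                    pk, qk = p[k], q[k]  # use pre-update values for both
                    p[k] = pk + lr * err * qk
                    q[k] = qk + lr * err * pk
    return gene_vecs, sample_vecs


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def predict(vec, centroids):
    """Return the label whose centroid is closest to vec."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: dist2(vec, centroids[label]))


names = list(EXPRESSION)
matrix = [EXPRESSION[name] for name in names]

# Embed every gene (labeled or not) from expression alone.
gene_vecs, sample_vecs = embed_genes(matrix)
embedding = dict(zip(names, gene_vecs))

# Fit the stand-in classifier on labeled genes only.
centroids = {
    label: centroid([embedding[n] for n, y in LABELS.items() if y == label])
    for label in (0, 1)
}

# Score the unlabeled candidates.
predictions = {n: predict(embedding[n], centroids)
               for n in names if n not in LABELS}
print(predictions)
```

Because the embedding step uses only expression values, it can place transcripts without essentiality labels, such as candidate lncRNAs, in the same feature space as the labeled genes. Restricting the sample columns to one cancer type would be one way to adapt such a sketch to a cancer-type-specific setting.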

AVAILABILITY AND IMPLEMENTATION

The source code and datasets used in this study are freely available at https://github.com/BioDataLearning/XGEP.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
