Interaction-Based Feature Selection for Uncovering Cancer Driver Genes Through Copy Number-Driven Expression Level.

Authors

Park Heewon, Niida Atsushi, Imoto Seiya, Miyano Satoru

Affiliations

1 Faculty of Global and Science Studies, Yamaguchi University, Yamaguchi Prefecture, Japan.

2 Health Intelligence Center, Institute of Medical Science, University of Tokyo, Tokyo, Japan.

Publication information

J Comput Biol. 2017 Feb;24(2):138-152. doi: 10.1089/cmb.2016.0140. Epub 2016 Oct 19.


DOI: 10.1089/cmb.2016.0140
PMID: 27759426

Abstract

Driver gene selection is crucial for understanding the heterogeneous system of cancer. Various statistical strategies have been proposed to identify cancer driver genes, and L1-type regularization methods in particular have drawn a large amount of attention. However, these approaches have been developed purely from algorithmic and statistical points of view, and existing studies have applied them to genomic data analysis without considering biological knowledge. We consider a statistical strategy that incorporates biological knowledge to identify cancer driver genes. Copy number alterations are considered to drive cancer pathogenesis, and regions of strong interaction between copy number alterations and expression levels are known as a tumor-related symptom. We incorporate the influence of copy number alterations on expression levels into the cancer driver gene-selection process. To quantify the dependence of expression levels on copy number alterations, we consider [Formula: see text] and [Formula: see text] effects of copy number alterations on the expression levels of genes, and incorporate this symptom of tumor pathogenesis into the gene-selection procedure. We then propose an interaction-based feature-selection strategy based on adaptive L1-type regularization and random lasso procedures. The proposed method imposes a large penalty on genes with a low dependency between the two features, so the coefficients of those genes are estimated to be small or exactly 0. This implies that the proposed method can provide biologically relevant results in cancer driver gene selection. Monte Carlo simulations and analysis of The Cancer Genome Atlas (TCGA) data show that the proposed strategy is effective for high-dimensional genomic data analysis and provides reliable and biologically relevant results for cancer driver gene selection.

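The penalty mechanism described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a plain weighted (adaptive) lasso solved by coordinate descent, it uses the Pearson correlation between a gene's copy number and its expression as a stand-in for the dependency measure, and it omits the random lasso resampling step entirely. The function names, the weight formula `1 / (|r| + eps) ** gamma`, and the toy data are all hypothetical choices made for illustration.

```python
# Hypothetical sketch: a weighted lasso in which each gene's penalty grows
# as the dependency between its copy number and expression shrinks.
# The weight formula and all names are assumptions, not the paper's method.

def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if degenerate)."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5 if va > 0 and vb > 0 else 0.0

def soft_threshold(z, t):
    """Soft-thresholding operator used by lasso coordinate descent."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def weighted_lasso(X, y, weights, lam, n_iter=100):
    """Minimize (1/2n)||y - Xb||^2 + lam * sum_j weights[j] * |b_j|.

    X is a list of rows; columns and y are assumed centered (no intercept).
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X @ beta, with beta starting at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the partial residual.
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam * weights[j]) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta

def dependency_weights(copy_number, expression, eps=0.1, gamma=1.0):
    """Assumed weight: large when |corr(copy number, expression)| is small."""
    return [1.0 / (abs(pearson(c, e)) + eps) ** gamma
            for c, e in zip(copy_number, expression)]

# Toy data built from mutually orthogonal, zero-mean +/-1 patterns.
h1 = [1, -1, 1, -1, 1, -1, 1, -1]
h2 = [1, 1, -1, -1, 1, 1, -1, -1]
h3 = [1, -1, -1, 1, 1, -1, -1, 1]

copy_number = [h1, h2]  # per-gene copy number profiles (gene A, gene B)
expression = [h1, h3]   # gene A tracks its copy number; gene B does not
X = [[expression[0][i], expression[1][i]] for i in range(8)]
y = [X[i][0] + X[i][1] for i in range(8)]  # both genes equally predictive

lam = 0.2
weights = dependency_weights(copy_number, expression)
beta_weighted = weighted_lasso(X, y, weights, lam)
beta_plain = weighted_lasso(X, y, [1.0, 1.0], lam)

print("dependency-weighted:", beta_weighted)
print("unweighted lasso:   ", beta_plain)
```

In this toy setup both genes are equally associated with the response, so the unweighted lasso shrinks both coefficients to about 0.8. With the dependency weights, gene B (whose expression is uncorrelated with its copy number) is set exactly to 0, while gene A keeps a coefficient of about 0.82.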
