GENESHIFT: a nonparametric approach for integrating microarray gene expression data based on the inner product as a distance measure between the distributions of genes.

Affiliation

Vrije Universiteit Brussel, Brussels, Belgium.

Publication information

IEEE/ACM Trans Comput Biol Bioinform. 2013 Mar-Apr;10(2):383-92. doi: 10.1109/TCBB.2013.12.

DOI: 10.1109/TCBB.2013.12
PMID: 23929862

Abstract

The potential of microarray gene expression (MAGE) data is only partially explored because individual studies contain a limited number of samples. This limitation can be surmounted by merging or integrating data sets that originate from independent MAGE experiments designed to study the same biological problem. However, this process is hindered by batch effects, which are study-dependent and result in random data distortion; numerical transformations are therefore needed to render the integration of different data sets accurate and meaningful. Our contribution in this paper is twofold. First, we propose GENESHIFT, a new nonparametric batch effect removal method based on two key elements from statistics: empirical density estimation and the inner product as a distance measure between two probability density functions. Second, we introduce a new validation index for batch effect removal methods, based on the observation that samples from two independent studies drawn from the same population should exhibit similar probability density functions. We evaluated and compared GENESHIFT with four other state-of-the-art methods for batch effect removal: batch-mean centering, empirical Bayes (COMBAT), distance-weighted discrimination, and cross-platform normalization. Several validation indices providing complementary information about the efficiency of batch effect removal methods were employed in our validation framework. The results show that none of the methods clearly outperforms the others. Moreover, most of the methods used for comparison perform very well with respect to some validation indices while performing very poorly with respect to others. GENESHIFT exhibits robust performance, and its average rank is the highest among the average ranks of all methods used for comparison.
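
The abstract names two ingredients, empirical density estimation and the inner product between two probability density functions, but gives no algorithmic detail. The Python sketch below is only an illustration of how those two ingredients could fit together for a single gene measured in two batches; it is not the published GENESHIFT procedure. The function names, the Gaussian kernel, the fixed bandwidth, and the grid search over candidate offsets are all assumptions made for this example. Note that a larger inner product means more overlap between the two densities, so the sketch maximizes it.

```python
import numpy as np


def gaussian_kde_on_grid(samples, grid, bandwidth):
    """Evaluate a Gaussian kernel density estimate of `samples` on `grid`."""
    samples = np.asarray(samples, dtype=float)
    z = (grid[:, None] - samples[None, :]) / bandwidth
    kernel = np.exp(-0.5 * z ** 2) / (bandwidth * np.sqrt(2.0 * np.pi))
    return kernel.mean(axis=1)


def density_inner_product(x, y, bandwidth=0.5, n_grid=512):
    """Approximate the integral of f_x(t) * f_y(t) dt on a shared grid."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    lo = min(x.min(), y.min()) - 4.0 * bandwidth
    hi = max(x.max(), y.max()) + 4.0 * bandwidth
    grid = np.linspace(lo, hi, n_grid)
    fx = gaussian_kde_on_grid(x, grid, bandwidth)
    fy = gaussian_kde_on_grid(y, grid, bandwidth)
    step = grid[1] - grid[0]
    # Simple Riemann sum; adequate for smooth densities on a fine grid.
    return float(np.sum(fx * fy) * step)


def best_shift(reference, batch, candidates, bandwidth=0.5):
    """Pick the offset (hypothetical grid search) that maximizes the
    inner product between the reference density and the shifted batch."""
    batch = np.asarray(batch, dtype=float)
    scores = [
        density_inner_product(reference, batch + s, bandwidth)
        for s in candidates
    ]
    return float(candidates[int(np.argmax(scores))])


if __name__ == "__main__":
    # Synthetic expression values for one gene in two studies (made-up data).
    rng = np.random.default_rng(0)
    reference = rng.normal(5.0, 1.0, 500)
    batch = rng.normal(7.0, 1.0, 500)  # same shape, offset by a batch effect
    shift = best_shift(reference, batch, np.linspace(-4.0, 4.0, 81))
    print("estimated offset:", shift)
    print("overlap before:", density_inner_product(reference, batch))
    print("overlap after: ", density_inner_product(reference, batch + shift))
```

The same `density_inner_product` helper also hints at the kind of validation index the abstract describes: if two studies sample the same population, their estimated densities should overlap strongly after batch effect removal. How the authors normalize or aggregate such a score across genes is not stated in the abstract.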

Similar articles

1. GENESHIFT: a nonparametric approach for integrating microarray gene expression data based on the inner product as a distance measure between the distributions of genes.
   IEEE/ACM Trans Comput Biol Bioinform. 2013 Mar-Apr;10(2):383-92. doi: 10.1109/TCBB.2013.12.
2. Comparison of seven methods for producing Affymetrix expression scores based on False Discovery Rates in disease profiling data.
   BMC Bioinformatics. 2005 Feb 10;6:26. doi: 10.1186/1471-2105-6-26.
3. YuGene: a simple approach to scale gene expression data derived from different platforms for integrated analyses.
   Genomics. 2014 Apr;103(4):239-51. doi: 10.1016/j.ygeno.2014.03.001. Epub 2014 Mar 22.
4. The effects of normalization on the correlation structure of microarray data.
   BMC Bioinformatics. 2005 May 16;6:120. doi: 10.1186/1471-2105-6-120.
5. A GMM-IG framework for selecting genes as expression panel biomarkers.
   Artif Intell Med. 2010 Feb-Mar;48(2-3):75-82. doi: 10.1016/j.artmed.2009.07.006. Epub 2009 Dec 8.
6. Quadratic regression analysis for gene discovery and pattern recognition for non-cyclic short time-course microarray experiments.
   BMC Bioinformatics. 2005 Apr 25;6:106. doi: 10.1186/1471-2105-6-106.
7. In silico microdissection of microarray data from heterogeneous cell populations.
   BMC Bioinformatics. 2005 Mar 14;6:54. doi: 10.1186/1471-2105-6-54.
8. Mining published lists of cancer related microarray experiments: identification of a gene expression signature having a critical role in cell-cycle control.
   BMC Bioinformatics. 2005 Dec 1;6 Suppl 4(Suppl 4):S14. doi: 10.1186/1471-2105-6-S4-S14.
9. Proximity measures for clustering gene expression microarray data: a validation methodology and a comparative analysis.
   IEEE/ACM Trans Comput Biol Bioinform. 2013 Jul-Aug;10(4):845-57. doi: 10.1109/TCBB.2013.9.
10. Batch effect removal methods for microarray gene expression data integration: a survey.
    Brief Bioinform. 2013 Jul;14(4):469-90. doi: 10.1093/bib/bbs037. Epub 2012 Jul 31.

Cited by

1. Data harmonisation for information fusion in digital healthcare: A state-of-the-art systematic review, meta-analysis and future research directions.
   Inf Fusion. 2022 Jun;82:99-122. doi: 10.1016/j.inffus.2022.01.001.
2. Development of a Drug-Response Modeling Framework to Identify Cell Line Derived Translational Biomarkers That Can Predict Treatment Outcome to Erlotinib or Sorafenib.
   PLoS One. 2015 Jun 24;10(6):e0130700. doi: 10.1371/journal.pone.0130700. eCollection 2015.
3. Integrative omics analysis. A study based on Plasmodium falciparum mRNA and protein data.
   BMC Syst Biol. 2014;8 Suppl 2(Suppl 2):S4. doi: 10.1186/1752-0509-8-S2-S4. Epub 2014 Mar 13.