Imputing gene expression to maximize platform compatibility.

Authors

Zhou Weizhuang, Han Lichy, Altman Russ B

Affiliations

Department of Bioengineering.

Biomedical Informatics Training Program.

Publication

Bioinformatics. 2017 Feb 15;33(4):522-528. doi: 10.1093/bioinformatics/btw664.

DOI:10.1093/bioinformatics/btw664
PMID:27797771
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5408923/
Abstract

Microarray measurements of gene expression constitute a large fraction of publicly shared biological data, and are available in the Gene Expression Omnibus (GEO). Many studies use GEO data to shape hypotheses and improve statistical power. Within GEO, the Affymetrix HG-U133A and HG-U133 Plus 2.0 are the two most commonly used microarray platforms for human samples; the HG-U133 Plus 2.0 platform contains 54 220 probes and the HG-U133A array contains a proper subset (21 722 probes). When different platforms are involved, the subset of common genes is most easily compared. This approach results in the exclusion of substantial measured data and can limit downstream analysis. To predict the expression values for the genes unique to the HG-U133 Plus 2.0 platform, we constructed a series of gene expression inference models based on genes common to both platforms. Our model predicts gene expression values that are within the variability observed in controlled replicate studies and are highly correlated with measured data. Using six previously published studies, we also demonstrate the improved performance of the enlarged feature space generated by our model in downstream analysis.
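The inference idea in the abstract can be illustrated with a minimal sketch: learn, from samples measured on the larger platform, a model that predicts one platform-specific probe from the probes shared by both platforms, then apply it to samples that only have the shared probes. The abstract does not state the model family, so ridge regression is used here purely as an assumption. The data are synthetic, and the names `fit_ridge`, `impute`, and `alpha` are hypothetical and unrelated to the affyImpute package.

```python
import numpy as np


def fit_ridge(X, y, alpha=1.0):
    """Fit ridge regression with an unpenalized intercept.

    X: (n_samples, n_common_probes) expression of probes shared by both platforms.
    y: (n_samples,) expression of one probe found only on the larger platform.
    Returns (weights, intercept).
    """
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean  # center so the intercept is not penalized
    yc = y - y_mean
    n_features = X.shape[1]
    # Closed-form ridge solution: (Xc^T Xc + alpha * I) w = Xc^T yc
    w = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(n_features), Xc.T @ yc)
    b = y_mean - x_mean @ w
    return w, b


def impute(X, w, b):
    """Predict the platform-specific probe from the shared probes."""
    return X @ w + b


# Synthetic stand-in for real data (assumption: a roughly linear relationship).
rng = np.random.default_rng(0)
n_train, n_test, n_common = 200, 50, 30
true_w = rng.normal(size=n_common)
X_train = rng.normal(size=(n_train, n_common))
X_test = rng.normal(size=(n_test, n_common))
y_train = X_train @ true_w + 5.0 + rng.normal(scale=0.1, size=n_train)
y_test = X_test @ true_w + 5.0 + rng.normal(scale=0.1, size=n_test)

w, b = fit_ridge(X_train, y_train, alpha=1.0)
y_pred = impute(X_test, w, b)

# Agreement between imputed and held-out measured values.
r = np.corrcoef(y_pred, y_test)[0, 1]
print(f"Pearson r between imputed and measured values: {r:.3f}")
```

In practice one such model would be fitted per platform-specific probe, and the held-out correlation gives a per-probe check of how well the imputed values track measured ones.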

AVAILABILITY AND IMPLEMENTATION

The gene inference model described in this paper is available as an R package (affyImpute), which can be downloaded at http://simtk.org/home/affyimpute.

CONTACT

rbaltman@stanford.edu.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9172/5408923/3bae19b27897/btw664f1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9172/5408923/d523929b280b/btw664f2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9172/5408923/596dc0deee81/btw664f3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9172/5408923/c04858506675/btw664f4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9172/5408923/f0dc260807ba/btw664f5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9172/5408923/1787ffdb16ff/btw664f6.jpg
