Predicting phenotypes from microarrays using amplified, initially marginal, eigenvector regression.

Affiliation

Department of Statistics, Indiana University, Bloomington, IN, USA.

Publication information

Bioinformatics. 2017 Jul 15;33(14):i350-i358. doi: 10.1093/bioinformatics/btx265.

DOI: 10.1093/bioinformatics/btx265
PMID: 28881997
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5870707/
Abstract

MOTIVATION

The discovery of relationships between gene expression measurements and phenotypic responses is hampered by both computational and statistical impediments. Conventional statistical methods are less than ideal because they either fail to select relevant genes, predict poorly, ignore the unknown interaction structure between genes, or are computationally intractable. Thus, the creation of new methods which can handle many expression measurements on relatively small numbers of patients while also uncovering gene-gene relationships and predicting well is desirable.

RESULTS

We develop a new technique for using the marginal relationship between gene expression measurements and patient survival outcomes to identify a small subset of genes which appear highly relevant for predicting survival, produce a low-dimensional embedding based on this small subset, and amplify this embedding with information from the remaining genes. We motivate our methodology by using gene expression measurements to predict survival time for patients with diffuse large B-cell lymphoma, illustrate the behavior of our methodology on carefully constructed synthetic examples, and test it on a number of other gene expression datasets. Our technique is computationally tractable, generally outperforms other methods, is extensible to other phenotypes, and also identifies different genes (relative to existing methods) for possible future study.
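
The three steps named in this paragraph (marginal screening, a low-dimensional embedding built from the selected genes, and amplification with the remaining genes) can be illustrated with a minimal NumPy sketch. This is not the authors' code. It assumes a continuous response rather than censored survival times, uses absolute correlation as the marginal score, and extends the loadings to the unselected genes with a Nyström-style formula, none of which the abstract specifies. The function names and the parameters `n_select` and `n_components` are hypothetical.

```python
import numpy as np


def marginal_scores(X, y):
    """Absolute Pearson correlation of each column of X with y.

    Assumes no column of X is constant (otherwise the denominator is zero).
    """
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc)
    return np.abs(Xc.T @ yc) / denom


def amplified_embedding(X, y, n_select, n_components):
    """Screen genes marginally, embed the subset, extend to the other genes.

    Returns the selected column indices, a (p, n_components) loading matrix,
    and the (n, n_components) embedding of the centered data.
    """
    Xc = X - X.mean(axis=0)

    # Step 1: rank genes by marginal association and keep the top n_select.
    order = np.argsort(marginal_scores(X, y))[::-1]
    selected, rest = order[:n_select], order[n_select:]

    # Step 2: leading singular vectors of the selected submatrix.
    U1, s1, V1t = np.linalg.svd(Xc[:, selected], full_matrices=False)
    U1, s1, V1 = U1[:, :n_components], s1[:n_components], V1t[:n_components].T

    # Step 3 (assumed form): Nystrom-style extension of the loadings,
    # V2 = X2' X1 V1 diag(s1^-2) = X2' U1 diag(1 / s1).
    V2 = Xc[:, rest].T @ U1 / s1

    V = np.zeros((X.shape[1], n_components))
    V[selected], V[rest] = V1, V2
    return selected, V, Xc @ V


def fit_predict(X, y, n_select=20, n_components=3):
    """Least-squares regression of y on the amplified embedding."""
    _, _, Z = amplified_embedding(X, y, n_select, n_components)
    design = np.column_stack([np.ones(len(y)), Z])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return design @ coef
```

With an expression matrix `X` (patients in rows, genes in columns) and a response vector `y`, `fit_predict(X, y)` returns in-sample fitted values. In practice `n_select` and `n_components` would be chosen by cross-validation.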

AVAILABILITY AND IMPLEMENTATION

All of the code and data are available at http://mypage.iu.edu/~dajmcdon/research/ .

CONTACT

dajmcdon@indiana.edu.

SUPPLEMENTARY INFORMATION

Supplementary material is available at Bioinformatics online.
