
AUCTSP: an improved biomarker gene pair class predictor.

Affiliations

Department of Electrical and Computer Engineering, Southern Illinois University, 1230 Lincoln Drive, Carbondale, 62901, IL, USA.

Department of Biostatistics, Indiana University School of Public Health, 410 West 10th Street, Suite 3000, Indianapolis, 46202, IN, USA.

Publication information

BMC Bioinformatics. 2018 Jun 26;19(1):244. doi: 10.1186/s12859-018-2231-1.

DOI:10.1186/s12859-018-2231-1
PMID:29940833
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6020231/
Abstract

BACKGROUND

The Top Scoring Pair (TSP) classifier, based on the concept of relative ranking reversals in the expressions of pairs of genes, has been proposed as a simple, accurate, and easily interpretable decision rule for classification and class prediction of gene expression profiles. The idea that differences in gene expression ranking are associated with presence or absence of disease is compelling and has strong biological plausibility. Nevertheless, the TSP formulation ignores significant available information which can improve classification accuracy and is vulnerable to selecting genes which do not have differential expression in the two conditions ("pivot" genes).
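The pairwise score behind the original TSP can be sketched as follows. This is a minimal illustration, assuming the commonly cited form of the score, |P(X_i < X_j | class 0) - P(X_i < X_j | class 1)|, estimated within subjects; the function name `tsp_score` and the toy data are hypothetical and not taken from the paper. The example also shows how a constant "pivot" gene can end up in a perfectly scoring pair even though it is not differentially expressed.

```python
import numpy as np

def tsp_score(x_i, x_j, labels):
    """Within-subject TSP score for one gene pair (illustrative sketch).

    For each class, estimate the fraction of subjects in which gene i is
    expressed below gene j, then return the absolute difference between
    the two class-wise fractions.
    """
    x_i, x_j, labels = map(np.asarray, (x_i, x_j, labels))
    p0 = np.mean(x_i[labels == 0] < x_j[labels == 0])
    p1 = np.mean(x_i[labels == 1] < x_j[labels == 1])
    return float(abs(p0 - p1))

# Toy data (assumed): gene A changes between classes, gene P is constant.
labels = [0, 0, 0, 1, 1, 1]
gene_a = [5, 6, 7, 1, 2, 1]   # high in class 0, low in class 1
gene_p = [3, 3, 3, 3, 3, 3]   # "pivot": same level in both classes

# The ordering of A and P flips in every subject, so the pair scores 1.0
# although P itself carries no class information.
print(tsp_score(gene_a, gene_p, labels))
```

A perfect score here comes entirely from gene A moving across the fixed level of gene P, which is the vulnerability the abstract describes.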

RESULTS

We introduce the AUCTSP classifier as an alternative rank-based estimator of the magnitude of the ranking reversals involved in the original TSP. The proposed estimator is based on the Area Under the Receiver Operating Characteristic (ROC) Curve (AUC) and as such, takes into account the separation of the entire distribution of gene expression levels in gene pairs under the conditions considered, as opposed to comparing gene rankings within individual subjects as in the original TSP formulation. Through extensive simulations and case studies involving classification in ovarian, leukemia, colon, breast and prostate cancers and diffuse large b-cell lymphoma, we show the superiority of the proposed approach in terms of improving classification accuracy, avoiding overfitting and being less prone to selecting non-informative (pivot) genes.
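One way to read the AUC-based idea is sketched below. It is an assumption drawn from the abstract, not the paper's exact estimator: within each class, the probability that gene i lies below gene j is estimated over all cross-subject pairs (a Mann-Whitney style AUC) rather than within each subject, and the pair score is the absolute difference between the two class-wise AUCs. The names `auc` and `auc_pair_score` and the toy data are hypothetical.

```python
import numpy as np

def auc(a, b):
    """Mann-Whitney estimate of P(a < b) + 0.5 * P(a == b).

    Every value in `a` is compared with every value in `b`, so the
    estimate reflects the separation of the two whole distributions.
    """
    a = np.asarray(a, dtype=float)[:, None]
    b = np.asarray(b, dtype=float)[None, :]
    return float(np.mean((a < b) + 0.5 * (a == b)))

def auc_pair_score(x_i, x_j, labels):
    """Assumed AUC-based pair score: |AUC in class 0 - AUC in class 1|."""
    x_i, x_j, labels = map(np.asarray, (x_i, x_j, labels))
    s0 = auc(x_i[labels == 0], x_j[labels == 0])
    s1 = auc(x_i[labels == 1], x_j[labels == 1])
    return abs(s0 - s1)

# Toy data (assumed): the within-subject ordering flips perfectly between
# classes, but the two genes' distributions overlap heavily in each class.
labels = [0, 0, 0, 1, 1, 1]
gene_i = [1, 3, 5, 2, 4, 6]
gene_j = [2, 4, 6, 1, 3, 5]

print(auc_pair_score(gene_i, gene_j, labels))
```

On this toy data a within-subject comparison would give the maximum score of 1, because gene i is below gene j in every class-0 subject and above it in every class-1 subject. The cross-subject score is only about 0.33, since the expression ranges of the two genes largely overlap in both classes.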

CONCLUSIONS

The proposed AUCTSP is a simple yet reliable and robust rank-based classifier for gene expression classification. While the AUCTSP works by the same principle as TSP, its ability to determine the top scoring gene pair based on the relative rankings of two marker genes across all subjects as opposed to each individual subject results in significant performance gains in classification accuracy. In addition, the proposed method tends to avoid selection of non-informative (pivot) genes as members of the top-scoring pair.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/90a3/6020231/a11d82b8b191/12859_2018_2231_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/90a3/6020231/61289171e918/12859_2018_2231_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/90a3/6020231/706ca5460bb6/12859_2018_2231_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/90a3/6020231/c6095af2336e/12859_2018_2231_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/90a3/6020231/498e4e74a3e2/12859_2018_2231_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/90a3/6020231/9901bf8f83ee/12859_2018_2231_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/90a3/6020231/29dca7be542e/12859_2018_2231_Fig7_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/90a3/6020231/ebbb02c0fc8c/12859_2018_2231_Fig8_HTML.jpg
