
An empirical assessment of validation practices for molecular classifiers.

Affiliation

Institute for Clinical Research and Health Policy Studies at Tufts Medical Center, USA.

Publication information

Brief Bioinform. 2011 May;12(3):189-202. doi: 10.1093/bib/bbq073. Epub 2011 Feb 7.

DOI: 10.1093/bib/bbq073
PMID: 21300697
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3088312/
Abstract

Proposed molecular classifiers may be overfit to idiosyncrasies of noisy genomic and proteomic data. Cross-validation methods are often used to obtain estimates of classification accuracy, but both simulations and case studies suggest that, when inappropriate methods are used, bias may ensue. Bias can be bypassed and generalizability can be tested by external (independent) validation. We evaluated 35 studies that have reported on external validation of a molecular classifier. We extracted information on study design and methodological features, and compared the performance of molecular classifiers in internal cross-validation versus external validation for 28 studies where both had been performed. We demonstrate that the majority of studies pursued cross-validation practices that are likely to overestimate classifier performance. Most studies were markedly underpowered to detect a 20% decrease in sensitivity or specificity between internal cross-validation and external validation [median power was 36% (IQR, 21-61%) and 29% (IQR, 15-65%), respectively]. The median reported classification performance for sensitivity and specificity was 94% and 98%, respectively, in cross-validation and 88% and 81% for independent validation. The relative diagnostic odds ratio was 3.26 (95% CI 2.04-5.21) for cross-validation versus independent validation. Finally, we reviewed all studies (n = 758) which cited those in our study sample, and identified only one instance of additional subsequent independent validation of these classifiers. In conclusion, these results document that many cross-validation practices employed in the literature are potentially biased and genuine progress in this field will require adoption of routine external validation of molecular classifiers, preferably in much larger studies than in current practice.
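To make the size of the reported gap concrete, the sketch below computes a diagnostic odds ratio (DOR) from a sensitivity and specificity pair using the standard definition, DOR = [sens / (1 - sens)] / [(1 - spec) / spec], and applies it to the median values quoted in the abstract. The function name and the idea of dividing the two DORs are illustrative assumptions, not the authors' method. The relative DOR of 3.26 in the abstract is an estimate across studies, so this naive ratio of medians is not expected to reproduce it.

```python
def diagnostic_odds_ratio(sensitivity: float, specificity: float) -> float:
    """Return the diagnostic odds ratio for a sensitivity/specificity pair.

    DOR = [sens / (1 - sens)] / [(1 - spec) / spec]
    Both inputs are proportions strictly between 0 and 1.
    """
    if not (0.0 < sensitivity < 1.0 and 0.0 < specificity < 1.0):
        raise ValueError("sensitivity and specificity must lie strictly between 0 and 1")
    return (sensitivity * specificity) / ((1.0 - sensitivity) * (1.0 - specificity))


# Median values quoted in the abstract (illustrative use only).
dor_cv = diagnostic_odds_ratio(0.94, 0.98)    # internal cross-validation
dor_ext = diagnostic_odds_ratio(0.88, 0.81)   # external (independent) validation

# Naive ratio of the two DORs; NOT the pooled relative DOR reported by the study.
naive_ratio = dor_cv / dor_ext

print(f"DOR, cross-validation:    {dor_cv:.1f}")
print(f"DOR, external validation: {dor_ext:.1f}")
print(f"Naive ratio of DORs:      {naive_ratio:.1f}")
```

Because the DOR grows very quickly as sensitivity or specificity approaches 100%, a drop of a few percentage points near the top of the scale corresponds to a large change in odds. This is one reason a summary computed from medians can differ substantially from an estimate pooled across individual studies.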
