A comparative study of variable selection methods in the context of developing psychiatric screening instruments.

Affiliation

Department of Statistics, Columbia University, New York, NY 10027, U.S.A.

Publication information

Stat Med. 2014 Feb 10;33(3):401-21. doi: 10.1002/sim.5937. Epub 2013 Aug 11.

DOI:10.1002/sim.5937
PMID:23934941
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4026268/
Abstract

The development of screening instruments for psychiatric disorders involves item selection from a pool of items in existing questionnaires assessing clinical and behavioral phenotypes. A screening instrument should consist of only a few items and have good accuracy in classifying cases and non-cases. Variable/item selection methods such as Least Absolute Shrinkage and Selection Operator (LASSO), Elastic Net, Classification and Regression Tree, Random Forest, and the two-sample t-test can be used in such context. Unlike situations where variable selection methods are most commonly applied (e.g., ultra high-dimensional genetic or imaging data), psychiatric data usually have lower dimensions and are characterized by the following factors: correlations and possible interactions among predictors, unobservability of important variables (i.e., true variables not measured by available questionnaires), amount and pattern of missing values in the predictors, and prevalence of cases in the training data. We investigate how these factors affect the performance of several variable selection methods and compare them with respect to selection performance and prediction error rate via simulations. Our results demonstrated that: (1) for complete data, LASSO and Elastic Net outperformed other methods with respect to variable selection and future data prediction, and (2) for certain types of incomplete data, Random Forest induced bias in imputation, leading to incorrect ranking of variable importance. We propose the Imputed-LASSO combining Random Forest imputation and LASSO; this approach offsets the bias in Random Forest and offers a simple yet efficient item selection approach for missing data. As an illustration, we apply the methods to items from the standard Autism Diagnostic Interview-Revised version.
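The impute-then-select idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation. It assumes simulated item scores, a continuous-outcome LASSO fitted by coordinate descent on a 0/1 case indicator, and column-mean imputation as a deliberately simple stand-in for the Random Forest imputation step of Imputed-LASSO. All function names, the penalty value `lam=0.1`, and the data-generating rule are hypothetical choices made for the example.

```python
import random

def mean_impute(rows):
    """Replace None with the column mean (stand-in for Random Forest imputation)."""
    p = len(rows[0])
    means = []
    for j in range(p):
        seen = [r[j] for r in rows if r[j] is not None]
        means.append(sum(seen) / len(seen))
    return [[means[j] if r[j] is None else r[j] for j in range(p)] for r in rows]

def standardize_columns(rows):
    """Return the data as columns, each scaled to mean 0 and variance 1."""
    n, p = len(rows), len(rows[0])
    cols = []
    for j in range(p):
        col = [r[j] for r in rows]
        mu = sum(col) / n
        sd = (sum((v - mu) ** 2 for v in col) / n) ** 0.5
        cols.append([(v - mu) / sd for v in col])
    return cols

def soft_threshold(z, lam):
    """Shrink z toward zero by lam; this sets small coefficients exactly to 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def lasso_coordinate_descent(cols, y, lam, n_sweeps=100):
    """Minimize (1/2n)*||y - X b||^2 + lam*||b||_1 for standardized columns."""
    n, p = len(y), len(cols)
    beta = [0.0] * p
    resid = list(y)  # residual y - X b, with b = 0 at the start
    for _ in range(n_sweeps):
        for j in range(p):
            col = cols[j]
            # Covariance of item j with the residual that excludes item j's own fit.
            rho = sum(c * (r + c * beta[j]) for c, r in zip(col, resid)) / n
            new_b = soft_threshold(rho, lam)  # column variance is 1, so no rescaling
            delta = new_b - beta[j]
            if delta != 0.0:
                resid = [r - c * delta for c, r in zip(col, resid)]
                beta[j] = new_b
    return beta

# Simulated questionnaire: 6 items, 400 subjects (hypothetical data).
rng = random.Random(0)
n_subjects, n_items = 400, 6
items, outcome = [], []
for _ in range(n_subjects):
    x = [rng.gauss(0, 1) for _ in range(n_items)]
    # Assumed truth: only items 0 and 1 drive case status.
    outcome.append(1.0 if x[0] + x[1] + 0.5 * rng.gauss(0, 1) > 0 else 0.0)
    # Delete about 10% of item responses completely at random.
    items.append([None if rng.random() < 0.1 else v for v in x])

completed = mean_impute(items)            # step 1: fill in missing responses
cols = standardize_columns(completed)     # put items on a common scale
y_mean = sum(outcome) / n_subjects
y_centered = [v - y_mean for v in outcome]
beta = lasso_coordinate_descent(cols, y_centered, lam=0.1)  # step 2: LASSO
selected = [j for j, b in enumerate(beta) if b != 0.0]

print("coefficients:", [round(b, 3) for b in beta])
print("selected items:", selected)
```

In practice the imputation step would be replaced by a Random Forest based imputer, the penalty would be chosen by cross-validation, and a logistic loss would usually suit a case/non-case outcome better; the sketch only shows how imputation feeds into sparse item selection.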
