Supervised embedding of textual predictors with applications in clinical diagnostics for pediatric cardiology.

Affiliation

School of Computational Science & Engineering, Georgia Institute of Technology, Atlanta, Georgia, USA.

Publication information

J Am Med Inform Assoc. 2014 Feb;21(e1):e136-42. doi: 10.1136/amiajnl-2013-001792. Epub 2013 Sep 27.

DOI:10.1136/amiajnl-2013-001792
PMID:24076750
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3957389/
Abstract

OBJECTIVE

Electronic health records possess critical predictive information for machine-learning-based diagnostic aids. However, many traditional machine learning methods fail to simultaneously integrate textual data into the prediction process because of its high dimensionality. In this paper, we present a supervised method using Laplacian Eigenmaps to enable existing machine learning methods to estimate both low-dimensional representations of textual data and accurate predictors based on these low-dimensional representations at the same time.
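For orientation, the standard (unsupervised) Laplacian Eigenmaps objective can be written as in the first expression below. The second expression is only an illustrative guess at how a supervised variant might couple the embedding with a predictor; it is not taken from the paper. The symbols $W$ (similarity weights), $D$ (degree matrix), $L$ (graph Laplacian), $y_i$ (embedding of text $i$), $x_i$ (non-textual predictors), $t_i$ (label), $f_\theta$ (predictor), $\ell$ (loss), and $\lambda$ (trade-off weight) are notation assumed here.

```latex
% Standard Laplacian Eigenmaps: keep similar texts close in the embedding.
\min_{Y \in \mathbb{R}^{n \times d}} \; \sum_{i,j} W_{ij}\,\lVert y_i - y_j \rVert^2
  \;=\; 2\,\operatorname{tr}\!\left(Y^\top L Y\right),
\qquad L = D - W, \qquad \text{s.t. } Y^\top D Y = I

% Hypothetical supervised coupling (an assumption, not the authors' formulation):
\min_{Y,\,\theta} \; \sum_{i=1}^{n} \ell\big(f_\theta(x_i, y_i),\, t_i\big)
  \;+\; \lambda \sum_{i,j} W_{ij}\,\lVert y_i - y_j \rVert^2
```

Under a coupling of this kind, alternating optimization would update $\theta$ with $Y$ fixed, then $Y$ with $\theta$ fixed, which matches the abstract's description only at a high level.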

MATERIALS AND METHODS

We present a supervised Laplacian Eigenmap method to enhance predictive models by embedding textual predictors into a low-dimensional latent space, which preserves the local similarities among textual data in high-dimensional space. The proposed implementation performs alternating optimization using gradient descent. For the evaluation, we applied our method to over 2000 patient records from a large single-center pediatric cardiology practice to predict whether patients were diagnosed with cardiac disease. In our experiments, we considered relatively short textual descriptions because of data availability. We compared our method with latent semantic indexing, latent Dirichlet allocation, and local Fisher discriminant analysis. The results were assessed using four metrics: the area under the receiver operating characteristic curve (AUC), Matthews correlation coefficient (MCC), specificity, and sensitivity.
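The four evaluation metrics named above can be computed as in the following minimal sketch. The labels, scores, and the 0.5 decision threshold are toy values invented for illustration; they are not data from the study, and the function names are hypothetical.

```python
import math

def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def sensitivity(tp, fn):
    """True positive rate: fraction of diseased cases that are detected."""
    return tp / (tp + fn) if (tp + fn) else 0.0

def specificity(tn, fp):
    """True negative rate: fraction of healthy cases that are cleared."""
    return tn / (tn + fp) if (tn + fp) else 0.0

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; 0.0 when the denominator vanishes."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

def auc(y_true, scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example (invented values, not study data).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
print(sensitivity(tp, fn), specificity(tn, fp), mcc(tp, tn, fp, fn), auc(y_true, scores))
```

AUC is threshold-free, whereas sensitivity, specificity, and MCC all depend on the chosen cutoff, which is one reason to report them together.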

RESULTS AND DISCUSSION

The results indicate that supervised Laplacian Eigenmaps was the highest-performing method in our study, achieving an AUC of 0.782 and an MCC of 0.374. Supervised Laplacian Eigenmaps showed an increase of 8.16% in AUC and 20.6% in MCC over the baseline that excluded textual data, and an increase of 2.69% in AUC and 5.35% in MCC over unsupervised Laplacian Eigenmaps.
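The abstract does not state whether these percentages are relative or absolute increases. Assuming they are relative, the comparison values they imply can be back-calculated as sketched below; the resulting figures are derived under that assumption and are not reported in the abstract.

```python
def implied_reference(value, relative_increase_pct):
    """Back out the reference score, ASSUMING value = reference * (1 + pct/100)."""
    return value / (1.0 + relative_increase_pct / 100.0)

# Reported scores for supervised Laplacian Eigenmaps.
AUC_SUPERVISED = 0.782
MCC_SUPERVISED = 0.374

# Implied scores under the relative-increase assumption (not reported values).
baseline_auc = implied_reference(AUC_SUPERVISED, 8.16)      # no-text baseline
baseline_mcc = implied_reference(MCC_SUPERVISED, 20.6)      # no-text baseline
unsupervised_auc = implied_reference(AUC_SUPERVISED, 2.69)  # unsupervised variant
unsupervised_mcc = implied_reference(MCC_SUPERVISED, 5.35)  # unsupervised variant

for name, v in [("baseline AUC", baseline_auc), ("baseline MCC", baseline_mcc),
                ("unsupervised AUC", unsupervised_auc), ("unsupervised MCC", unsupervised_mcc)]:
    print(f"{name}: {v:.3f}")
```

If the percentages were instead absolute differences, the implied values would differ, so the full text should be consulted for the actual figures.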

CONCLUSIONS

As a solution, we present a supervised Laplacian Eigenmap method to embed textual predictors into a low-dimensional Euclidean space. This method allows many existing machine learning predictors to effectively and efficiently capture the potential of textual predictors, especially those based on short texts.
