Mining real-world high dimensional structured data in medicine and its use in decision support. Some different perspectives on unknowns, interdependency, and distinguishability.

Affiliations

Ingine Inc, Ohio, USA; The Dirac Foundation, Oxfordshire, UK.

Ingine Inc, Ohio, USA.

Publication information

Comput Biol Med. 2022 Feb;141:105118. doi: 10.1016/j.compbiomed.2021.105118. Epub 2021 Dec 11.

DOI: 10.1016/j.compbiomed.2021.105118
PMID: 34971979
Abstract

There are many difficulties in extracting and using knowledge for medical analytic and predictive purposes from Real-World Data, even when the data is already well structured in the manner of a large spreadsheet. Preparative curation and standardization or "normalization" of such data involves a variety of chores but underlying them is an interrelated set of fundamental problems that can in part be dealt with automatically during the datamining and inference processes. These fundamental problems are reviewed here and illustrated and investigated with examples. They concern the treatment of unknowns, the need to avoid independency assumptions, and the appearance of entries that may not be fully distinguished from each other. Unknowns include errors detected as implausible (e.g., out of range) values that are subsequently converted to unknowns. These problems are further impacted by high dimensionality and problems of sparse data that inevitably arise from high-dimensional datamining even if the data is extensive. All these considerations are different aspects of incomplete information, though they also relate to problems that arise if care is not taken to avoid or ameliorate consequences of including the same information twice or more, or if misleading or inconsistent information is combined. This paper addresses these aspects from a slightly different perspective using the Q-UEL language and inference methods based on it by borrowing some ideas from the mathematics of quantum mechanics and information theory. It takes the view that detection and correction of probabilistic elements of knowledge subsequently used in inference need only involve testing and correction so that they satisfy certain extended notions of coherence between probabilities. This is by no means the only possible view, and it is explored here and later compared with a related notion of consistency.
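Two of the ideas in the abstract can be illustrated with a minimal sketch: converting implausible (out-of-range) values to unknowns, and testing whether a set of probabilities is mutually coherent. This is not Q-UEL and not the authors' method. The field names, the plausibility ranges, and the helper functions below are hypothetical, and the coherence test shown is only the ordinary product rule, P(A|B)P(B) = P(B|A)P(A), standing in for the "extended notions of coherence" that the paper itself develops.

```python
import math

# Hypothetical plausibility ranges for illustration; not taken from the paper.
PLAUSIBLE_RANGES = {
    "age_years": (0, 120),
    "systolic_bp_mmhg": (50, 300),
}


def to_unknown_if_implausible(field, value):
    """Return None (unknown) if the value is missing or outside its plausible range."""
    if value is None:
        return None
    low, high = PLAUSIBLE_RANGES[field]
    return value if low <= value <= high else None


def clean_record(record):
    """Apply the out-of-range check to every field of one spreadsheet-like row."""
    return {
        field: to_unknown_if_implausible(field, value)
        for field, value in record.items()
    }


def is_coherent(p_a, p_b, p_a_given_b, p_b_given_a, tol=1e-9):
    """Product-rule check: P(A|B) * P(B) and P(B|A) * P(A) must both equal P(A, B)."""
    return math.isclose(p_a_given_b * p_b, p_b_given_a * p_a, abs_tol=tol)


# An age of 430 is implausible, so it becomes an unknown rather than an error.
record = {"age_years": 430, "systolic_bp_mmhg": 128}
cleaned = clean_record(record)

# These four probabilities agree on the joint P(A, B) = 0.15.
coherent = is_coherent(p_a=0.2, p_b=0.5, p_a_given_b=0.3, p_b_given_a=0.75)

# Here the two routes to the joint give 0.15 and 0.18, so the set is incoherent.
incoherent = is_coherent(p_a=0.2, p_b=0.5, p_a_given_b=0.3, p_b_given_a=0.9)
```

Representing unknowns as `None` keeps them distinct from legitimate zero values, so later counting steps can skip them rather than treat them as data.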

Similar articles

1
Mining real-world high dimensional structured data in medicine and its use in decision support. Some different perspectives on unknowns, interdependency, and distinguishability.
Comput Biol Med. 2022 Feb;141:105118. doi: 10.1016/j.compbiomed.2021.105118. Epub 2021 Dec 11.
2
Bidirectional General Graphs for inference. Principles and implications for medicine.
Comput Biol Med. 2019 May;108:382-399. doi: 10.1016/j.compbiomed.2019.04.005. Epub 2019 Apr 13.
3
Extension of the Quantum Universal Exchange Language to precision medicine and drug lead discovery. Preliminary example studies using the mitochondrial genome.
Comput Biol Med. 2020 Feb;117:103621. doi: 10.1016/j.compbiomed.2020.103621. Epub 2020 Jan 20.
4
Implementation of a web based universal exchange and inference language for medicine: Sparse data, probabilities and inference in data mining of clinical data repositories.
Comput Biol Med. 2015 Nov 1;66:82-102. doi: 10.1016/j.compbiomed.2015.07.015. Epub 2015 Jul 28.
5
Data-mining to build a knowledge representation store for clinical decision support. Studies on curation and validation based on machine performance in multiple choice medical licensing examinations.
Comput Biol Med. 2016 Jun 1;73:71-93. doi: 10.1016/j.compbiomed.2016.02.010. Epub 2016 Feb 26.
6
POPPER, a simple programming language for probabilistic semantic inference in medicine.
Comput Biol Med. 2015 Jan;56:107-23. doi: 10.1016/j.compbiomed.2014.10.011. Epub 2014 Nov 1.
7
Principles of Quantum Mechanics for Artificial Intelligence in medicine. Discussion with reference to the Quantum Universal Exchange Language (Q-UEL).
Comput Biol Med. 2022 Apr;143:105323. doi: 10.1016/j.compbiomed.2022.105323. Epub 2022 Feb 16.
8
Hyperbolic Dirac Nets for medical decision support. Theory, methods, and comparison with Bayes Nets.
Comput Biol Med. 2014 Aug;51:183-97. doi: 10.1016/j.compbiomed.2014.03.014. Epub 2014 Apr 8.
9
The new physician as unwitting quantum mechanic: is adapting Dirac's inference system best practice for personalized medicine, genomics, and proteomics?
J Proteome Res. 2007 Aug;6(8):3114-26. doi: 10.1021/pr070098h. Epub 2007 Jul 3.
10
Studies in the extensively automatic construction of large odds-based inference networks from structured data. Examples from medical, bioinformatics, and health insurance claims data.
Comput Biol Med. 2018 Apr 1;95:147-166. doi: 10.1016/j.compbiomed.2018.02.013. Epub 2018 Mar 21.

Cited by

1
Towards faster response against emerging epidemics and prediction of variants of concern.
Inform Med Unlocked. 2022;31:100966. doi: 10.1016/j.imu.2022.100966. Epub 2022 May 20.