Semantic preservation of standardized healthcare documents in big data.

Affiliations

Department of Computer Science and Engineering, Kyung Hee University, Seocheon-dong, Giheung-gu, South Korea; Department of Computer Science, National University of Computer and Emerging Science, Islamabad, Pakistan.

Department of Software, Sejong University, South Korea.

Publication information

Int J Med Inform. 2019 Sep;129:133-145. doi: 10.1016/j.ijmedinf.2019.05.024. Epub 2019 Jun 7.

DOI: 10.1016/j.ijmedinf.2019.05.024
PMID: 31445248
Abstract

BACKGROUND

Standardized healthcare documents have a high adoption rate in today's hospitals. This brings several challenges, as processing the documents at scale takes a toll on the infrastructure. The complexity of these documents compounds the difficulty of handling them, which is why big data techniques are necessary. However, big data techniques can cause accuracy or semantic loss in health documents when the documents are partitioned for processing. This semantic loss is critical for clinical use as well as for insurance and medical education.
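
To make the partitioning problem concrete, here is a minimal sketch, assuming a toy XML document and plain fixed-size character blocks. The document content, the `split_fixed` helper, and the block size of 64 are all hypothetical and chosen only for illustration; the sketch does not reproduce the authors' setup or real HDFS behavior. It shows that a document that parses as a whole can yield blocks that no longer parse on their own.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment: str) -> bool:
    """Return True if the fragment parses as a complete XML document."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


def split_fixed(text: str, block_size: int) -> list[str]:
    """Cut text into consecutive blocks of at most block_size characters."""
    return [text[i:i + block_size] for i in range(0, len(text), block_size)]


# Hypothetical, heavily simplified stand-in for a clinical document.
doc = (
    "<ClinicalDocument>"
    "<patient>Jane Doe</patient>"
    "<observation code='bp-sys' value='120' unit='mmHg'/>"
    "</ClinicalDocument>"
)

blocks = split_fixed(doc, 64)  # 64 is an arbitrary illustrative block size

whole_ok = is_well_formed(doc)                    # the intact document parses
blocks_ok = [is_well_formed(b) for b in blocks]   # the isolated blocks do not

print("whole document well-formed:", whole_ok)
print("blocks well-formed:", blocks_ok)
```

Here the cut lands inside the `observation` element, so neither block can be interpreted in isolation even though no characters were lost.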

METHODS

In this paper we propose a novel technique to avoid the semantic loss that occurs during conventional partitioning of healthcare documents in big data. The technique uses a constraint model based on conformance to the clinical document standard and on user-based use cases. We used clinical document architecture (CDAR) datasets on the Hadoop Distributed File System (HDFS) in a uniquely configured setup. After partitioning, we identified the documents affected by semantic loss and separated the documents into two sets: conflict-free documents and conflicted documents. Conflicted documents were resolved using different resolution strategies mapped to the CDAR specification. The first part of the technique focuses on identifying the type of conflict that arises in the blocks after partitioning. The second part focuses on mapping conflicts to resolutions based on the constraints applied, which depend on the validation and user scenario.
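
The two-part idea of identifying conflicts and then resolving them can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' constraint model: the abstract does not describe the conflict types or resolution strategies, so the sketch assumes a single kind of conflict (a block boundary falling strictly inside a document) and a single resolution (moving that boundary to the end of the document). `DocSpan`, `classify`, `realign`, and the sizes used are hypothetical names and values.

```python
from dataclasses import dataclass


@dataclass
class DocSpan:
    doc_id: str
    start: int  # inclusive offset in the concatenated stream
    end: int    # exclusive offset


def make_spans(sizes: dict[str, int]) -> list[DocSpan]:
    """Lay documents end to end and record each one's offset range."""
    spans, offset = [], 0
    for doc_id, size in sizes.items():
        spans.append(DocSpan(doc_id, offset, offset + size))
        offset += size
    return spans


def block_boundaries(total_size: int, block_size: int) -> list[int]:
    """Offsets at which fixed-size partitioning would cut the stream."""
    return list(range(block_size, total_size, block_size))


def classify(spans: list[DocSpan], boundaries: list[int]):
    """Split document ids into (conflict_free, conflicted)."""
    conflict_free, conflicted = [], []
    for span in spans:
        cut = any(span.start < b < span.end for b in boundaries)
        (conflicted if cut else conflict_free).append(span.doc_id)
    return conflict_free, conflicted


def realign(spans: list[DocSpan], boundaries: list[int]) -> list[int]:
    """Move every boundary that cuts a document to that document's end."""
    total = max(s.end for s in spans)
    aligned = set()
    for b in boundaries:
        owner = next((s for s in spans if s.start < b < s.end), None)
        aligned.add(owner.end if owner else b)
    aligned.discard(total)  # a boundary at the end of the stream splits nothing
    return sorted(aligned)


spans = make_spans({"a": 40, "b": 70, "c": 40, "d": 50})  # illustrative sizes
boundaries = block_boundaries(total_size=200, block_size=64)

free_before, conflicted_before = classify(spans, boundaries)
new_boundaries = realign(spans, boundaries)
free_after, conflicted_after = classify(spans, new_boundaries)

print("conflicted before:", conflicted_before)
print("realigned boundaries:", new_boundaries)
print("conflicted after:", conflicted_after)
```

Aligning boundaries to document ends trades uniform block sizes for intact documents; a real system would also need to weigh the validation and user constraints the abstract mentions.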

RESULTS

We used a publicly available dataset of CDAR documents, identified all conflicted documents, and resolved all of them successfully to avoid any semantic loss. In our experiment we tested up to 87,000 CDAR documents, successfully identified the conflicts, and resolved the semantic issues.

CONCLUSION

We have presented a novel study that focuses on the semantics of big data. The approach did not compromise performance and resolved the semantic issues that arose during the processing of clinical documents.

Similar articles

1. Semantic preservation of standardized healthcare documents in big data.
Int J Med Inform. 2019 Sep;129:133-145. doi: 10.1016/j.ijmedinf.2019.05.024. Epub 2019 Jun 7.
2. An Efficient Parallelized Ontology Network-Based Semantic Similarity Measure for Big Biomedical Document Clustering.
Comput Math Methods Med. 2021 Nov 9;2021:7937573. doi: 10.1155/2021/7937573. eCollection 2021.
3. A Semantic-Based Approach for Managing Healthcare Big Data: A Survey.
J Healthc Eng. 2020 Nov 23;2020:8865808. doi: 10.1155/2020/8865808. eCollection 2020.
4. Semantic validation of standard-based electronic health record documents with W3C XML schema.
Methods Inf Med. 2010;49(3):271-80. doi: 10.3414/ME09-02-0027. Epub 2010 Apr 20.
5. Exploiting the semantic graph for the representation and retrieval of medical documents.
Comput Biol Med. 2018 Oct 1;101:39-50. doi: 10.1016/j.compbiomed.2018.08.009. Epub 2018 Aug 7.
6. Handling Big Data Scalability in Biological Domain Using Parallel and Distributed Processing: A Case of Three Biological Semantic Similarity Measures.
Biomed Res Int. 2019 Jan 27;2019:6750296. doi: 10.1155/2019/6750296. eCollection 2019.
7. Composite CDE: modeling composite relationships between common data elements for representing complex clinical data.
BMC Med Inform Decis Mak. 2020 Jul 3;20(1):147. doi: 10.1186/s12911-020-01168-0.
8. Terminology Coverage from Semantic Annotated Health Documents.
Stud Health Technol Inform. 2018;255:20-24.
9. Semantic mapping to simplify deployment of HL7 v3 Clinical Document Architecture.
J Biomed Inform. 2012 Aug;45(4):697-702. doi: 10.1016/j.jbi.2012.02.006. Epub 2012 Mar 24.
10. tESA: a distributional measure for calculating semantic relatedness.
J Biomed Semantics. 2016 Dec 28;7(1):67. doi: 10.1186/s13326-016-0109-6.

Cited by

1. Systematic analysis of healthcare big data analytics for efficient care and disease diagnosing.
Sci Rep. 2022 Dec 26;12(1):22377. doi: 10.1038/s41598-022-26090-5.
2. The use of Big Data Analytics in healthcare.
J Big Data. 2022;9(1):3. doi: 10.1186/s40537-021-00553-4. Epub 2022 Jan 6.