HERALD: A domain-specific query language for longitudinal health data analytics.

Affiliations

Berlin Institute of Health at Charité - Universitätsmedizin Berlin, Center of Health Data Science, Berlin, Germany.

Publication information

Int J Med Inform. 2024 Dec;192:105646. doi: 10.1016/j.ijmedinf.2024.105646. Epub 2024 Oct 5.

DOI: 10.1016/j.ijmedinf.2024.105646
PMID: 39393126
Abstract

BACKGROUND

Large-scale health data has significant potential for research and innovation, especially with longitudinal data offering insights into prevention, disease progression, and treatment effects. Yet, analyzing this data type is complex, as data points are repeatedly documented along the timeline. As a consequence, extracting cross-sectional tabular data suitable for statistical analysis and machine learning can be challenging for medical researchers and data scientists alike, with existing tools lacking balance between ease-of-use and comprehensiveness.

OBJECTIVE

This paper introduces HERALD, a novel domain-specific query language designed to support the transformation of longitudinal health data into cross-sectional tables. We describe the basic concepts, the query syntax, a graphical user interface for constructing and executing HERALD queries, as well as an integration into Informatics for Integrating Biology and the Bedside (i2b2).

METHODS

The syntax of HERALD mimics natural language and supports different query types for selection, aggregation, analysis of relationships, and searching for data points based on filter expressions and temporal constraints. Using a hierarchical concept model, queries are executed individually for the data of each patient, while constructing tabular output. HERALD is closed, meaning that queries process data points and generate data points. Queries can refer to data points that have been produced by previous queries, providing a simple, but powerful nesting mechanism.
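
The ideas in this paragraph (per-patient execution, closure over data points, nesting, and tabular output) can be illustrated with a small Python sketch. This is not HERALD syntax and not the authors' implementation: `DataPoint`, `select`, `aggregate_max`, and `build_table` are hypothetical names, the sample values are made up, and matching on a concept-path prefix is only an assumed stand-in for the hierarchical concept model.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class DataPoint:
    patient_id: str
    concept: str   # hypothetical hierarchical path, e.g. "lab/creatinine"
    when: date
    value: float


# A query maps data points to data points, so queries can be nested ("closed").
Query = Callable[[List[DataPoint]], List[DataPoint]]


def select(concept_prefix: str,
           start: Optional[date] = None,
           end: Optional[date] = None) -> Query:
    """Selection with an optional temporal constraint (inclusive bounds)."""
    def run(points: List[DataPoint]) -> List[DataPoint]:
        return [
            p for p in points
            if p.concept.startswith(concept_prefix)
            and (start is None or p.when >= start)
            and (end is None or p.when <= end)
        ]
    return run


def aggregate_max(label: str, inner: Query) -> Query:
    """Aggregation over the result of another query; emits one new data point."""
    def run(points: List[DataPoint]) -> List[DataPoint]:
        hits = inner(points)
        if not hits:
            return []
        top = max(hits, key=lambda p: p.value)
        return [DataPoint(top.patient_id, label, top.when, top.value)]
    return run


def build_table(points: List[DataPoint],
                columns: Dict[str, Query]) -> Dict[str, Dict[str, Optional[float]]]:
    """Run every column query separately per patient: one row per patient."""
    by_patient: Dict[str, List[DataPoint]] = {}
    for p in points:
        by_patient.setdefault(p.patient_id, []).append(p)
    table: Dict[str, Dict[str, Optional[float]]] = {}
    for pid, patient_points in by_patient.items():
        row: Dict[str, Optional[float]] = {}
        for name, query in columns.items():
            result = query(patient_points)
            row[name] = result[0].value if result else None
        table[pid] = row
    return table


# Made-up longitudinal records for two patients.
points = [
    DataPoint("p1", "lab/creatinine", date(2023, 1, 5), 1.1),
    DataPoint("p1", "lab/creatinine", date(2023, 6, 1), 1.8),
    DataPoint("p1", "lab/creatinine", date(2024, 2, 1), 2.4),
    DataPoint("p2", "lab/creatinine", date(2023, 3, 10), 0.9),
    DataPoint("p2", "vitals/weight", date(2023, 3, 10), 70.0),
]

columns = {
    "max_creatinine_2023": aggregate_max(
        "max_creatinine_2023",
        select("lab/creatinine", date(2023, 1, 1), date(2023, 12, 31)),
    ),
    "max_weight": aggregate_max("max_weight", select("vitals/weight")),
}

table = build_table(points, columns)
```

In this sketch each patient contributes exactly one row, and a column is left empty (`None`) when the query yields no data point for that patient. Because `aggregate_max` both consumes and produces `DataPoint` objects, its output could in turn feed a further query, which is the nesting property the paragraph describes.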

RESULTS

The open-source implementation consists of a HERALD query parser, an execution engine, as well as a web-based user interface for query construction and statistical analysis. The implementation can be deployed as a standalone component and integrated into self-service data analytics environments like i2b2 as a plugin. HERALD can be a valuable tool for data scientists and machine learning experts, as it simplifies the process of transforming longitudinal health data into tables and data matrices.

CONCLUSION

The construction of cross-sectional tables from longitudinal data can be supported through dedicated query languages that strike a reasonable balance between language complexity and transformation capabilities.
