EliXR: an approach to eligibility criteria extraction and representation.

Affiliation

Department of Biomedical Informatics, Columbia University, New York, New York 10032, USA.

Publication information

J Am Med Inform Assoc. 2011 Dec;18 Suppl 1(Suppl 1):i116-24. doi: 10.1136/amiajnl-2011-000321. Epub 2011 Jul 31.

DOI:10.1136/amiajnl-2011-000321

PMID:21807647

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3241167/

Abstract

OBJECTIVE

To develop a semantic representation for clinical research eligibility criteria to automate semistructured information extraction from eligibility criteria text.

MATERIALS AND METHODS

An analysis pipeline called eligibility criteria extraction and representation (EliXR) was developed that integrates syntactic parsing and tree pattern mining to discover common semantic patterns in 1000 eligibility criteria randomly selected from http://ClinicalTrials.gov. The semantic patterns were aggregated and enriched with Unified Medical Language System (UMLS) semantic knowledge to form a semantic representation for clinical research eligibility criteria.

RESULTS

The authors arrived at 175 semantic patterns, which form 12 semantic role labels connected by their frequent semantic relations in a semantic network.

EVALUATION

Three raters independently annotated all the sentence segments (N=396) for 79 test eligibility criteria using the 12 top-level semantic role labels. Eighty-six per cent (339) of the sentence segments were unanimously labelled correctly and 13.8% (55) were correctly labelled by two raters. Fleiss' κ was 0.88, indicating nearly perfect interrater agreement.
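For readers unfamiliar with the agreement statistic reported above, the following is a minimal sketch of how Fleiss' κ is computed for a fixed number of raters per item. The function name `fleiss_kappa`, the placeholder labels "A" and "B", and the toy ratings are all hypothetical illustrations; they are not the study's data or its 12 semantic role labels, and the sketch does not reproduce the reported value of 0.88.

```python
from collections import Counter


def fleiss_kappa(ratings, categories):
    """Fleiss' kappa for items that were each labelled by the same number of raters.

    ratings    -- list of items; each item is a list of labels, one per rater
    categories -- list of all possible labels
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])

    # counts[i][c] = number of raters who assigned category c to item i
    counts = [Counter(item) for item in ratings]

    # Observed agreement: mean over items of the per-item agreement P_i
    p_items = [
        (sum(c[cat] ** 2 for cat in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_observed = sum(p_items) / n_items

    # Chance agreement: sum of squared overall category proportions
    p_cats = [
        sum(c[cat] for c in counts) / (n_items * n_raters) for cat in categories
    ]
    p_expected = sum(p ** 2 for p in p_cats)

    # Undefined (division by zero) if every rating falls in a single category
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical toy data: 4 segments, 3 raters, placeholder labels "A" and "B"
toy_ratings = [
    ["A", "A", "A"],
    ["A", "A", "B"],
    ["B", "B", "B"],
    ["A", "B", "B"],
]
toy_kappa = fleiss_kappa(toy_ratings, ["A", "B"])
```

In this toy case two of the four segments are labelled unanimously and the two labels are used equally often, so observed agreement is 2/3, chance agreement is 1/2, and κ comes out at 1/3.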

CONCLUSION

This study presents a semi-automated data-driven approach to developing a semantic network that aligns well with the top-level information structure in clinical research eligibility criteria text and demonstrates the feasibility of using the resulting semantic role labels to generate semistructured eligibility criteria with nearly perfect interrater reliability.
