Feasibility of feature-based indexing, clustering, and search of clinical trials. A case study of breast cancer trials from ClinicalTrials.gov.

Authors

Boland M R, Miotto R, Gao J, Weng C

Affiliation

Chunhua Weng, PhD, Florence Irving Assistant Professor, Department of Biomedical Informatics, Columbia University, 622 W 168th Street, VC-5, New York, NY 10032, USA

Publication information

Methods Inf Med. 2013;52(5):382-94. doi: 10.3414/ME12-01-0092. Epub 2013 May 13.

DOI:10.3414/ME12-01-0092

PMID:23666475

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC3796134/

Abstract

BACKGROUND

When standard therapies fail, clinical trials provide experimental treatment opportunities for patients with drug-resistant illnesses or terminal diseases. Clinical trials can also provide free treatment and education for individuals who otherwise may not have access to such care. To find relevant clinical trials, patients often search online; however, they frequently encounter a significant barrier due to the large number of trials and ineffective indexing methods for reducing the trial search space.

OBJECTIVES

This study explores the feasibility of feature-based indexing, clustering, and search of clinical trials and informs designs to automate these processes.

METHODS

We decomposed 80 randomly selected stage III breast cancer clinical trials into a vector of eligibility features, which were organized into a hierarchy. We clustered trials based on their eligibility feature similarities. In a simulated search process, manually selected features were used to generate specific eligibility questions to filter trials iteratively.
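The pipeline described above (feature vectors, similarity between trials, and question-driven filtering) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the trial names, the feature labels, the use of Jaccard similarity, and the rule that every listed feature is an inclusion requirement are hypothetical and are not taken from the study, which does not specify its similarity measure or filtering logic in this abstract.

```python
# Hypothetical toy data: each trial maps to the set of eligibility
# features it mentions. Names and features are invented for illustration.
trials = {
    "trial_A": {"female", "age>=18", "her2_positive", "no_prior_chemo"},
    "trial_B": {"female", "age>=18", "her2_positive"},
    "trial_C": {"age>=18", "er_negative", "prior_radiation"},
}

# Shared vocabulary: the sorted union of all features across trials.
vocabulary = sorted(set().union(*trials.values()))


def to_vector(features, vocab):
    """Encode a trial as a binary vector over the shared vocabulary."""
    return [1 if f in features else 0 for f in vocab]


def jaccard(a, b):
    """Jaccard similarity of two feature sets (an assumed measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def filter_trials(candidates, feature, patient_has_feature):
    """Apply one eligibility answer to the candidate trials.

    Simplifying assumption: every listed feature is an inclusion
    requirement, so a "no" answer removes trials that list the feature,
    while a "yes" answer removes nothing.
    """
    if patient_has_feature:
        return dict(candidates)
    return {name: feats for name, feats in candidates.items()
            if feature not in feats}


vector_a = to_vector(trials["trial_A"], vocabulary)
sim_ab = jaccard(trials["trial_A"], trials["trial_B"])
sim_bc = jaccard(trials["trial_B"], trials["trial_C"])

# Simulated search: the patient answers two questions in turn.
remaining = filter_trials(trials, "her2_positive", patient_has_feature=False)
remaining = filter_trials(remaining, "prior_radiation", patient_has_feature=True)
```

In this toy setting, trials A and B share three of four features and so score much higher than B and C, which is the kind of signal a clustering step would group on. A real system would also need to distinguish inclusion from exclusion criteria and handle numeric thresholds, which this sketch deliberately ignores.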

RESULTS

We extracted 1,437 distinct eligibility features and achieved an inter-rater agreement of 0.73 for feature extraction on the 37 frequent features that occurred in more than 20 trials. Using all 1,437 features, we stratified the 80 trials into clusters of trials recruiting similar patients: six clusters by patient-characteristic features, five clusters by disease-characteristic features, and two clusters by mixed features. Most of the features were mapped to one or more Unified Medical Language System (UMLS) concepts, demonstrating the utility of named entity recognition prior to mapping with the UMLS for automatic feature extraction.

CONCLUSIONS

It is feasible to develop feature-based indexing and clustering methods for clinical trials to identify trials with similar target populations and to improve trial search efficiency.
