Combining text classification and Hidden Markov Modeling techniques for categorizing sentences in randomized clinical trial abstracts.

Authors

Xu Rong, Supekar Kaustubh, Huang Yang, Das Amar, Garber Alan

Affiliation

Biomedical Informatics Training Program, Stanford Medical Informatics, Stanford University School of Medicine, Stanford University, Stanford, CA, USA.

Publication

AMIA Annu Symp Proc. 2006;2006:824-8.

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC1839538/

Abstract

Randomized clinical trial (RCT) papers provide reliable information about the efficacy of medical interventions. Current keyword-based search methods for retrieving medical evidence overload users with irrelevant information, as these methods often do not take into consideration the semantics encoded within abstracts and the search query. Personalized semantic search, intelligent clinical question answering, and medical evidence summarization aim to solve this information overload problem. Most of these approaches would benefit significantly if the information available in abstracts were structured into meaningful categories (e.g., background, objective, method, result, and conclusion). While many journals use a structured abstract format, the majority of RCT abstracts remain unstructured. We have developed a novel automated approach to structure RCT abstracts by combining text classification and Hidden Markov Modeling (HMM) techniques. The results of our approach (precision: 0.98, recall: 0.99) significantly outperform previously reported work on automated categorization of sentences in RCT abstracts.
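The abstract does not spell out how the two techniques are combined, so the following is only a minimal sketch of one common way to do it, not the authors' implementation. It assumes a sentence-level text classifier has already produced a probability for each of the five categories, and it treats those probabilities as emission scores in an HMM whose states are the categories. The transition weights, start probabilities, per-sentence scores, and the names `make_transitions` and `viterbi` are all hypothetical values and helpers chosen for illustration.

```python
import math

LABELS = ["background", "objective", "method", "result", "conclusion"]


def make_transitions(stay=0.6, step=0.3, skip=0.05, back=0.01):
    """Build a hypothetical transition matrix that favors reading order.

    Each row is normalized to sum to 1. The weights are illustrative
    assumptions, not values estimated from any corpus.
    """
    n = len(LABELS)
    rows = []
    for i in range(n):
        row = []
        for j in range(n):
            if j == i:
                row.append(stay)       # next sentence keeps the same category
            elif j == i + 1:
                row.append(step)       # advance to the next category
            elif j > i + 1:
                row.append(skip)       # jump ahead over a category
            else:
                row.append(back)       # moving backwards is unlikely
        total = sum(row)
        rows.append([w / total for w in row])
    return rows


def viterbi(emissions, trans, start):
    """Return the most probable label sequence (log-space Viterbi).

    emissions[t][s] is the classifier's score for label s on sentence t,
    used here as a stand-in for an HMM emission probability (an assumption).
    """
    n = len(start)

    def log(p):
        return math.log(p) if p > 0 else float("-inf")

    score = [log(start[s]) + log(emissions[0][s]) for s in range(n)]
    back = []
    for obs in emissions[1:]:
        prev, pointers, score = score, [], []
        for s in range(n):
            best = max(range(n), key=lambda r: prev[r] + log(trans[r][s]))
            pointers.append(best)
            score.append(prev[best] + log(trans[best][s]) + log(obs[s]))
        back.append(pointers)

    # Trace the best path backwards from the best final state.
    state = max(range(n), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return [LABELS[s] for s in path]


# Hypothetical classifier output for a five-sentence abstract.
# Sentence 3 is ambiguous: the classifier slightly prefers "background".
sentence_probs = [
    [0.70, 0.20, 0.05, 0.03, 0.02],
    [0.20, 0.60, 0.10, 0.05, 0.05],
    [0.45, 0.05, 0.40, 0.05, 0.05],
    [0.05, 0.05, 0.20, 0.60, 0.10],
    [0.05, 0.05, 0.05, 0.25, 0.60],
]
start_probs = [0.60, 0.30, 0.05, 0.03, 0.02]

independent = [LABELS[max(range(5), key=lambda s: p[s])] for p in sentence_probs]
decoded = viterbi(sentence_probs, make_transitions(), start_probs)
print(independent)
print(decoded)
```

In this toy input, labeling each sentence independently assigns the third sentence to "background", while the sequence model relabels it as "method" because a return to background after the objective is heavily penalized by the transition weights. This illustrates why sentence order can help, under the stated assumptions; it says nothing about the precision and recall reported in the abstract.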
