Miller Trudi, Leroy Gondy, Wood Elizabeth
School of Information Systems & Technology, Claremont Graduate University, Claremont, California, USA.
AMIA Annu Symp Proc. 2006;2006:559-63.
Consumers increasingly look to the Internet for health information, but the available resources are too difficult for most of them to understand. Interactive tables of contents (TOCs) can help consumers access health information by providing an easy-to-understand structure. Using natural language processing and the Unified Medical Language System (UMLS), we automatically generated TOCs for consumer health information. The TOC entries are categorized under consumer-friendly labels for the UMLS semantic types and semantic groups. Categorizing phrases by semantic type was significantly more correct and relevant than categorizing them by semantic group. Correctness and relevance were greater for documents that are difficult to read than for documents at an easier reading level. Pruning the TOCs to the categories that consumers favor further increases relevance and correctness while reducing structural complexity.
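The categorization and pruning steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `SEMANTIC_TYPE` dictionary is a toy stand-in for a real UMLS lookup, the consumer-friendly labels in `FRIENDLY_LABEL` are hypothetical, and `build_toc` is a name chosen only for this example. Phrase extraction by natural language processing is assumed to have already happened.

```python
from collections import defaultdict

# Toy stand-in for a UMLS lookup (assumption): phrase -> UMLS semantic type.
SEMANTIC_TYPE = {
    "diabetes": "Disease or Syndrome",
    "insulin": "Pharmacologic Substance",
    "fatigue": "Sign or Symptom",
    "pancreas": "Body Part, Organ, or Organ Component",
}

# Hypothetical consumer-friendly labels for those semantic types.
FRIENDLY_LABEL = {
    "Disease or Syndrome": "Diseases",
    "Pharmacologic Substance": "Medicines",
    "Sign or Symptom": "Symptoms",
    "Body Part, Organ, or Organ Component": "Body parts",
}


def build_toc(phrases, favored=None):
    """Group phrases under consumer-friendly category labels.

    phrases: phrases already extracted from a document.
    favored: optional set of labels to keep; other categories are pruned.
    """
    toc = defaultdict(list)
    for phrase in phrases:
        sem_type = SEMANTIC_TYPE.get(phrase.lower())
        if sem_type is None:
            continue  # phrase has no entry in the toy lookup, so skip it
        label = FRIENDLY_LABEL[sem_type]
        if favored is not None and label not in favored:
            continue  # prune categories outside the favored set
        if phrase not in toc[label]:
            toc[label].append(phrase)
    return dict(toc)


phrases = ["diabetes", "insulin", "fatigue", "pancreas", "walking"]
full_toc = build_toc(phrases)
pruned_toc = build_toc(phrases, favored={"Diseases", "Medicines"})
```

Here `full_toc` holds one heading per semantic type found in the document, while `pruned_toc` keeps only the favored headings, which is the sense in which pruning reduces structural complexity.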