Sabir Tehseen F, Hauser Susan E, Thoma George R
National Library of Medicine, NIH, DHHS, Bethesda, MD, USA.
AMIA Annu Symp Proc. 2006;2006:1082.
High optical character recognition (OCR) error rates in author affiliations increase the manual labor needed to verify MEDLINE citations that are automatically created from scanned journal articles. These errors stem from poor recognition of the small text and italics frequently used in printed affiliations. Using author-affiliation relationships found in existing MEDLINE records, the SeekAffiliation (SA) program automatically finds potentially correct and complete affiliations, thereby reducing manual effort and making citation creation more efficient.
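The abstract does not describe how SA matches affiliations, so the following is only a minimal sketch of the general idea, under stated assumptions: known author-affiliation pairs are held in a small in-memory dictionary (a hypothetical stand-in for existing MEDLINE records), and the noisy OCR text is compared against each known affiliation with a generic string-similarity ratio from Python's standard `difflib`. The function names, the threshold, and the sample data are illustrative and are not taken from the SA program.

```python
from difflib import SequenceMatcher

# Hypothetical stand-in for author-affiliation pairs from existing records.
# The second entry is fictional and exists only to give the lookup a non-match.
KNOWN_AFFILIATIONS = {
    "Thoma GR": ["National Library of Medicine, NIH, DHHS, Bethesda, MD, USA."],
    "Doe J": ["Example University, Department of Examples, Springfield, USA."],
}


def similarity(a, b):
    """Case-insensitive similarity ratio in [0, 1] between two strings."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def seek_affiliation(authors, ocr_text, known=KNOWN_AFFILIATIONS, threshold=0.6):
    """Return known affiliations of the given authors that resemble the
    OCR text, best match first. The threshold is an assumed cutoff."""
    best = {}
    for author in authors:
        for affiliation in known.get(author, []):
            score = similarity(ocr_text, affiliation)
            if score >= threshold:
                # Keep the highest score seen for each distinct affiliation.
                best[affiliation] = max(score, best.get(affiliation, 0.0))
    return sorted(best, key=best.get, reverse=True)


# Simulated OCR output with typical character confusions (i/l, y/v).
noisy = "Natlonal Librarv of Medlcine, NlH, DHHS, Bethesda, MD"
candidates = seek_affiliation(["Thoma GR", "Doe J"], noisy)
print(candidates)
```

In this sketch the candidates are only suggestions: an operator would still confirm the proposed affiliation, which matches the abstract's description of the output as "potentially correct and complete".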