Rasmussen Luke V, Thompson Will K, Pacheco Jennifer A, Kho Abel N, Carrell David S, Pathak Jyotishman, Peissig Peggy L, Tromp Gerard, Denny Joshua C, Starren Justin B
Feinberg School of Medicine, Northwestern University, Chicago, IL, United States.
Feinberg School of Medicine, Northwestern University, Chicago, IL, United States; Center for Biomedical Research Informatics, NorthShore University HealthSystem, Evanston, IL, United States.
J Biomed Inform. 2014 Oct;51:280-6. doi: 10.1016/j.jbi.2014.06.007. Epub 2014 Jun 21.
Design patterns, in the context of software development and ontologies, provide generalized approaches and guidance for solving commonly occurring problems or addressing common situations, and are typically informed by intuition, heuristics and experience. While the biomedical literature contains broad coverage of specific phenotype algorithm implementations, no work to date has attempted to generalize common approaches into design patterns, which could then be distributed to the informatics community to support efficient development of more accurate phenotype algorithms.
Using phenotyping algorithms stored in the Phenotype KnowledgeBase (PheKB), we conducted an independent iterative review to identify recurrent elements within the algorithm definitions. We extracted these recurrent elements and generalized them into candidate patterns. We then assessed the candidate patterns for validity by group consensus and annotated them with attributes.
A total of 24 electronic Medical Records and Genomics (eMERGE) phenotypes available in PheKB as of January 25, 2013 were downloaded and reviewed. From these, 21 phenotyping patterns were identified, which are available as an online data supplement.
Repeatable patterns within phenotyping algorithms exist and, when codified and cataloged, may help to educate both experienced and novice algorithm developers. The dissemination and application of these patterns have the potential to decrease the time needed to develop algorithms while improving portability and accuracy.