MaPLE: A MapReduce Pipeline for Lattice-based Evaluation and Its Application to SNOMED CT.

Author information

Zhang Guo-Qiang, Zhu Wei, Sun Mengmeng, Tao Shiqiang, Bodenreider Olivier, Cui Licong

Affiliations

Department of Electrical Engineering and Computer Science, Case Western Reserve University, Cleveland, OH 44106.

National Library of Medicine, Bethesda, MD 20892, USA.

Publication information

Proc IEEE Int Conf Big Data. 2014 Oct;2014:754-759. doi: 10.1109/BigData.2014.7004301.

Abstract

Non-lattice fragments are often indicative of structural anomalies in ontological systems and, as such, represent possible areas of focus for subsequent quality assurance work. However, extracting the non-lattice fragments in large ontological systems is computationally expensive if not prohibitive, using a traditional sequential approach. In this paper we present a general MapReduce pipeline, called MaPLE (MapReduce Pipeline for Lattice-based Evaluation), for extracting non-lattice fragments in large partially ordered sets and demonstrate its applicability in ontology quality assurance. Using MaPLE in a 30-node Hadoop local cloud, we systematically extracted non-lattice fragments in 8 SNOMED CT versions from 2009 to 2014 (each containing over 300k concepts), with an average total computing time of less than 3 hours per version. With dramatically reduced time, MaPLE makes it feasible not only to perform exhaustive structural analysis of large ontological hierarchies, but also to systematically track structural changes between versions. Our change analysis showed that the average change rates on the non-lattice pairs are up to 38.6 times higher than the change rates of the background structure (concept nodes). This demonstrates that fragments around non-lattice pairs exhibit significantly higher rates of change in the process of ontological evolution.
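The abstract does not define a non-lattice pair or describe MaPLE's map and reduce stages, so the following is only a minimal sequential sketch under a common assumption: two incomparable concepts form a non-lattice pair when they share more than one maximal common descendant. The function names, the `children` adjacency format, and the toy hierarchy are hypothetical illustrations, not the authors' implementation.

```python
from itertools import combinations


def descendants(children, node, cache):
    """Return the strict descendants of `node` in an acyclic is-a hierarchy."""
    if node not in cache:
        found = set()
        for child in children.get(node, ()):
            found.add(child)
            found |= descendants(children, child, cache)
        cache[node] = found
    return cache[node]


def non_lattice_pairs(children):
    """Map each non-lattice pair to its set of maximal common descendants.

    Assumed definition (not taken from the abstract): a pair of incomparable
    nodes is non-lattice if it has more than one maximal common descendant.
    """
    nodes = set(children) | {c for kids in children.values() for c in kids}
    cache = {}
    desc = {n: descendants(children, n, cache) for n in nodes}

    result = {}
    for a, b in combinations(sorted(nodes), 2):
        # Comparable pairs always have a unique meet (the lower node itself).
        if a in desc[b] or b in desc[a]:
            continue
        common = desc[a] & desc[b]
        # A common descendant is maximal if no other common descendant
        # sits strictly above it.
        maximal = {x for x in common if not any(x in desc[y] for y in common)}
        if len(maximal) > 1:
            result[(a, b)] = maximal
    return result


# Hypothetical toy hierarchy: A and B both have children C and D,
# and C and D both have child E.
toy = {"A": ["C", "D"], "B": ["C", "D"], "C": ["E"], "D": ["E"]}
print({pair: sorted(m) for pair, m in non_lattice_pairs(toy).items()})
# prints {('A', 'B'): ['C', 'D']}
```

The pairwise loop is quadratic in the number of concepts, which illustrates why a sequential scan becomes costly on hierarchies with over 300k concepts. Because each pair is checked independently, the work could in principle be partitioned across workers, although how MaPLE actually partitions it is not stated here.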
