Department of Diagnostic and Biological Sciences, University of Minnesota School of Dentistry, Minneapolis, MN 55455, USA.
BMC Musculoskelet Disord. 2012 Jul 3;13:119. doi: 10.1186/1471-2474-13-119.
Sjögren's syndrome is a tissue-specific autoimmune disease that affects exocrine tissues, especially salivary glands and lacrimal glands. Despite a large body of evidence gathered over the past 60 years, significant gaps still exist in our understanding of Sjögren's syndrome. The goal of this study was to develop a database that collects and organizes gene and protein expression data from the existing literature for comparative analysis with future gene expression and proteomic studies of Sjögren's syndrome.
To catalog existing knowledge in the field, we used text mining of over 7,700 PubMed abstracts to generate the Sjögren's Syndrome Knowledge Base (SSKB) of published gene/protein data, which yielded approximately 500 potential genes/proteins. The raw data were manually evaluated to remove duplicates and false positives and to assign gene names. The database was manually curated to 477 entries, including 377 potential functional genes, which were used for enrichment and pathway analysis with Gene Ontology and KEGG.
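The curation step described above (removing duplicates and false positives, then assigning gene names) can be illustrated with a minimal sketch. This is not the authors' pipeline: the `Mention` type, the `curate` function, the alias table, and the placeholder PubMed IDs are all hypothetical, and in the study this evaluation was done manually.

```python
from dataclasses import dataclass

# Hypothetical alias table mapping names as they might appear in abstracts
# to an official gene symbol. Illustrative only; not taken from SSKB.
ALIASES = {
    "baff": "TNFSF13B",
    "tnfsf13b": "TNFSF13B",
    "ro52": "TRIM21",
    "trim21": "TRIM21",
}


@dataclass(frozen=True)
class Mention:
    """One raw text-mining hit: the matched text and a placeholder PubMed ID."""
    text: str
    pmid: str


def curate(mentions, false_positives):
    """Collapse raw mentions into one entry per gene symbol.

    Returns a dict mapping each symbol to a sorted list of supporting PMIDs.
    Mentions flagged as false positives, or with no known alias, are skipped.
    """
    entries = {}
    for m in mentions:
        key = m.text.strip().lower()
        if key in false_positives:
            continue  # e.g. an abbreviation that is not a gene
        symbol = ALIASES.get(key)
        if symbol is None:
            continue  # unresolved name, left for manual review
        entries.setdefault(symbol, set()).add(m.pmid)
    return {symbol: sorted(pmids) for symbol, pmids in sorted(entries.items())}


raw = [
    Mention("BAFF", "0000001"),
    Mention("TNFSF13B", "0000002"),
    Mention("Ro52", "0000001"),
    Mention("SS", "0000003"),  # disease abbreviation, treated as a false positive
]
curated = curate(raw, false_positives={"ss"})
```

Keying the result on the resolved symbol is what merges synonymous mentions such as "BAFF" and "TNFSF13B" into a single entry while keeping every supporting abstract.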
The Sjögren's Syndrome Knowledge Base (http://sskb.umn.edu) can form the foundation for an informed search of existing knowledge in the field as new potential therapeutic targets are identified by conventional or high-throughput experimental techniques.