Intelligent Agent Systems Lab, Institute of Information Science, Academia Sinica, Taipei.
Database (Oxford). 2013 Feb 12;2013:bas061. doi: 10.1093/database/bas061. Print 2013.
Researchers are finding it increasingly difficult to follow the changing status of disease candidate genes because of the exponential increase in gene mapping studies. The Text-mined Hypertension, Obesity and Diabetes candidate gene database (T-HOD) was developed to help trace existing research on three kinds of cardiovascular diseases: hypertension, obesity and diabetes, with diabetes categorized into Type 1 and Type 2. It does so by regularly and semiautomatically extracting HOD-related genes from newly published literature. Currently, there are 837, 835 and 821 candidate genes recorded in T-HOD for hypertension, obesity and diabetes, respectively. T-HOD employs state-of-the-art text-mining technologies, including a gene/disease identification system and a disease-gene relation extraction system, which can be used to affirm the association of genes with the three diseases and provide more evidence for further studies. The primary input to T-HOD is one of the three diseases, and the output is a list of disease-related genes that can be ranked by their number of appearances in the literature, protein-protein interactions and single-nucleotide polymorphisms. Unlike manually constructed disease gene databases, the content of T-HOD is regularly updated by our text-mining system and verified by domain experts. The interface of T-HOD facilitates easy browsing for users and allows T-HOD curators to verify data efficiently. We believe that T-HOD can help life scientists search for more disease candidate genes with less time and effort. Database URL: http://bws.iis.sinica.edu.tw/THOD.
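As a rough illustration of the ranking step described above, the following minimal sketch sorts a gene list by one of the three criteria named in the abstract. It is not T-HOD's actual implementation: the `CandidateGene` record, its field names, the `rank_genes` helper, and the gene symbols and counts are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateGene:
    """Hypothetical record for one disease candidate gene (not T-HOD's schema)."""
    symbol: str      # placeholder gene symbol
    mentions: int    # number of appearances in the mined literature
    ppi_count: int   # number of protein-protein interactions
    snp_count: int   # number of associated single-nucleotide polymorphisms


RANKING_KEYS = ("mentions", "ppi_count", "snp_count")


def rank_genes(genes, key="mentions"):
    """Return the genes sorted in descending order of the chosen criterion."""
    if key not in RANKING_KEYS:
        raise ValueError(f"unknown ranking key: {key}")
    # sorted() is stable, so genes with equal counts keep their input order.
    return sorted(genes, key=lambda g: getattr(g, key), reverse=True)


# Made-up example data; these counts do not come from T-HOD.
genes = [
    CandidateGene("GENE_A", mentions=12, ppi_count=3, snp_count=7),
    CandidateGene("GENE_B", mentions=30, ppi_count=1, snp_count=2),
    CandidateGene("GENE_C", mentions=5, ppi_count=9, snp_count=4),
]

by_mentions = [g.symbol for g in rank_genes(genes, "mentions")]
by_ppi = [g.symbol for g in rank_genes(genes, "ppi_count")]
by_snp = [g.symbol for g in rank_genes(genes, "snp_count")]
```

Each criterion is applied on its own here; whether T-HOD combines them into a single score is not stated in the abstract, so no combined ranking is assumed.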