Koike Asako, Takagi Toshihisa
Dept. of Computational Biology, Graduate School of Frontier Science, The University of Tokyo, Kiban-3A1(CB01) 5-1-5, Kashiwanoha Kashiwa, Chiba, 277-8561, Japan.
In Silico Biol. 2005;5(1):9-20.
As the volume of biomedical information grows exponentially, advanced information retrieval and information extraction, together with databases, are becoming increasingly important. PRIME is an integrated gene/protein informatics database based on natural language processing. It provides automatically extracted protein/family/gene/compound interaction information, covering both physical and genetic interactions, along with Gene Ontology-based functions and graphical pathway viewers. Gene/protein/family names and functional terms are recognized using dictionaries developed in our laboratory. Interaction and functional information is extracted using syntactic dependencies and various phrase patterns. The database contains about 920,000 (non-redundant) protein interactions and 360,000 annotated gene-function relationships for major eukaryotes. By combining sequence and text information, PRIME also supports pathway comparison between two organisms, simple pathway deduction based on interaction data from other organisms, and pathway filtering using tissue expression data. The database is accessible at http://prime.ontology.ims.u-tokyo.ac.jp:8081.
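The dictionary-based name recognition and phrase-pattern extraction mentioned in the abstract can be illustrated with a toy sketch. This is not PRIME's implementation: the dictionary entries, the verb list, the single "A verb B" pattern, and the function name `extract_interactions` are all assumptions made for illustration, and a plain regular expression stands in for the syntactic dependency analysis the abstract describes.

```python
import re

# Hypothetical toy dictionary mapping surface forms (lowercased) to a
# canonical gene symbol. The dictionaries described in the abstract are
# far larger and are not reproduced here.
NAME_DICT = {
    "raf1": "RAF1", "raf-1": "RAF1", "c-raf": "RAF1",
    "mek1": "MAP2K1", "map2k1": "MAP2K1",
    "erk2": "MAPK1", "mapk1": "MAPK1",
}

# Assumed, deliberately tiny list of interaction verbs.
INTERACTION_VERBS = ("phosphorylates", "activates", "binds", "inhibits")

# One simple phrase pattern: "<name> <verb> <name>".
PATTERN = re.compile(
    r"\b([\w-]+)\s+(" + "|".join(INTERACTION_VERBS) + r")\s+([\w-]+)\b",
    re.IGNORECASE,
)


def extract_interactions(sentence):
    """Return (agent, verb, target) triples whose names are in NAME_DICT."""
    found = []
    for agent, verb, target in PATTERN.findall(sentence):
        agent_id = NAME_DICT.get(agent.lower())
        target_id = NAME_DICT.get(target.lower())
        # Keep the triple only when both names are recognized.
        if agent_id and target_id:
            found.append((agent_id, verb.lower(), target_id))
    return found


if __name__ == "__main__":
    text = "Raf-1 phosphorylates MEK1, and MEK1 activates ERK2."
    for triple in extract_interactions(text):
        print(triple)
```

Normalizing synonyms such as "Raf-1" and "c-Raf" to one symbol is what lets extracted interactions be counted as non-redundant. A surface pattern like this one misses passive or nested constructions, which is where dependency-based extraction would be needed.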