Prasanna M D, Vondrasek Jiri, Wlodawer Alexander, Rodriguez H, Bhat T N
Biochemical Science Division (831), NIST, Gaithersburg, Maryland 20899-8314, USA.
Proteins. 2006 Jun 1;63(4):907-17. doi: 10.1002/prot.20914.
A novel technique to annotate, query, and analyze chemical compounds has been developed and is illustrated using inhibitor data on HIV protease-inhibitor complexes. In this method, all chemical compounds are annotated in terms of standard chemical structural fragments. These standard fragments are defined using criteria such as chemical classification; structural, chemical, or functional groups; and commercial, scientific, or common names or synonyms. The fragments are then organized into a data tree based on their chemical substructures. Search engines have been developed that use this data tree to enable queries on inhibitors of HIV protease (http://xpdb.nist.gov/hivsdb/hivsdb.html). These search engines use a new technique, Chemical Block Layered Alignment of Substructure Technique (Chem-BLAST), which searches on the fragments of an inhibitor to find its chemical structural neighbors. This technique for annotating and querying compounds lays the foundation for applying the Semantic Web concept to chemical compounds, allowing end users to group, sort, and search structural neighbors accurately and efficiently. During annotation, it attaches "meaning" (i.e., semantics) to data by creating a knowledge base (or ontology) associated with compounds, going well beyond the current practice of associating "metadata" with data. The intended users of the technique are the research community and the pharmaceutical industry, for which it will provide a new tool to better identify novel chemical structural neighbors to aid drug discovery.
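The annotation and search idea described above can be illustrated with a minimal Python sketch. It is not the Chem-BLAST implementation, whose details the abstract does not give. The fragment names, the tree layout, the compound identifiers, and the scoring rule (counting shared fragments after each compound's fragments are expanded with their ancestors in the tree) are all assumptions made for illustration.

```python
# Hypothetical fragment tree: child fragment -> parent fragment (None marks a root).
# These fragment names are illustrative, not taken from the source.
FRAGMENT_PARENT = {
    "ring": None,
    "aromatic_ring": "ring",
    "phenyl": "aromatic_ring",
    "pyridyl": "aromatic_ring",
    "saturated_ring": "ring",
    "cyclohexyl": "saturated_ring",
}

# Hypothetical annotations: compound -> most specific fragments it contains.
COMPOUND_FRAGMENTS = {
    "cmpd_A": {"phenyl", "cyclohexyl"},
    "cmpd_B": {"pyridyl", "cyclohexyl"},
    "cmpd_C": {"phenyl"},
    "cmpd_D": {"pyridyl"},
}


def with_ancestors(fragments):
    """Return the fragments together with every ancestor in the tree."""
    expanded = set()
    for frag in fragments:
        # Walk up the tree until the root's parent (None) is reached.
        while frag is not None:
            expanded.add(frag)
            frag = FRAGMENT_PARENT[frag]
    return expanded


def neighbors(query):
    """Rank other compounds by how many tree fragments they share with the query.

    Assumed scoring rule: more shared fragments (including ancestors)
    means a closer structural neighbor; ties are broken by name.
    """
    query_frags = with_ancestors(COMPOUND_FRAGMENTS[query])
    scored = []
    for name, frags in COMPOUND_FRAGMENTS.items():
        if name == query:
            continue
        shared = query_frags & with_ancestors(frags)
        if shared:
            scored.append((name, len(shared)))
    return sorted(scored, key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    for name, score in neighbors("cmpd_A"):
        print(name, score)
```

Because each compound's fragments are expanded up the tree, two compounds with different leaf fragments (for example, phenyl and pyridyl) still register as neighbors at the level of their common parent. This is one plausible reading of how a substructure tree supports neighbor search.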