Shi L M, Fan Y, Lee J K, Waltham M, Andrews D T, Scherf U, Paull K D, Weinstein J N
Laboratory of Molecular Pharmacology, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892-4255, USA.
J Chem Inf Comput Sci. 2000 Mar-Apr;40(2):367-79. doi: 10.1021/ci990087b.
In order to find more effective anticancer drugs, the U.S. National Cancer Institute (NCI) screens a large number of compounds in vitro against 60 human cancer cell lines from different organs of origin. About 70,000 compounds have been tested in the program since 1990, and each tested compound can be characterized by a vector (i.e., "fingerprint") of 60 anticancer activity, or -[log(GI50)], values. GI50 is the concentration required to inhibit cell growth by 50% compared with untreated controls. Although cell growth inhibitory activity for a single cell line is not very informative, activity patterns across the 60 cell lines can provide incisive information on the mechanisms of action of screened compounds and also on molecular targets and modulators of activity within the cancer cells. Various statistical and artificial intelligence methods, including principal component analysis, hierarchical cluster analysis, stepwise linear regression, multidimensional scaling, neural network modeling, and genetic function approximation, among others, can be used to analyze this large activity database. Mining the database can provide useful information: (a) for the development of anticancer drugs; (b) for a better understanding of the molecular pharmacology of cancer; and (c) for improvement of the drug discovery process.
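The fingerprint idea described in the abstract can be illustrated with a minimal Python sketch. It rests on several assumptions that are not in the source: the logarithm is taken as base 10, GI50 values are given in molar units, only four hypothetical cell lines are used instead of 60, and the GI50 numbers are invented purely for illustration. Pearson correlation serves here as a simple stand-in for comparing activity patterns; the abstract does not name it among its methods.

```python
import math


def fingerprint(gi50_molar):
    """Convert GI50 concentrations (assumed molar) to -log10(GI50) activity values."""
    return [-math.log10(c) for c in gi50_molar]


def pearson(x, y):
    """Pearson correlation between two equal-length activity vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical GI50 values (molar) across four illustrative cell lines.
compound_a = [1e-6, 1e-7, 1e-5, 1e-8]
compound_b = [2e-6, 2e-7, 2e-5, 2e-8]  # same pattern as A, half as potent
compound_c = [1e-8, 1e-5, 1e-7, 1e-6]  # different pattern

fp_a = fingerprint(compound_a)
fp_b = fingerprint(compound_b)
fp_c = fingerprint(compound_c)

# A and B differ in overall potency but share the same pattern across
# cell lines, so their correlation is close to 1; C follows a different
# pattern and correlates negatively with A.
print(round(pearson(fp_a, fp_b), 3))
print(round(pearson(fp_a, fp_c), 3))
```

In this toy case, compounds A and B differ in potency on every individual cell line yet are indistinguishable by pattern, which reflects the abstract's point that the pattern across cell lines carries more information than activity against any single line.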