Pritchard Justin R, Bruno Peter M, Hemann Michael T, Lauffenburger Douglas A
Department of Biology, M.I.T., Cambridge, MA, USA.
Mol Biosyst. 2013 Jul;9(7):1604-19. doi: 10.1039/c2mb25459j. Epub 2013 Jan 4.
Molecular signatures are a powerful approach to characterizing novel small molecules and derivatized small-molecule libraries. While new experimental techniques are being developed in diverse model systems, informatics approaches lag behind these advances. We propose an analysis pipeline for signature-based drug annotation. We develop an integrated strategy that uses supervised and unsupervised learning methodologies bridged by network-based statistics. Using this approach we can: (1) predict new examples of the drug mechanisms that our model was trained upon; (2) identify "New" mechanisms of action that do not belong to the drug categories that our model was trained upon; and (3) update our training sets with these "New" mechanisms and accurately predict entirely distinct examples from these new categories. Thus, our strategy provides not only statistical generalization but also biological generalization. Additionally, we show that our approach is applicable to diverse types of data, and that distinct biological mechanisms characterize its resolution of categories across different data types. As particular examples, we find that our predictive resolution of drug mechanisms from mRNA expression studies relies upon the analog measurement of a cell stress-related transcriptional rheostat along with a transcriptional representation of cell cycle state; in contrast, drug mechanism resolution from functional RNAi studies relies upon a more dichotomous (e.g., either enhances or inhibits) association with cell death states. We believe that our approach can facilitate molecular signature-based understanding of drug mechanisms from different technology platforms and across diverse biological phenomena.
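The three capabilities listed in the abstract (predict a known category, flag a "New" mechanism, update the training set and predict again) can be illustrated with a deliberately simple sketch. It is not the authors' pipeline: it swaps the network-based statistics for a plain nearest-neighbor rule with a distance cutoff, and the two-feature signatures, the category names, and the functions `novelty_threshold` and `annotate` are all hypothetical, chosen only for illustration.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length signatures."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def novelty_threshold(training):
    """Assumed cutoff: the largest distance from any training signature
    to its nearest neighbor within the same category."""
    worst = 0.0
    for i, (sig, label) in enumerate(training):
        same = [
            euclidean(sig, other)
            for j, (other, other_label) in enumerate(training)
            if j != i and other_label == label
        ]
        if same:  # categories with a single member contribute nothing
            worst = max(worst, min(same))
    return worst

def annotate(query, training, threshold):
    """Return the nearest training category, or "New" if the query is
    farther from every training signature than the cutoff."""
    nearest_sig, nearest_label = min(
        training, key=lambda item: euclidean(query, item[0])
    )
    if euclidean(query, nearest_sig) > threshold:
        return "New"
    return nearest_label

# Toy two-feature signatures (made up, not data from the paper).
training = [
    ((1.0, 0.0), "category A"),
    ((0.9, 0.1), "category A"),
    ((0.0, 1.0), "category B"),
    ((0.1, 0.9), "category B"),
]
threshold = novelty_threshold(training)

# 1. A new example of a trained category.
known = annotate((0.95, 0.05), training, threshold)

# 2. A signature far from every trained category is flagged as "New".
novel = annotate((-1.0, -1.0), training, threshold)

# 3. Add the new mechanism to the training set, then annotate a
#    distinct example from that new category.
training.append(((-1.0, -1.0), "category C"))
updated = annotate((-0.95, -1.05), training, novelty_threshold(training))

print(known, novel, updated)
```

With these toy values the three calls return "category A", "New", and "category C" respectively. The cutoff rule is the main design choice here, and a real analysis would need a statistically justified criterion in its place.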