Acc Chem Res. 2023 Feb 21;56(4):489-499. doi: 10.1021/acs.accounts.2c00791. Epub 2023 Feb 9.
The Human Genome Project ultimately aimed to translate DNA sequence into drugs. With the draft in hand, the Molecular Libraries Program set out to prosecute all genome-encoded proteins for drug discovery with automated high-throughput screening (HTS). This ambitious vision remains unfulfilled, even while innovations in sequencing technology have fully democratized access to genome-scale sequencing. Why? While the central dogma of biology allows us to chart the entirety of cellular metabolism through sequencing, there is no direct coding for chemistry. The rules of base pairing that relate DNA gene to RNA transcript and amino acid sequence do not exist for relating small-molecule structure with macromolecular binding partners and subsequently cellular function. Obtaining such relationships genome-wide is unapproachable via state-of-the-art HTS, akin to attempting genome-wide association studies using turn-of-the-millennium Sanger DNA sequencing.

Our laboratory has been engaged in a multipronged technology development campaign to revolutionize molecular screening through miniaturization in pursuit of genome-scale drug discovery capabilities. The compound library was ripe for miniaturization: it clearly needed to become a consumable. We employed DNA-encoded library (DEL) synthesis principles in the development of solid-phase DELs prepared on microscopic beads, each harboring 100 fmol of a single library member and a DNA tag whose sequence describes the structure of the library member. Loading these DEL beads into 100 pL microfluidic droplets followed by online photocleavage, incubation, fluorescence-activated droplet sorting, and DNA sequencing of the sorted DEL beads reveals the chemical structures of bioactive compounds. This scalable library synthesis and screening platform has proven useful in several proof-of-concept projects involving current clinical targets.

Moving forward, we face the problem of druggability and proteome-scale assay development. Developing biochemical or cellular assays for all genome-encoded targets is not scalable and likely impossible as most proteins have ill-defined or unknown activity and may not function outside of their native contexts. These are the dark undruggable expanses, and charting them will require advanced synthesis and analytical technologies that can generalize probe discovery, irrespective of mature protein function, to fulfill the Genome Project's vision of proteome-wide control of cellular pharmacology.
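
As a back-of-the-envelope illustration of the miniaturization figures quoted in this abstract, the sketch below estimates the highest compound concentration a single bead could produce in its droplet. It assumes one bead per droplet and uniform mixing; the function name and the `released_fraction` parameter are hypothetical and are not taken from the paper. The calculation relies only on the unit identity 1 fmol/pL = 1 mmol/L.

```python
def max_droplet_concentration_mM(amount_fmol: float,
                                 volume_pL: float,
                                 released_fraction: float = 1.0) -> float:
    """Estimate compound concentration (mM) in one droplet.

    Assumptions (not from the source): one bead per droplet, uniform
    mixing, and `released_fraction` of the bead-bound compound freed by
    photocleavage (1.0 means complete release).
    """
    if volume_pL <= 0:
        raise ValueError("volume_pL must be positive")
    if not 0.0 <= released_fraction <= 1.0:
        raise ValueError("released_fraction must be between 0 and 1")
    # 1 fmol / 1 pL = 1e-15 mol / 1e-12 L = 1e-3 mol/L = 1 mM
    return released_fraction * amount_fmol / volume_pL


if __name__ == "__main__":
    # Figures quoted in the abstract: 100 fmol per bead, 100 pL droplets.
    print(max_droplet_concentration_mM(100, 100), "mM upper bound")
```

Under these assumptions, 100 fmol released into 100 pL corresponds to an upper bound of about 1 mM; incomplete photocleavage would lower the actual dose proportionally.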