Lal Avantika, Gunsalus Laura, Gupta Anay, Biancalani Tommaso, Eraslan Gokcen
Biology Research | AI Development, gRED Computational Sciences, Genentech, South San Francisco, USA.
College of Computing, Georgia Institute of Technology, Atlanta, GA, USA.
Genome Biol. 2025 May 6;26(1):114. doi: 10.1186/s13059-025-03584-9.
The design of regulatory elements is pivotal in gene and cell therapy, where DNA sequences are engineered to drive elevated and cell-type-specific expression. However, systematically assessing synthetic DNA sequences remains challenging in the absence of robust metrics and easy-to-use software. Here, we introduce Polygraph, a Python framework that evaluates synthetic DNA elements based on features such as diversity, motif and k-mer composition, similarity to endogenous sequences, and screening with predictive and foundational models. Polygraph is the first instrument for assessing synthetic regulatory sequences, enabling faster progress in therapeutic interventions and improving our understanding of gene regulatory mechanisms.
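As an illustration of one feature named in the abstract, k-mer composition, the following is a minimal sketch in plain Python. It does not use or reproduce Polygraph's API: the function names `kmer_frequencies` and `kmer_distance`, the default `k=2`, and the choice of Euclidean distance are all assumptions made for this example. The idea is that a synthetic sequence can be compared with an endogenous one by the distance between their normalized k-mer frequency profiles.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(seq, k=2):
    """Return the normalized frequency of every DNA k-mer in seq.

    Hypothetical helper for illustration; not part of Polygraph.
    K-mers containing characters other than A, C, G, T are skipped.
    """
    seq = seq.upper()
    alphabet = set("ACGT")
    counts = Counter(
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if set(seq[i:i + k]) <= alphabet
    )
    total = sum(counts.values())
    # Enumerate all 4**k possible k-mers so every profile has the same keys.
    all_kmers = ("".join(p) for p in product("ACGT", repeat=k))
    return {m: (counts[m] / total if total else 0.0) for m in all_kmers}


def kmer_distance(seq_a, seq_b, k=2):
    """Euclidean distance between two k-mer frequency profiles (an assumed metric)."""
    fa = kmer_frequencies(seq_a, k)
    fb = kmer_frequencies(seq_b, k)
    return sum((fa[m] - fb[m]) ** 2 for m in fa) ** 0.5


if __name__ == "__main__":
    synthetic = "ACGTACGTTTGACA"   # made-up example sequences
    endogenous = "ACGTTCGATTGACC"
    print(kmer_distance(synthetic, endogenous, k=2))
```

A smaller distance suggests the two sequences have more similar k-mer content. A real evaluation would aggregate such comparisons over many sequences and combine them with the other feature types the abstract lists.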