MRC Laboratory of Molecular Biology, Cambridge, United Kingdom.
Centre for Mathematical Sciences, University of Cambridge, Cambridge, United Kingdom.
PLoS Biol. 2023 Aug 8;21(8):e3002222. doi: 10.1371/journal.pbio.3002222. eCollection 2023 Aug.
The human genome encodes approximately 20,000 proteins, many still uncharacterised. It has become clear that scientific research tends to focus on well-studied proteins, leading to a concern that poorly understood genes are unjustifiably neglected. To address this, we have developed a publicly available and customisable "Unknome database" that ranks proteins based on how little is known about them. We applied RNA interference (RNAi) in Drosophila to 260 unknown genes that are conserved between flies and humans. Knockdown of some genes resulted in loss of viability, and functional screening of the rest revealed hits for fertility, development, locomotion, protein quality control, and resilience to stress. CRISPR/Cas9 gene disruption validated a component of Notch signalling and 2 genes contributing to male fertility. Our work illustrates the importance of poorly understood genes, provides a resource to accelerate future research, and highlights a need to support database curation to ensure that misannotation does not erode our awareness of our own ignorance.