
Multivariate phenotype analysis enables genome-wide inference of mammalian gene function.

Affiliations

University of Oxford, Oxford, United Kingdom.

MRC Harwell Institute, Harwell, United Kingdom.

Publication details

PLoS Biol. 2022 Aug 9;20(8):e3001723. doi: 10.1371/journal.pbio.3001723. eCollection 2022 Aug.

DOI: 10.1371/journal.pbio.3001723
PMID: 35944064
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9391051/
Abstract

The function of the majority of genes in the human and mouse genomes is unknown. Investigating and illuminating this dark genome is a major challenge for the biomedical sciences. The International Mouse Phenotyping Consortium (IMPC) is addressing this through the generation and broad-based phenotyping of a knockout (KO) mouse line for every protein-coding gene, producing a multidimensional data set that underlies a genome-wide annotation map from genes to phenotypes. Here, we develop a multivariate (MV) statistical approach and apply it to IMPC data comprising 148 phenotypes measured across 4,548 KO lines. There are 4,256 (1.4% of 302,997 observed data measurements) hits called by the univariate (UV) model analysing each phenotype separately, compared to 31,843 (10.5%) hits in the observed data results of the MV model, corresponding to an estimated 7.5-fold increase in power of the MV model relative to the UV model. One key property of the data set is its 55.0% rate of missingness, resulting from quality control filters and incomplete measurement of some KO lines. This raises the question of whether it is possible to infer perturbations at phenotype-gene pairs at which data are not available, i.e., to infer some in vivo effects using statistical analysis rather than experimentation. We demonstrate that, even at missing phenotypes, the MV model can detect perturbations with power comparable to the single-phenotype analysis, thereby filling in the complete gene-phenotype map with good sensitivity. A factor analysis of the MV model's fitted covariance structure identifies 20 clusters of phenotypes, with each cluster tending to be perturbed collectively. These factors cumulatively explain 75% of the KO-induced variation in the data and facilitate biological interpretation of perturbations. We also demonstrate that the MV approach strengthens the correspondence between IMPC phenotypes and existing gene annotation databases. 
Analysis of a subset of KO lines measured in replicate across multiple laboratories confirms that the MV model increases power with high replicability.
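
The headline figures in the abstract can be cross-checked with a little arithmetic. The sketch below is only a sanity check, not the authors' analysis. It assumes that the missingness rate is taken over the full grid of 148 phenotypes by 4,548 KO lines, and it compares raw hit counts; the paper's "estimated 7.5-fold increase in power" may be derived differently (for example, after accounting for error rates), so the raw ratio is shown only for comparison. All variable names are illustrative.

```python
# Figures quoted in the abstract.
n_phenotypes = 148
n_ko_lines = 4_548
n_observed = 302_997   # observed data measurements
uv_hits = 4_256        # hits called by the univariate (UV) model
mv_hits = 31_843       # hits called by the multivariate (MV) model

# Assumption: missingness is measured over every phenotype-gene pair.
n_pairs = n_phenotypes * n_ko_lines
missing_rate = 1 - n_observed / n_pairs

# Hit rates among observed measurements, and the raw MV/UV hit ratio.
uv_hit_rate = uv_hits / n_observed
mv_hit_rate = mv_hits / n_observed
hit_ratio = mv_hits / uv_hits

print(f"phenotype-gene pairs: {n_pairs:,}")
print(f"missing rate:         {missing_rate:.1%}")
print(f"UV hit rate:          {uv_hit_rate:.1%}")
print(f"MV hit rate:          {mv_hit_rate:.1%}")
print(f"MV/UV hit ratio:      {hit_ratio:.1f}")
```

Under these assumptions the computed values round to the figures stated in the abstract (55.0% missing, 1.4% and 10.5% hit rates, and a hit ratio of about 7.5).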


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c86c/9391051/0b0e5cc63ef7/pbio.3001723.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c86c/9391051/2aaeeaea0b89/pbio.3001723.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c86c/9391051/bd7acb01ba68/pbio.3001723.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c86c/9391051/20cda58609db/pbio.3001723.g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c86c/9391051/5d7a3052e1cd/pbio.3001723.g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c86c/9391051/d4d3229f0f54/pbio.3001723.g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c86c/9391051/7d85a29e0187/pbio.3001723.g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c86c/9391051/cdf569afdfdb/pbio.3001723.g008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c86c/9391051/e28fd9bd54c6/pbio.3001723.g009.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c86c/9391051/dc58d4baa066/pbio.3001723.g010.jpg

