Community Resource: Large-Scale Proteogenomics to Refine Wheat Genome Annotations.

Author affiliations

Independent Researcher, Melbourne, VIC 3000, Australia.

Faculty of Science, University of Melbourne, Parkville, VIC 3010, Australia.

Publication information

Int J Mol Sci. 2024 Aug 7;25(16):8614. doi: 10.3390/ijms25168614.

DOI:10.3390/ijms25168614
PMID:39201310
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11354340/
Abstract

Wheat (Triticum aestivum) is an important crop whose reference genome (International Wheat Genome Sequencing Consortium (IWGSC) RefSeq v2.1) offers a valuable resource for understanding wheat genetic structure, improving agronomic traits, and developing new cultivars. A key aspect of gene model annotation is protein-level evidence of gene expression obtained from proteomics studies, followed by proteogenomics to physically map proteins to the genome. In this research, we retrieved the largest recent publicly available wheat proteomics datasets and applied the Basic Local Alignment Search Tool (tBLASTn) algorithm to map the 861,759 identified unique peptides against IWGSC RefSeq v2.1. Of the 92,719 hits, 83,015 unique peptides aligned along 33,612 High Confidence (HC) genes, thus validating 31.4% of all wheat HC gene models. Furthermore, 6685 unique peptides were mapped against 3702 Low Confidence (LC) gene models, and we argue that these gene models should be considered for HC status. The remaining 2934 orphan peptides can be used for novel gene discovery, as exemplified here on chromosome 4D. We demonstrated that tBLASTn could not map peptides exhibiting a mid-sequence frameshift. We supply all our proteogenomics results, Galaxy workflow, and Python code, as well as Browser Extensible Data (BED) files, as a resource for the wheat community via Apollo JBrowse and GitHub repositories. Our workflow could be applied to other proteomics datasets to expand this resource with proteins and peptides from biotically and abiotically stressed samples. This would help tease out wheat gene expression under various environmental conditions, both spatially and temporally.
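The abstract describes mapping peptides to the genome with tBLASTn and sharing the results as BED files. The following is a minimal sketch of how one tBLASTn hit could be turned into a BED record. It is not the authors' published code. It assumes the hits were written in BLAST's default 12-column tabular layout (`-outfmt 6`). The names `BedRecord`, `hit_to_bed`, and `to_bed_line`, the example peptide and chromosome identifiers, and the choice to scale percent identity into the BED score column are all hypothetical.

```python
from typing import NamedTuple


class BedRecord(NamedTuple):
    chrom: str   # subject sequence (chromosome) name
    start: int   # 0-based, inclusive start on the chromosome
    end: int     # 0-based, exclusive end on the chromosome
    name: str    # peptide (query) identifier
    score: int   # 0-1000, here derived from percent identity (an assumption)
    strand: str  # "+" or "-"


def hit_to_bed(line: str) -> BedRecord:
    """Convert one line of default BLAST tabular output (-outfmt 6) to BED.

    Default columns: qseqid sseqid pident length mismatch gapopen
                     qstart qend sstart send evalue bitscore
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 12:
        raise ValueError(f"expected 12 tab-separated columns, got {len(fields)}")
    qseqid, sseqid = fields[0], fields[1]
    pident = float(fields[2])
    sstart, send = int(fields[8]), int(fields[9])

    # For tBLASTn the subject coordinates are nucleotide positions (1-based).
    # A hit on the reverse strand is reported with sstart > send.
    strand = "+" if sstart <= send else "-"

    # BED intervals are 0-based and half-open, so shift only the start.
    start = min(sstart, send) - 1
    end = max(sstart, send)

    score = min(1000, round(pident * 10))
    return BedRecord(sseqid, start, end, qseqid, score, strand)


def to_bed_line(record: BedRecord) -> str:
    """Serialize a BedRecord as one tab-separated BED6 line."""
    return "\t".join(str(value) for value in record)


if __name__ == "__main__":
    # Hypothetical hit: a 12-residue peptide on the reverse strand (36 nt).
    example = "pep_0001\tChr4D\t100.0\t12\t0\t0\t1\t12\t1036\t1001\t1e-05\t30.0"
    print(to_bed_line(hit_to_bed(example)))
```

The key detail is the coordinate shift: BLAST reports 1-based inclusive positions, while BED uses 0-based half-open intervals, so a 12-residue peptide spans exactly 36 bases in the resulting record.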

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3375/11354340/40016d2d4dc5/ijms-25-08614-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3375/11354340/041ec82fa096/ijms-25-08614-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3375/11354340/2f82c49c2edc/ijms-25-08614-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3375/11354340/0dde4db7fb51/ijms-25-08614-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3375/11354340/3872e5f0357e/ijms-25-08614-g005.jpg
