ORFograph: search for novel insecticidal protein genes in genomic and metagenomic assembly graphs.

Affiliations

Center for Algorithmic Biotechnology, Saint Petersburg State University, Saint Petersburg, Russia.

Department of Computer Science and Engineering, University of California San Diego, San Diego, CA, USA.

Publication information

Microbiome. 2021 Jun 28;9(1):149. doi: 10.1186/s40168-021-01092-z.

DOI: 10.1186/s40168-021-01092-z
PMID: 34183047
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8240309/

Abstract

BACKGROUND

Since the prolonged use of insecticidal proteins has led to toxin resistance, it is important to search for novel insecticidal protein genes (IPGs) that are effective in controlling resistant insect populations. IPGs are usually encoded in the genomes of entomopathogenic bacteria, especially in large plasmids in strains of the ubiquitous soil bacterium Bacillus thuringiensis (Bt). Since such plasmids often encode multiple similar IPGs, their assemblies are typically fragmented and many IPGs are scattered across multiple contigs. As a result, existing gene prediction tools (which analyze individual contigs) typically predict partial rather than complete IPGs, making it difficult to conduct downstream IPG engineering efforts in agricultural genomics.

METHODS

Although it is difficult to assemble IPGs in a single contig, the structure of the genome assembly graph often provides clues on how to combine multiple contigs into segments encoding a single IPG.
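The idea can be illustrated with a toy sketch. This is not ORFograph's actual algorithm; it only shows why following graph paths can recover a complete open reading frame (ORF) that no single contig contains. Everything here is an illustrative assumption: the names `find_orfs`, `paths`, and `graph_orfs`, the contig sequences, the forward-strand-only search, and the simplification that adjacent contigs are joined without the overlap that real assembly graphs carry.

```python
STOPS = {"TAA", "TAG", "TGA"}

def find_orfs(seq, min_codons=3):
    """Return (start, end) pairs of complete forward-strand ORFs (ATG ... stop)."""
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame start codon since the last stop
            elif start is not None and codon in STOPS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return orfs

def paths(graph, node, max_len):
    """Yield every simple path that starts at `node` and has at most `max_len` contigs."""
    stack = [[node]]
    while stack:
        path = stack.pop()
        yield path
        if len(path) < max_len:
            for nxt in graph.get(path[-1], []):
                if nxt not in path:
                    stack.append(path + [nxt])

def graph_orfs(contigs, graph, max_len=3):
    """Return ORF sequences that are complete only along a multi-contig path."""
    single = {contigs[c][s:e] for c in contigs for s, e in find_orfs(contigs[c])}
    found = set()
    for first in contigs:
        for path in paths(graph, first, max_len):
            if len(path) < 2:
                continue
            seq = "".join(contigs[c] for c in path)
            found.update(seq[s:e] for s, e in find_orfs(seq))
    return found - single

# Hypothetical toy graph: contig A holds a shared gene start; B and C are
# alternative continuations, as might arise from two similar genes on a plasmid.
contigs = {"A": "ATGGCC", "B": "AAATAA", "C": "GGGTGA"}
graph = {"A": ["B", "C"]}

per_contig = {name: find_orfs(seq) for name, seq in contigs.items()}
spanning = graph_orfs(contigs, graph)
print(per_contig)
print(sorted(spanning))
```

In this toy case a contig-by-contig search finds no complete ORF, while the two paths A→B and A→C each yield one. Bounding the path length with `max_len` keeps the enumeration small in the sketch; a practical tool would need a far more careful search strategy.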

RESULTS

We describe ORFograph, a pipeline for predicting IPGs in assembly graphs, benchmark it on (meta)genomic datasets, and discover nearly a hundred novel IPGs. This work shows that graph-aware gene prediction tools enable the discovery of a greater diversity of IPGs from (meta)genomes.

CONCLUSIONS

We demonstrated that analysis of assembly graphs reveals novel candidate IPGs. ORFograph identified both already known genes "hidden" in assembly graphs and potential novel IPGs that evaded existing tools for IPG identification. As ORFograph is fast, one could imagine a pipeline that processes many (meta)genomic assembly graphs to identify, for phenotypic testing, even more novel IPGs that were previously inaccessible to traditional gene-finding methods. While here we demonstrated the results of ORFograph only for IPGs, the proposed approach can be generalized to any class of genes.
