KIMI: Knockoff Inference for Motif Identification from molecular sequences with controlled false discovery rate.

Affiliations

Quantitative and Computational Biology Program, Department of Biological Sciences, Los Angeles, CA 90089, USA.

Data Sciences and Operations Department, Marshall School of Business, University of Southern California, Los Angeles, CA 90089, USA.

Publication information

Bioinformatics. 2021 May 5;37(6):759-766. doi: 10.1093/bioinformatics/btaa912.

DOI: 10.1093/bioinformatics/btaa912
PMID: 33119059
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8599924/
Abstract

MOTIVATION

The rapid development of sequencing technologies has enabled us to generate a large number of metagenomic reads from genetic materials in microbial communities, making it possible to gain deep insight into the differences between the genetic materials of different groups of microorganisms, such as bacteria, viruses and plasmids. Computational methods based on k-mer frequencies have been shown to be highly effective for classifying metagenomic sequencing reads into different groups. However, such methods usually use all the k-mers as features for prediction without selecting the k-mers relevant to each group of sequences, i.e. unique nucleotide patterns that carry biological significance.
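As a rough illustration of the k-mer frequency features mentioned above, the following sketch counts overlapping k-mers in a single sequence and normalizes them. The function name `kmer_frequencies` is hypothetical, and the choices to skip windows containing non-ACGT characters and to leave reverse complements unmerged are assumptions made for brevity; they are not taken from the paper.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(seq, k):
    """Return normalized frequencies of all 4**k k-mers over ACGT.

    Assumption: windows containing other characters (e.g. N) are ignored,
    and reverse-complement k-mers are NOT merged.
    """
    seq = seq.upper()
    all_kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    # Count every overlapping window of length k.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Only windows made purely of A/C/G/T contribute to the total.
    total = sum(counts[m] for m in all_kmers)
    if total == 0:
        return {m: 0.0 for m in all_kmers}
    return {m: counts[m] / total for m in all_kmers}


freqs = kmer_frequencies("ACGTAC", 2)
print(freqs["AC"])  # AC occurs in 2 of the 5 windows
```

Each sequence then becomes a fixed-length vector of 4^k features, which is the kind of input a k-mer based classifier or a feature-selection step would operate on.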

RESULTS

To select k-mers that distinguish different groups of sequences with guaranteed false discovery rate (FDR) control, we develop KIMI, a general framework for sequence motif discovery at an arbitrary target FDR level. KIMI is based on model-X knockoffs, regarded as the state-of-the-art statistical method for FDR control, so that reproducibility can be theoretically guaranteed. Simulation studies show that KIMI is effective in simultaneously controlling FDR and yielding high power, outperforming the widely used Benjamini-Hochberg procedure and the q-value method for FDR control. To illustrate the usefulness of KIMI in analyzing real datasets, we take the viral motif discovery problem as an example and apply KIMI to a real dataset consisting of viral and bacterial contigs. We show that the accuracy of predicting viral and bacterial contigs can be increased by training the prediction model only on the relevant k-mers selected by KIMI.
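The selection step of a knockoff filter can be sketched as follows. This is the generic knockoff+ threshold rule from the knockoffs literature, not necessarily how KIMI implements it; the function name `knockoff_plus_select` and the example statistics `W` are hypothetical. It assumes that each feature j already has a statistic W_j, where a large positive value means the original k-mer looks more important than its knockoff copy.

```python
import math


def knockoff_plus_select(W, q):
    """Generic knockoff+ selection at target FDR level q.

    Finds the smallest threshold t among the nonzero |W_j| such that
    (1 + #{j: W_j <= -t}) / max(1, #{j: W_j >= t}) <= q,
    then selects every feature with W_j >= t.
    Returns (threshold, selected_indices); (inf, []) if no t qualifies.
    """
    candidates = sorted({abs(w) for w in W if w != 0})
    for t in candidates:
        neg = sum(1 for w in W if w <= -t)  # proxy for false discoveries
        pos = sum(1 for w in W if w >= t)   # features that would be selected
        if (1 + neg) / max(1, pos) <= q:
            return t, [j for j, w in enumerate(W) if w >= t]
    return math.inf, []


# Made-up statistics for ten hypothetical k-mers.
W = [5.0, 4.0, 3.0, 2.5, 2.0, -1.0, 0.5, -0.2, 1.5, 3.5]
threshold, selected = knockoff_plus_select(W, q=0.2)
print(threshold, selected)  # 1.5 [0, 1, 2, 3, 4, 8, 9]
```

The count of large negative statistics serves as an estimate of how many selected features are false discoveries, which is why the ratio can be compared directly against the target level q.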

AVAILABILITY AND IMPLEMENTATION

Our implementation of KIMI is available at https://github.com/xinbaiusc/KIMI.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
