GECKO is a genetic algorithm to classify and explore high throughput sequencing data.

Affiliations

1. Institute of Human Genetics, CNRS UPR1142, Machine learning and gene regulation, University of Montpellier, Montpellier, France.

2. AGAP, Univ Montpellier, CIRAD, INRA, Montpellier SupAgro, Montpellier, France.

Publication information

Commun Biol. 2019 Jun 20;2:222. doi: 10.1038/s42003-019-0456-9. eCollection 2019.

DOI: 10.1038/s42003-019-0456-9
PMID: 31240260
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6586863/
Abstract

Comparative analysis of high throughput sequencing data between multiple conditions often involves mapping of sequencing reads to a reference and downstream bioinformatics analyses. Both of these steps may introduce heavy bias and potential data loss. This is especially true in studies where patient transcriptomes or genomes may vary from their references, such as in cancer. Here we describe a novel approach and associated software that makes use of advances in genetic algorithms and feature selection to comprehensively explore massive volumes of sequencing data to classify and discover new sequences of interest without a mapping step and without intensive use of specialized bioinformatics pipelines. We demonstrate that our approach called GECKO for GEnetic Classification using Optimization is effective at classifying and extracting meaningful sequences from multiple types of sequencing approaches including mRNA, microRNA, and DNA methylome data.
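The mapping-free idea in the abstract can be illustrated with a toy sketch. This is not the GECKO software or its published algorithm. It assumes that each sample is reduced to k-mer counts (the abstract does not specify the representation) and that a small genetic algorithm searches for the subset of k-mers that best separates the conditions. The function names, the nearest-centroid scorer, the reads, and all parameter values are hypothetical and chosen only for illustration.

```python
import random
from collections import Counter

def kmer_counts(reads, k):
    """Count every k-mer in a sample's reads (no reference, no mapping)."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts

def build_matrix(samples, k):
    """Return (sorted k-mer list, one count vector per sample)."""
    per_sample = [kmer_counts(reads, k) for reads in samples]
    kmers = sorted(set().union(*per_sample))
    return kmers, [[c[km] for km in kmers] for c in per_sample]

def fitness(subset, matrix, labels):
    """Leave-one-out accuracy of a nearest-centroid classifier on the chosen columns."""
    correct = 0
    for held in range(len(matrix)):
        centroids = {}
        for lab in sorted(set(labels)):
            rows = [matrix[i] for i in range(len(matrix))
                    if i != held and labels[i] == lab]
            if rows:
                centroids[lab] = [sum(r[j] for r in rows) / len(rows) for j in subset]
        point = [matrix[held][j] for j in subset]
        pred = min(centroids, key=lambda lab: sum(
            (a - b) ** 2 for a, b in zip(point, centroids[lab])))
        correct += pred == labels[held]
    return correct / len(matrix)

def evolve(matrix, labels, subset_size=2, pop_size=20, generations=30, seed=0):
    """Toy genetic algorithm: each individual is a list of distinct column indices."""
    rng = random.Random(seed)
    n = len(matrix[0])
    pop = [rng.sample(range(n), subset_size) for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: keep the better half unchanged (elitism).
        ranked = sorted(pop, key=lambda s: fitness(s, matrix, labels), reverse=True)
        parents = ranked[:pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Crossover: draw the child's genes from the union of both parents.
            child = rng.sample(list(dict.fromkeys(a + b)), subset_size)
            # Mutation: occasionally swap one gene for a column not yet in the child.
            unused = [j for j in range(n) if j not in child]
            if unused and rng.random() < 0.3:
                child[rng.randrange(subset_size)] = rng.choice(unused)
            children.append(child)
        pop = parents + children
    best = max(pop, key=lambda s: fitness(s, matrix, labels))
    return best, fitness(best, matrix, labels)

# Hypothetical toy data: condition A reads share the motif TTTGAC.
samples = [
    ["ACGTTTGACC", "GGTTTGACAT"],
    ["CATTTGACGG", "ACGTACGTAC"],
    ["ACGTACGTAC", "GGCATCATGG"],
    ["CCGTACGTAA", "GGCATCATCC"],
]
labels = ["A", "A", "B", "B"]
kmers, matrix = build_matrix(samples, k=5)
best, score = evolve(matrix, labels)
print([kmers[j] for j in best], score)
```

Real sequencing data would need far larger matrices, count normalization, and a stronger classifier. The sketch only shows how selection, crossover, and mutation can act on subsets of sequence features without an alignment step.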

Figures

Fig. 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b22b/6586863/3d771a74043d/42003_2019_456_Fig1_HTML.jpg
Fig. 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b22b/6586863/4a4b80c238d5/42003_2019_456_Fig2_HTML.jpg
Fig. 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b22b/6586863/66dce9dfe538/42003_2019_456_Fig3_HTML.jpg
Fig. 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b22b/6586863/da4b37d6e219/42003_2019_456_Fig4_HTML.jpg
Fig. 5: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b22b/6586863/cbcd41540ffa/42003_2019_456_Fig5_HTML.jpg
