pyPaSWAS: Python-based multi-core CPU and GPU sequence alignment.

Authors

Warris Sven, Timal N Roshan N, Kempenaar Marcel, Poortinga Arne M, van de Geest Henri, Varbanescu Ana L, Nap Jan-Peter

Affiliations

Expertise Centre ALIFE, Institute for Life Science & Technology, Hanze University of Applied Sciences Groningen, Groningen, the Netherlands.

Applied Bioinformatics, Wageningen University and Research, Wageningen, the Netherlands.

Publication information

PLoS One. 2018 Jan 2;13(1):e0190279. doi: 10.1371/journal.pone.0190279. eCollection 2018.

DOI:10.1371/journal.pone.0190279
PMID:29293576
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5749749/
Abstract

BACKGROUND

Our previously published CUDA-only application PaSWAS for Smith-Waterman (SW) sequence alignment of any type of sequence on NVIDIA-based GPUs is platform-specific and is therefore adopted less than it could be. The OpenCL language is supported more widely and allows use on a variety of hardware platforms. Moreover, there is a need to promote the adoption of parallel computing in bioinformatics by making its use and extension simpler through more and better application of high-level languages commonly used in bioinformatics, such as Python.

RESULTS

The novel application pyPaSWAS presents the parallel SW sequence alignment code fully packed in Python. It is a generic SW implementation running on several hardware platforms with multi-core systems and/or GPUs that provides accurate sequence alignments that can also be inspected for alignment details. Additionally, pyPaSWAS supports the affine gap penalty. Python libraries are used for automated system configuration, I/O and logging. This way, the Python environment will stimulate further extension and use of pyPaSWAS.
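
To illustrate what SW alignment with an affine gap penalty computes, here is a minimal single-threaded sketch in plain Python. It is not the pyPaSWAS code or its API: the function name, the scoring values, and the convention that a gap of length k costs gap_open + (k - 1) * gap_extend are all assumptions chosen for illustration. It uses the standard Gotoh recurrences and returns only the best local score, without the traceback or parallel execution that the abstract describes.

```python
def smith_waterman_affine(a, b, match=2, mismatch=-1, gap_open=3, gap_extend=1):
    """Best local alignment score of a and b with affine gap costs.

    Illustrative only: names and default scores are hypothetical.
    A gap of length k costs gap_open + (k - 1) * gap_extend.
    """
    neg_inf = float("-inf")
    cols = len(b) + 1
    h_prev = [0] * cols        # H: best score of an alignment ending at (i-1, j)
    f_prev = [neg_inf] * cols  # F: best score ending in a vertical gap at (i-1, j)
    best = 0

    for i in range(1, len(a) + 1):
        h_cur = [0] * cols
        f_cur = [neg_inf] * cols
        e = neg_inf            # E: best score ending in a horizontal gap at (i, j)
        for j in range(1, cols):
            # Either open a new gap from H or extend the existing gap.
            e = max(h_cur[j - 1] - gap_open, e - gap_extend)
            f_cur[j] = max(h_prev[j] - gap_open, f_prev[j] - gap_extend)
            s = match if a[i - 1] == b[j - 1] else mismatch
            # Local alignment: scores are clamped at zero.
            h_cur[j] = max(0, h_prev[j - 1] + s, e, f_cur[j])
            best = max(best, h_cur[j])
        h_prev, f_prev = h_cur, f_cur

    return best


if __name__ == "__main__":
    print(smith_waterman_affine("ACGTACGT", "ACGTTTACGT"))
```

With these assumed scores, the two sequences in the usage line align as eight matches separated by one gap of length two. Each cell depends only on its left, upper and upper-left neighbours, so cells on the same anti-diagonal are independent of one another, which is the property that parallel SW implementations on GPUs and multi-core CPUs commonly exploit.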

CONCLUSIONS

pyPaSWAS presents an easy Python-based environment for accurate and retrievable parallel SW sequence alignments on GPUs and multi-core systems. The strategy of integrating Python with high-performance parallel compute languages to create a developer- and user-friendly environment should be considered for other computationally intensive bioinformatics algorithms.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f564/5749749/a0a82e70550c/pone.0190279.g001.jpg
