A case study of high-throughput biological data processing on parallel platforms.

Authors

Pekurovsky D, Shindyalov I N, Bourne P E

Affiliation

San Diego Supercomputer Center, University of California San Diego, La Jolla 92093, USA.

Publication information

Bioinformatics. 2004 Aug 12;20(12):1940-7. doi: 10.1093/bioinformatics/bth184. Epub 2004 Mar 25.

DOI:10.1093/bioinformatics/bth184

PMID:15044237

Abstract

MOTIVATION

Analysis of large biological data sets using a variety of parallel processor computer architectures is a common task in bioinformatics. The efficiency of the analysis can be significantly improved by properly handling redundancy present in these data combined with taking advantage of the unique features of these compute architectures.

RESULTS

We describe a generalized approach to this analysis, but present specific results using the program CEPAR, an efficient implementation of the Combinatorial Extension algorithm in a massively parallel (PAR) mode for finding pairwise protein structure similarities and aligning protein structures from the Protein Data Bank. CEPAR design and implementation are described and results provided for the efficiency of the algorithm when run on a large number of processors.

AVAILABILITY

Source code is available by contacting one of the authors.

Similar articles

FORTE: a profile-profile comparison tool for protein fold recognition.

Bioinformatics. 2004 Mar 1;20(4):594-5. doi: 10.1093/bioinformatics/btg474. Epub 2004 Feb 5.

Protein structural similarity search by Ramachandran codes.

BMC Bioinformatics. 2007 Aug 23;8:307. doi: 10.1186/1471-2105-8-307.

J Biomed Inform. 2008 Feb;41(1):65-81. doi: 10.1016/j.jbi.2007.05.010. Epub 2007 Jun 27.

NdPASA: a novel pairwise protein sequence alignment algorithm that incorporates neighbor-dependent amino acid propensities.

Proteins. 2005 Feb 15;58(3):628-37. doi: 10.1002/prot.20359.

An efficient, versatile and scalable pattern growth approach to mine frequent patterns in unaligned protein sequences.

Bioinformatics. 2007 Mar 15;23(6):687-93. doi: 10.1093/bioinformatics/btl665. Epub 2007 Jan 19.

SPEM: improving multiple sequence alignment with sequence profiles and predicted secondary structures.

Bioinformatics. 2005 Sep 15;21(18):3615-21. doi: 10.1093/bioinformatics/bti582. Epub 2005 Jul 14.

A comprehensive and non-redundant database of protein domain movements.

Bioinformatics. 2005 Jun 15;21(12):2832-8. doi: 10.1093/bioinformatics/bti420. Epub 2005 Mar 31.

On distance and similarity in fold space.

Bioinformatics. 2008 Mar 15;24(6):872-3. doi: 10.1093/bioinformatics/btn040. Epub 2008 Jan 28.

A generalized affine gap model significantly improves protein sequence alignment accuracy.

Proteins. 2005 Feb 1;58(2):329-38. doi: 10.1002/prot.20299.

Cited by

Accelerating large-scale protein structure alignments with graphics processing units.

BMC Res Notes. 2012 Feb 22;5:116. doi: 10.1186/1756-0500-5-116.

The Sleipnir library for computational functional genomics.

Bioinformatics. 2008 Jul 1;24(13):1559-61. doi: 10.1093/bioinformatics/btn237. Epub 2008 May 21.
