Biotite: a unifying open source computational biology framework in Python.

Affiliation

Department of Computational Biology and Simulation, TU Darmstadt, Schnittspahnstraße 2, Darmstadt, 64287, Germany.

Publication information

BMC Bioinformatics. 2018 Oct 1;19(1):346. doi: 10.1186/s12859-018-2367-z.

DOI:10.1186/s12859-018-2367-z

PMID:30285630

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6167853/

Abstract

BACKGROUND

As molecular biology produces a growing amount of sequence and structure data, the number of software tools for analyzing these data is also rising. Most programs are built for a specific task, so the user often needs to combine several programs to reach a goal. This can make data processing cumbersome, inflexible and even inefficient because of the overhead of read/write operations. A comprehensive, accessible and efficient computational biology framework in a scripting language is therefore crucial to overcome these limitations.

RESULTS

We have developed the Python package Biotite, a general computational biology framework that represents sequence and structure data based on NumPy ndarrays. Furthermore, the package contains seamless interfaces to biological databases and external software. The source code is freely accessible at https://github.com/biotite-dev/biotite .
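
To illustrate what representing sequence data on NumPy ndarrays can look like, here is a minimal sketch. It is not Biotite's API: the `ALPHABET` constant and the helpers `encode`, `decode`, `reverse_complement` and `gc_content` are hypothetical names chosen for this example. The sketch only assumes that each symbol is stored as a small integer code, so that whole-sequence operations become vectorized array operations.

```python
import numpy as np

# Hypothetical four-letter DNA alphabet, for illustration only (not Biotite's).
ALPHABET = "ACGT"


def encode(sequence: str) -> np.ndarray:
    """Store each symbol as its index in ALPHABET in a compact uint8 array."""
    lookup = {symbol: code for code, symbol in enumerate(ALPHABET)}
    return np.array([lookup[symbol] for symbol in sequence], dtype=np.uint8)


def decode(codes: np.ndarray) -> str:
    """Turn an array of symbol codes back into a string."""
    return "".join(ALPHABET[code] for code in codes)


def reverse_complement(codes: np.ndarray) -> np.ndarray:
    """With the order A, C, G, T the complement of code c is 3 - c."""
    return (len(ALPHABET) - 1 - codes)[::-1]


def gc_content(codes: np.ndarray) -> float:
    """Fraction of positions whose code is C (1) or G (2)."""
    return float(np.isin(codes, [1, 2]).mean())


codes = encode("ATGCGC")
print(codes.dtype)                        # element type of the code array
print(decode(reverse_complement(codes)))  # reverse complement as a string
print(gc_content(codes))                  # GC fraction of the sequence
```

Because the data live in an array, operations such as complementing or counting symbols run without an explicit Python loop over positions, which is one plausible reason an array-based representation can be efficient.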

CONCLUSIONS

Biotite is unifying in two ways. First, it bundles popular tasks in sequence analysis and structural bioinformatics in a consistently structured package. Second, it addresses two groups of users: novice programmers get easy access to Biotite thanks to its simplicity and comprehensive documentation, while advanced users can profit from its high performance and extensibility. Advanced users can implement their algorithms on top of Biotite, so they can skip writing code for general functionality (like file parsers) and focus on what makes their software unique.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/aa15/6167853/0e22bc60919d/12859_2018_2367_Fig1_HTML.jpg

Similar articles

Biotite: a unifying open source computational biology framework in Python.

BMC Bioinformatics. 2018 Oct 1;19(1):346. doi: 10.1186/s12859-018-2367-z.

Biotite: new tools for a versatile Python bioinformatics library.

BMC Bioinformatics. 2023 Jun 5;24(1):236. doi: 10.1186/s12859-023-05345-6.

plotnineSeqSuite: a Python package for visualizing sequence data using ggplot2 style.

BMC Genomics. 2023 Oct 3;24(1):585. doi: 10.1186/s12864-023-09677-8.

BioShell 3.0: Library for Processing Structural Biology Data.

Biomolecules. 2020 Mar 16;10(3):461. doi: 10.3390/biom10030461.

PyBEL: a computational framework for Biological Expression Language.

Bioinformatics. 2018 Feb 15;34(4):703-704. doi: 10.1093/bioinformatics/btx660.

DendroPy: a Python library for phylogenetic computing.

Bioinformatics. 2010 Jun 15;26(12):1569-71. doi: 10.1093/bioinformatics/btq228. Epub 2010 Apr 25.

An Introduction to Programming for Bioscientists: A Python-Based Primer.

PLoS Comput Biol. 2016 Jun 7;12(6):e1004867. doi: 10.1371/journal.pcbi.1004867. eCollection 2016 Jun.

Biopython: freely available Python tools for computational molecular biology and bioinformatics.

Bioinformatics. 2009 Jun 1;25(11):1422-3. doi: 10.1093/bioinformatics/btp163. Epub 2009 Mar 20.

A fast and efficient python library for interfacing with the Biological Magnetic Resonance Data Bank.

BMC Bioinformatics. 2017 Mar 17;18(1):175. doi: 10.1186/s12859-017-1580-5.

p3d--Python module for structural bioinformatics.

BMC Bioinformatics. 2009 Aug 21;10:258. doi: 10.1186/1471-2105-10-258.

Cited by

PDBCharges: Quantum-Mechanical Partial Atomic Charges for PDB Structures.

Nucleic Acids Res. 2025 Jul 7;53(W1):W457-W462. doi: 10.1093/nar/gkaf401.

PRESCOTT: a population aware, epistatic, and structural model accurately predicts missense effects.

Genome Biol. 2025 May 6;26(1):113. doi: 10.1186/s13059-025-03581-y.

Molecular mechanism of Mad2 conformational conversion promoted by the Mad2-interaction motif of Cdc20.

Protein Sci. 2025 Apr;34(4):e70099. doi: 10.1002/pro.70099.

FtsB and PerM interact via a C-terminal helix in FtsB to modulate cell division.

J Bacteriol. 2025 Apr 17;207(4):e0044424. doi: 10.1128/jb.00444-24. Epub 2025 Mar 26.

Isopeptor: a tool for detecting intramolecular isopeptide bonds in protein structures.

Bioinform Adv. 2025 Mar 11;5(1):vbaf049. doi: 10.1093/bioadv/vbaf049. eCollection 2025.

Multiple Chaperone DnaK-FliC Flagellin Interactions are Required for Pseudomonas aeruginosa Flagellum Assembly and Indicate a New Function for DnaK.

Microb Biotechnol. 2025 Feb;18(2):e70096. doi: 10.1111/1751-7915.70096.

Systematic analysis of biomolecular conformational ensembles with PENSA.

J Chem Phys. 2025 Jan 7;162(1). doi: 10.1063/5.0235544.

Bilingual language model for protein sequence and structure.

NAR Genom Bioinform. 2024 Nov 15;6(4):lqae150. doi: 10.1093/nargab/lqae150. eCollection 2024 Dec.

Annotation Vocabulary (Might Be) All You Need.

bioRxiv. 2024 Jul 31:2024.07.30.605924. doi: 10.1101/2024.07.30.605924.

Male and female contributions to diversity among birdwing butterfly images.

Commun Biol. 2024 Jul 1;7(1):774. doi: 10.1038/s42003-024-06376-2.

References

MMTF-An efficient file format for the transmission, visualization, and analysis of macromolecular structures.

PLoS Comput Biol. 2017 Jun 2;13(6):e1005575. doi: 10.1371/journal.pcbi.1005575. eCollection 2017 Jun.

Addressing inaccuracies in BLOSUM computation improves homology search performance.

BMC Bioinformatics. 2016 Apr 27;17:189. doi: 10.1186/s12859-016-1060-3.

MDTraj: A Modern Open Library for the Analysis of Molecular Dynamics Trajectories.

Biophys J. 2015 Oct 20;109(8):1528-32. doi: 10.1016/j.bpj.2015.08.015.

Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega.

Mol Syst Biol. 2011 Oct 11;7:539. doi: 10.1038/msb.2011.75.

A short survey on protein blocks.

Biophys Rev. 2010 Aug;2(3):137-147. doi: 10.1007/s12551-010-0036-1. Epub 2010 Aug 5.

MDAnalysis: a toolkit for the analysis of molecular dynamics simulations.

J Comput Chem. 2011 Jul 30;32(10):2319-27. doi: 10.1002/jcc.21787. Epub 2011 Apr 15.

Biopython: freely available Python tools for computational molecular biology and bioinformatics.

Bioinformatics. 2009 Jun 1;25(11):1422-3. doi: 10.1093/bioinformatics/btp163. Epub 2009 Mar 20.

PhAST: pharmacophore alignment search tool.

J Comput Chem. 2009 Apr 15;30(5):761-71. doi: 10.1002/jcc.21095.

PyCogent: a toolkit for making sense from sequence.

Genome Biol. 2007;8(8):R171. doi: 10.1186/gb-2007-8-8-r171.

MUSCLE: multiple sequence alignment with high accuracy and high throughput.

Nucleic Acids Res. 2004 Mar 19;32(5):1792-7. doi: 10.1093/nar/gkh340. Print 2004.
