Haghani Maryam, Bhattacharya Debswapna, Murali T M
Department of Computer Science, Virginia Tech, Blacksburg, VA 24061, United States of America.
Bioinformatics. 2025 Jun 3. doi: 10.1093/bioinformatics/btaf222.
A Multiple Sequence Alignment (MSA) contains fundamental evolutionary information that is useful in predicting the structure and function of proteins and nucleic acids. The "Number of Effective Sequences" (NEFF) quantifies the diversity of the sequences in an MSA. While several tools embed NEFF calculation with various options, none is a standalone tool for this purpose, and none offers all the available options.
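To make the quantity concrete, the following is a minimal sketch of one commonly used NEFF definition: each sequence is weighted by the reciprocal of the number of sequences (itself included) that share at least a threshold fraction of identical positions with it, and NEFF is the sum of these weights. This is an illustration under stated assumptions, not NEFFy's implementation. The function name `neff`, the 0.8 default threshold, and the choice to count gap characters like any other symbol are all assumptions made for this example.

```python
def neff(msa, threshold=0.8):
    """Illustrative NEFF: sum over sequences of 1 / (size of its similarity cluster).

    msa       -- list of aligned sequences (equal-length strings)
    threshold -- identity fraction at or above which two sequences count as similar
                 (assumed default; real tools expose this as an option)
    """
    if not msa:
        return 0.0
    length = len(msa[0])
    if length == 0 or any(len(seq) != length for seq in msa):
        raise ValueError("all sequences must be non-empty and of equal length")

    n = len(msa)
    # Every sequence is similar to itself, so each count starts at 1.
    similar_counts = [1] * n
    for i in range(n):
        for j in range(i + 1, n):
            # Simplification: gaps are compared like ordinary characters.
            matches = sum(a == b for a, b in zip(msa[i], msa[j]))
            if matches / length >= threshold:
                similar_counts[i] += 1
                similar_counts[j] += 1

    # Redundant sequences share their weight; unique sequences contribute 1 each.
    return sum(1.0 / count for count in similar_counts)


if __name__ == "__main__":
    toy_msa = ["ACDE", "ACDE", "WWWW"]
    print(neff(toy_msa))
```

In the toy alignment, the two identical sequences each receive weight 1/2 and the unrelated third sequence receives weight 1, so three raw sequences count as two effective ones. The pairwise loop is quadratic in the number of sequences, which is acceptable for a sketch but is one reason dedicated tools pay attention to efficiency.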
We developed NEFFy, the first software package to integrate all these options and calculate NEFF across diverse MSA formats for proteins, RNAs, and DNAs. It surpasses existing tools in functionality without compromising computational efficiency and scalability. NEFFy also offers per-residue NEFF calculation and supports NEFF computation for MSAs of multimeric proteins, with the capability to be extended to DNAs and RNAs.
NEFFy is released as open-source software under the GNU General Public License v3.0. The C++ source code and a Python wrapper are available at https://github.com/Maryam-Haghani/NEFFy. Comprehensive documentation and examples are provided at https://Maryam-Haghani.github.io/NEFFy.
Supplementary data are available at Bioinformatics online.