
dipwmsearch: a Python package for searching di-PWM motifs.

Affiliations

LIRMM, Univ Montpellier, CNRS, Montpellier, France.

Institut Français de Bioinformatique, CNRS UAR 3601, Évry, France.

Publication information

Bioinformatics. 2023 Apr 3;39(4). doi: 10.1093/bioinformatics/btad141.

Abstract

MOTIVATION

Searching for probabilistic motifs in a sequence is a common task when annotating putative transcription factor binding sites or other RNA/DNA binding sites. Useful motif representations include position weight matrices (PWMs), dinucleotide PWMs (di-PWMs), and hidden Markov models (HMMs). Di-PWMs retain the simplicity of PWMs (a matrix form and a cumulative scoring function) while also capturing the dependency between adjacent positions in the motif, which PWMs disregard. For instance, the HOCOMOCO database provides di-PWM motifs, derived from experimental data, to represent binding sites. Currently, two programs, SPRy-SARUS and MOODS, can search for occurrences of di-PWMs in sequences.
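To make the cumulative scoring function concrete, here is a minimal sketch of how a di-PWM can score a word. It assumes the usual layout in which a motif of length m has m-1 columns, each holding one score per dinucleotide. The function name `dipwm_score` and the toy matrix values are illustrative assumptions, not part of any published package or dataset.

```python
from itertools import product

# The 16 dinucleotides, "AA" through "TT".
DINUCLEOTIDES = ["".join(pair) for pair in product("ACGT", repeat=2)]


def dipwm_score(matrix, word):
    """Sum the dinucleotide scores of `word`, one per pair of adjacent letters.

    `matrix` is a list of dicts (hypothetical layout): column i maps each
    dinucleotide to the score of seeing it at positions i and i+1.
    """
    if len(word) != len(matrix) + 1:
        raise ValueError("word length must equal number of columns + 1")
    return sum(column[word[i:i + 2]] for i, column in enumerate(matrix))


# Toy di-PWM for a motif of length 3 (two columns); all values are made up.
toy = [{d: 0.0 for d in DINUCLEOTIDES} for _ in range(2)]
toy[0]["AC"] = 2.0
toy[0]["AG"] = 1.0
toy[1]["CG"] = 1.5
toy[1]["GG"] = -1.0

print(dipwm_score(toy, "ACG"))  # score of AC at column 0 plus CG at column 1
```

Because each column scores a pair of adjacent letters, the contribution of a base depends on its neighbour, which a plain PWM with one score per base and position cannot express.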

RESULTS

We propose a Python package called dipwmsearch, which provides an original and efficient algorithm for this task: it first enumerates the words that match the di-PWM, then searches for all of them at once in the sequence, even if the sequence contains IUPAC codes. Users benefit from easy installation via PyPI or conda, comprehensive documentation, and executable scripts that facilitate the use of di-PWMs.
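The enumerate-then-search idea can be sketched in a few lines of plain Python. This is not the dipwmsearch API or its actual algorithm: the function names, the toy matrix, the threshold, and the pruning bound are all assumptions made for illustration. A set lookup over sliding windows stands in for a true multi-pattern search (such as an automaton), and an ambiguous window is assumed to match when any of its IUPAC expansions is an enumerated word.

```python
from itertools import product

# Standard IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def enumerate_matching_words(matrix, threshold):
    """Return every word whose summed dinucleotide score is >= threshold."""
    # best_rest[i] is an upper bound on the score still obtainable from
    # columns i onward (it ignores that consecutive pairs share a letter).
    best_rest = [0.0] * (len(matrix) + 1)
    for i in range(len(matrix) - 1, -1, -1):
        best_rest[i] = best_rest[i + 1] + max(matrix[i].values())

    words = set()

    def extend(prefix, score):
        i = len(prefix) - 1  # index of the next column to score
        if score + best_rest[i] < threshold:
            return  # even the best completion cannot reach the threshold
        if i == len(matrix):
            words.add(prefix)
            return
        for base in "ACGT":
            extend(prefix + base, score + matrix[i][prefix[-1] + base])

    for first in "ACGT":
        extend(first, 0.0)
    return words


def search(sequence, words, length):
    """Yield (position, window) for each window compatible with a word."""
    for pos in range(len(sequence) - length + 1):
        window = sequence[pos:pos + length]
        if window in words:
            yield pos, window
        elif any(c not in "ACGT" for c in window):
            # Assumption: an ambiguous window matches if any expansion does.
            expansions = product(*(IUPAC.get(c, "") for c in window))
            if any("".join(e) in words for e in expansions):
                yield pos, window


# Toy di-PWM for a motif of length 3; all values are made up.
toy = [{"".join(p): 0.0 for p in product("ACGT", repeat=2)} for _ in range(2)]
toy[0]["AC"], toy[0]["AG"] = 2.0, 1.0
toy[1]["CG"], toy[1]["GG"] = 1.5, -1.0

words = enumerate_matching_words(toy, threshold=3.0)
for pos, window in search("TTACGTANGT", words, length=3):
    print(pos, window)
```

Enumerating the words first means the sequence scan never recomputes scores: each window only needs a membership test. The sketch expands ambiguous windows on the fly, which is simple but can be slow for windows with many ambiguity codes.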

AVAILABILITY AND IMPLEMENTATION

dipwmsearch is available at https://pypi.org/project/dipwmsearch/ and https://gite.lirmm.fr/rivals/dipwmsearch/ under the CeCILL license.


