Du Lianming, Sun Dalin, Chen Jiahao, Zhou Xinyi, Zhao Kelei, Zeng Qianglin, Yang Nan
Antibiotics Research and Re-Evaluation Key Laboratory of Sichuan Province, Institute for Advanced Study, Chengdu University, Chengdu, 610106, China.
Key Laboratory of Qinghai-Tibetan Plateau Animal Genetic Resource Reservation and Utilization, Sichuan Province and Ministry of Education, Southwest Minzu University, Chengdu, 610225, China.
BMC Bioinformatics. 2025 Jun 4;26(1):151. doi: 10.1186/s12859-025-06168-3.
BACKGROUND: Tandem repeats (TRs) are major sources of genetic variation and important genetic markers. Their expansions are involved in gene expression regulation and are associated with many nervous system diseases and cancers. However, an efficient tandem repeat identification tool that integrates seamlessly with larger bioinformatics programs written in the popular Python language is lacking. RESULTS: We introduce pytrf, a Python package for identifying both exact and approximate TRs in genomic sequences. It can be embedded directly in other Python programs or used in interactive Python environments and Jupyter notebooks. It also provides command line tools for finding tandem repeats in FASTA/Q files. Compared with other tools, pytrf has the shortest running time with comparable peak memory usage. CONCLUSIONS: Pytrf provides simple interfaces and command line tools that facilitate the identification of tandem repeats in genomic sequences. Pytrf can be installed from PyPI (https://pypi.org/project/pytrf) and the source code is freely available at https://github.com/lmdu/pytrf.
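To make "exact tandem repeat" concrete, the following is a minimal pure-Python sketch of a naive exact-repeat scan. It is an illustration only and is neither pytrf's algorithm nor its API: the names `Repeat` and `find_exact_repeats`, and the defaults `max_motif=6` and `min_copies=3`, are assumptions chosen for the example.

```python
from typing import Iterator, NamedTuple


class Repeat(NamedTuple):
    """One exact tandem repeat tract (hypothetical record type)."""
    start: int   # 0-based start of the tract
    end: int     # exclusive end of the tract
    motif: str   # repeated unit
    copies: int  # number of complete motif copies


def find_exact_repeats(seq: str, max_motif: int = 6,
                       min_copies: int = 3) -> Iterator[Repeat]:
    """Yield non-overlapping exact tandem repeats, preferring short motifs."""
    if min_copies < 2:
        raise ValueError("min_copies must be at least 2")
    seq = seq.upper()
    n = len(seq)
    i = 0
    while i < n:
        found = None
        for k in range(1, max_motif + 1):
            # Extend while each base equals the base k positions earlier,
            # i.e. while the sequence keeps repeating with period k.
            j = i + k
            while j < n and seq[j] == seq[j - k]:
                j += 1
            copies = (j - i) // k
            if copies >= min_copies:
                # Keep only complete copies of the motif.
                found = Repeat(i, i + copies * k, seq[i:i + k], copies)
                break  # the smallest period that qualifies wins
        if found:
            yield found
            i = found.end  # resume after the tract
        else:
            i += 1


if __name__ == "__main__":
    for hit in find_exact_repeats("GGCACACACACATTTTTTG"):
        print(hit.start, hit.end, hit.motif, hit.copies)
```

This scan costs roughly O(n × max_motif) in interpreted Python and ignores approximate repeats altogether, so it illustrates the concept rather than the performance that the abstract reports for pytrf.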