A high-precision genome size estimator based on the k-mer histogram correction.

Author information

Liao Xiangyu, Zhu Wufei, Liu Chaoyun

Affiliations

Department of Oncology, Yichang Central People's Hospital, The First College of Clinical Medical Science, China Three Gorges University, Yichang, China.

Department of Endocrinology, Yichang Central People's Hospital, The First College of Clinical Medical Science, China Three Gorges University, Yichang, China.

Publication information

Front Genet. 2024 Aug 22;15:1451730. doi: 10.3389/fgene.2024.1451730. eCollection 2024.

Abstract

INTRODUCTION

Various characteristics of next-generation sequencing datasets can be extracted through k-mer-based analysis. Among these, genome size (GS) is relatively easy to estimate, yet achieving satisfactory accuracy remains a challenge, especially in the presence of heterozygosity.

METHODS

In this study, we introduce GSET (Genome Size Estimation Tool), a high-precision genome size estimator based on k-mer histogram correction.

RESULTS

We evaluated GSET on both simulated and real datasets. The experimental results demonstrate that it estimates genome size more accurately than state-of-the-art tools. Notably, GSET also performs satisfactorily on heterozygous datasets, where other tools struggle to produce usable results.

DISCUSSION

The processing model of GSET diverges from the data-fitting models commonly used by similar tools. Instead, it is derived from empirical data and incorporates a correction term to mitigate the impact of sequencing errors on genome size estimation. GSET is freely available at https://github.com/Xingyu-Liao/GSET.
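
ILLUSTRATIVE SKETCH

To make concrete why sequencing errors matter for histogram-based estimation, the following is a minimal sketch of the textbook k-mer-spectrum estimate, not GSET's empirical model or its correction term, neither of which is described in this abstract. It assumes a histogram mapping each k-mer multiplicity to the number of distinct k-mers observed at that multiplicity. The function names find_valley and estimate_genome_size are hypothetical. Low-multiplicity k-mers, which are mostly sequencing errors, are discarded at the first valley of the histogram, and the remaining k-mer total is divided by the depth of the main peak.

```python
def find_valley(hist):
    """Return the first depth at which the histogram stops decreasing.

    hist maps multiplicity -> number of distinct k-mers (assumed non-empty).
    K-mers below this depth are treated as sequencing errors (an assumption
    of this sketch, not a rule taken from GSET).
    """
    max_depth = max(hist)
    depth = 1
    while depth < max_depth and hist.get(depth + 1, 0) < hist.get(depth, 0):
        depth += 1
    return depth


def estimate_genome_size(hist):
    """Naive estimate: (k-mer instances at or above the valley) / peak depth."""
    valley = find_valley(hist)
    solid = {d: n for d, n in hist.items() if d >= valley}
    peak_depth = max(solid, key=solid.get)           # depth of the main peak
    total_kmers = sum(d * n for d, n in solid.items())
    return total_kmers / peak_depth


# Synthetic histogram: an error tail at depths 1-3 and a genomic peak at 20.
example_hist = {1: 5000, 2: 800, 3: 100,
                18: 100, 19: 200, 20: 400, 21: 200, 22: 100}
print(estimate_genome_size(example_hist))  # 1000.0
```

In this synthetic example, leaving the error tail in would add 6,900 spurious k-mer instances to the 20,000 genomic ones, which is the kind of inflation a correction term is meant to counter. A single-peak model like this one also ignores heterozygosity, which splits the spectrum into more than one peak.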
