Department of Biomedical Informatics, Harvard Medical School, 10 Shattuck St, Boston, MA 02215, USA.

Department of Data Science, Dana-Farber Cancer Institute, 450 Brookline Ave, Boston, MA 02215, USA.

ArXiv. 2025 Aug 8:arXiv:2507.03718v2.

BACKGROUND

While benchmarks of short-read variant calling suggest a low error rate below 0.5%, they are only applicable to predefined confident regions. For a human sample without such regions, the error rate could be 10 times higher. Although multiple sets of easy regions have been identified to alleviate the issue, they fail to consider non-reference samples or are biased towards existing short-read data or aligners.

RESULTS

Here, using hundreds of high-quality human assemblies, we derived a set of sample-agnostic easy regions where short-read variant calling reaches high accuracy. These regions cover 88.2% of GRCh38, 92.2% of coding regions and 96.3% of ClinVar pathogenic variants. They achieve a good balance between coverage and easiness and can be generated for other human assemblies or species with multiple well-assembled genomes.
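Coverage figures of this kind are fractions of target bases that fall inside the easy regions. The sketch below is illustrative only and is not the authors' pipeline: it assumes both the easy regions and the target (for example, coding regions) are given as 0-based half-open intervals on a single chromosome, and the function names `merge_intervals` and `covered_fraction` are hypothetical.

```python
from typing import Iterable, List, Tuple

Interval = Tuple[int, int]  # 0-based, half-open [start, end); an assumed convention


def merge_intervals(intervals: Iterable[Interval]) -> List[Interval]:
    """Sort intervals and merge any that overlap or touch."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def covered_fraction(regions: Iterable[Interval], target: Iterable[Interval]) -> float:
    """Fraction of target bases that lie inside the given regions."""
    region_list = merge_intervals(regions)
    target_list = merge_intervals(target)
    total = sum(end - start for start, end in target_list)
    if total == 0:
        return 0.0
    covered = 0
    i = 0
    for t_start, t_end in target_list:
        # Skip regions that end before this target interval begins.
        while i < len(region_list) and region_list[i][1] <= t_start:
            i += 1
        j = i
        # Add the overlap with every region that starts before the target ends.
        while j < len(region_list) and region_list[j][0] < t_end:
            r_start, r_end = region_list[j]
            covered += min(t_end, r_end) - max(t_start, r_start)
            j += 1
    return covered / total
```

For a whole genome, the same calculation would be repeated per chromosome and the covered and total base counts summed before dividing.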

CONCLUSION

This resource provides a convenient and powerful way to filter spurious variant calls for clinical or research human samples.
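As a rough illustration of such filtering, the sketch below keeps only calls whose position falls inside an easy region. It rests on assumptions not stated in the abstract: the regions are supplied as BED-style records (0-based, half-open), calls carry a 1-based position as in VCF, and only the start position of each call is checked rather than its full reference span. The names `build_index`, `in_easy_region` and `filter_calls` are hypothetical.

```python
from bisect import bisect_right
from collections import defaultdict


def build_index(bed_records):
    """Build a per-chromosome index from (chrom, start, end) records.

    Intervals are assumed 0-based and half-open; overlapping ones are merged.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in bed_records:
        by_chrom[chrom].append((start, end))
    index = {}
    for chrom, intervals in by_chrom.items():
        merged = []
        for start, end in sorted(intervals):
            if merged and start <= merged[-1][1]:
                merged[-1][1] = max(merged[-1][1], end)
            else:
                merged.append([start, end])
        index[chrom] = ([s for s, _ in merged], [e for _, e in merged])
    return index


def in_easy_region(index, chrom, pos):
    """Return True if a 1-based position lies inside an indexed region."""
    if chrom not in index:
        return False
    starts, ends = index[chrom]
    pos0 = pos - 1  # convert the 1-based position to 0-based
    k = bisect_right(starts, pos0) - 1  # last interval starting at or before pos0
    return k >= 0 and pos0 < ends[k]


def filter_calls(index, calls):
    """Keep calls, given as tuples beginning (chrom, pos, ...), inside easy regions."""
    return [call for call in calls if in_easy_region(index, call[0], call[1])]
```

In practice an established interval tool would usually be preferable; the point here is only the coordinate handling, where the 0-based versus 1-based conversion is the most common source of off-by-one errors.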

Finding easy regions for short-read variant calling from pangenome data.
