GRIEVOUS: your command-line general for resolving cross-dataset genotype inconsistencies.

Affiliations

Division of Medical Genetics, Department of Medicine, University of California San Diego, La Jolla, CA 92093, United States.

Bioinformatics and Systems Biology Program, University of California San Diego, La Jolla, CA 92093, United States.

Publication information

Bioinformatics. 2024 Aug 2;40(8). doi: 10.1093/bioinformatics/btae489.

Abstract

SUMMARY

Harmonizing variant indexing and allele assignments across datasets is crucial for data integrity in cross-dataset studies such as multi-cohort genome-wide association studies, meta-analyses, and the development, validation, and application of polygenic risk scores. Ensuring this indexing and allele consistency is a laborious, time-consuming, and error-prone process requiring a certain degree of computational proficiency. Here, we introduce GRIEVOUS, a command-line tool for cross-dataset variant homogenization. By means of an internal database and a custom indexing methodology, GRIEVOUS identifies, formats, and aligns all biallelic single nucleotide polymorphisms (SNPs) across all summary statistic and genotype files of interest. Upon completion of dataset harmonization, GRIEVOUS can also be used to extract the maximal set of biallelic SNPs common to all datasets.
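To make the allele-alignment problem concrete, the following is a minimal Python sketch of how one biallelic SNP from a dataset might be compared against a reference allele pair, and how the SNPs common to several datasets might be collected. It is an illustration only and does not reflect GRIEVOUS's internal database, indexing methodology, or command-line interface. All function names, the status labels, and the convention that the second allele is the effect allele are assumptions made for this example.

```python
# Illustrative sketch only; NOT GRIEVOUS's implementation or API.
# All names and conventions below are hypothetical.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def flip_strand(allele):
    """Return the complementary base for a single-nucleotide allele."""
    return COMPLEMENT[allele]


def classify_snp(ref, alt, a1, a2):
    """Compare a dataset allele pair (a1, a2) with a reference pair (ref, alt).

    Returns "match", "swap", "flip", "flip_swap", "ambiguous", or "mismatch".
    """
    alleles = (ref, alt, a1, a2)
    # Anything other than two distinct single bases is not a biallelic SNP.
    if any(a not in COMPLEMENT for a in alleles) or ref == alt or a1 == a2:
        return "mismatch"
    # Palindromic pairs (A/T, C/G) look the same on both strands, so the
    # strand cannot be inferred from the alleles alone.
    if COMPLEMENT[ref] == alt:
        return "ambiguous"
    if (a1, a2) == (ref, alt):
        return "match"
    if (a1, a2) == (alt, ref):
        return "swap"
    f1, f2 = flip_strand(a1), flip_strand(a2)
    if (f1, f2) == (ref, alt):
        return "flip"
    if (f1, f2) == (alt, ref):
        return "flip_swap"
    return "mismatch"


def harmonize_effect(beta, status):
    """Orient an effect size to the reference alt allele (assumed convention:
    beta refers to a2). Returns None when the SNP cannot be aligned."""
    if status in ("match", "flip"):
        return beta
    if status in ("swap", "flip_swap"):
        return -beta
    return None


def common_snps(*datasets):
    """Return the (chrom, pos, ref, alt) keys present in every dataset."""
    return set.intersection(*(set(d) for d in datasets))
```

In this sketch, a swapped allele pair keeps the SNP but negates its effect size, while palindromic and non-matching pairs are set aside rather than guessed at. How such cases are actually resolved is a design decision of the tool in question.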

AVAILABILITY AND IMPLEMENTATION

GRIEVOUS and all supporting documentation and tutorials can be found at https://github.com/jvtalwar/GRIEVOUS. It is freely and publicly available under the MIT license and can be installed via pip.
