StrandScript: evaluation of Illumina genotyping array design and strand correction.

Author information

Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville TN, USA 37232.

Department of Molecular Physiology and Biophysics, Vanderbilt University, Nashville TN, USA, 37232.

Publication information

Bioinformatics. 2017 Aug 1;33(15):2399-2401. doi: 10.1093/bioinformatics/btx186.

DOI:10.1093/bioinformatics/btx186

PMID:28402386

Abstract

SUMMARY

After the introduction of high-throughput sequencing, genotyping arrays continue to be a viable source for conducting large-scale genetic studies. Currently, Illumina is one of the largest genotyping array manufacturers. One technical issue that has always plagued the post-processing of Illumina genotyping array data is the strand definition. Against convention, Illumina uses their own definition of strand, which is inconsistent with the standard reference forward and reverse definition. This issue has been a major obstacle in the consistency of reporting, meta-analysis and correct interpretation of phenotype association results. To date, the strand issue has not been adequately addressed, prompting us to develop StrandScript, a tool that can convert all genotyping data generated from Illumina genotyping arrays to the reference forward strand. StrandScript works independently of the Illumina array version and is future proof for newer Illumina array designs. Furthermore, StrandScript can examine an Illumina genotyping array manifest file and can detect all problematic SNPs, including SNPs with wrong RS ID and SNPs with mismatched probe sequences. Here, we introduce StrandScript's design and development, and demonstrate its effectiveness using real genotyping data.
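The core operation behind converting calls to the reference forward strand can be illustrated with a minimal sketch. This is not StrandScript's code or interface: the function name `to_forward` and the `"+"`/`"-"` strand labels are hypothetical, and the sketch assumes each SNP's orientation relative to the reference genome has already been determined (for example, by aligning its probe sequence).

```python
# Hypothetical illustration only; not taken from StrandScript.
# Maps each nucleotide to its complement; any other character is left unchanged.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def to_forward(genotype: str, ref_strand: str) -> str:
    """Express a genotype call on the reference forward strand.

    genotype   -- allele calls such as "AG"; characters other than A, C, G, T
                  (e.g. "-" for a no-call) pass through unchanged
    ref_strand -- assumed label: "+" if the call is already on the forward
                  strand, "-" if it is on the reverse strand
    """
    if ref_strand == "+":
        return genotype
    if ref_strand == "-":
        # A reverse-strand call is converted by complementing each allele.
        return genotype.translate(COMPLEMENT)
    raise ValueError(f"unknown strand label: {ref_strand!r}")


print(to_forward("AG", "-"))  # complemented alleles
print(to_forward("AG", "+"))  # already forward, returned as-is
```

Complementing is the easy part. For A/T and C/G SNPs the complemented alleles are the same pair as the original, so the strand cannot be inferred from the alleles alone and has to come from another source, such as the probe sequence.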

AVAILABILITY AND IMPLEMENTATION

https://github.com/seasky002002/Strandscript.

CONTACT

yan.guo.1@vanderbilt.edu.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
