Fukuda Yoko, Nakahara Yasuo, Date Hidetoshi, Takahashi Yuji, Goto Jun, Miyashita Akinori, Kuwano Ryozo, Adachi Hiroki, Nakamura Eiji, Tsuji Shoji
Department of Neurology, Graduate School of Medicine, the University of Tokyo, Tokyo, Japan.
BMC Bioinformatics. 2009 Apr 24;10:121. doi: 10.1186/1471-2105-10-121.
Over the past decade, microarray-based single nucleotide polymorphism (SNP) data have become widely used as markers for linkage analysis to identify loci for disease-associated genes. Although microarray-based SNP analyses have markedly reduced genotyping time and cost compared with microsatellite-based analyses, feeding these very large datasets into linkage analysis programs is time-consuming, which calls for a high-throughput platform.
We have developed SNP HiTLink (SNP High Throughput Linkage analysis system). In this system, SNP chip data from the Affymetrix Mapping 100 k/500 k array set and the Genome-Wide Human SNP array 5.0/6.0 can be imported directly and passed to parametric or model-free linkage analysis programs: MLINK, Superlink, Merlin, and Allegro. Various marker-selection functions are implemented to exclude data with typing errors, to avoid markers in linkage disequilibrium, and to select informative markers.
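As an illustration of what marker selection of this kind can look like, the following is a minimal sketch and not SNP HiTLink's actual implementation. The `Marker` fields, the function name `select_markers`, and all thresholds are hypothetical. Minimum inter-marker spacing is used here only as a crude stand-in for avoiding markers in linkage disequilibrium; the abstract does not state which criteria the system applies.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Marker:
    # Hypothetical per-SNP summary; field names are assumptions for this sketch.
    name: str
    chrom: str
    pos_bp: int       # physical position in base pairs
    call_rate: float  # fraction of samples with a genotype call
    maf: float        # minor allele frequency


def select_markers(
    markers: List[Marker],
    min_call_rate: float = 0.95,    # assumed threshold: drop poorly typed SNPs
    min_maf: float = 0.10,          # assumed threshold: keep informative SNPs
    min_spacing_bp: int = 100_000,  # assumed spacing: crude proxy for low LD
) -> List[Marker]:
    """Return markers passing call-rate, MAF, and spacing filters."""
    kept: List[Marker] = []
    last_kept_pos: Dict[str, int] = {}  # last accepted position per chromosome
    for m in sorted(markers, key=lambda m: (m.chrom, m.pos_bp)):
        if m.call_rate < min_call_rate:
            continue  # likely to carry typing errors or missing data
        if m.maf < min_maf:
            continue  # nearly monomorphic, so little linkage information
        prev = last_kept_pos.get(m.chrom)
        if prev is not None and m.pos_bp - prev < min_spacing_bp:
            continue  # too close to the previously kept marker
        kept.append(m)
        last_kept_pos[m.chrom] = m.pos_bp
    return kept


example = [
    Marker("rs1", "1", 1_000, 0.99, 0.30),
    Marker("rs2", "1", 50_000, 0.99, 0.40),   # too close to rs1
    Marker("rs3", "1", 200_000, 0.80, 0.40),  # low call rate
    Marker("rs4", "1", 300_000, 0.99, 0.02),  # low minor allele frequency
    Marker("rs5", "1", 400_000, 0.99, 0.25),
    Marker("rs6", "2", 1_000, 0.99, 0.45),
]
selected_names = [m.name for m in select_markers(example)]
print(selected_names)
```

In practice, a filter based on measured pairwise linkage disequilibrium (for example, r²) would be a more direct criterion than physical spacing; spacing is used here only to keep the sketch self-contained.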
With the 100 k SNP dataset, the LOD scores obtained were comparable or even superior to those obtained with microsatellite markers. An ordinary personal computer is sufficient to run the process, as the runtime for whole-genome analysis was less than a few hours. The system can be widely applied to linkage analysis of microarray-based SNP data, enabling high-throughput and reliable linkage analysis.
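For readers unfamiliar with the measure, the LOD score referred to above can be written in its standard two-point form. This is the textbook definition, not a formula taken from this paper, and the worked numbers are illustrative only:

```latex
% Two-point LOD score at recombination fraction \theta
Z(\theta) = \log_{10} \frac{L(\theta)}{L(\theta = 1/2)}

% Illustrative phase-known case: n informative meioses, k recombinants
L(\theta) = \theta^{k} (1 - \theta)^{n - k}

% Example with n = 10 and k = 0, evaluated at \theta = 0
Z(0) = \log_{10} \frac{1}{(1/2)^{10}} = 10 \log_{10} 2 \approx 3.01
```

A LOD score of 3 or more is the conventional threshold for declaring significant linkage.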