Lee Sugi, Jung Minah, Jung Jaeeun, Park Kunhyang, Ryu Jea-Woon, Kim Jeongkil, Kim Dae-Soo
Department of Bioinformatics, KRIBB School of Bioscience, Korea University of Science and Technology (UST), Daejeon, Korea.
Department of Rare Disease Research Center, Korea Research Institute of Bioscience & Biotechnology (KRIBB), Daejeon, Korea.
PLoS One. 2017 Sep 28;12(9):e0185514. doi: 10.1371/journal.pone.0185514. eCollection 2017.
Whole-exome sequencing (WES) can identify causative mutations in hereditary diseases. However, WES data can yield a large list of candidate variants that includes false positives. In family studies, selecting disease-associated variants is even more difficult because many variants are shared among family members. To reduce false positives and extract accurate candidates, we used multilocus variants instead of single-locus variants (single-nucleotide variants, SNVs). We defined a fixed window in which to analyze each multilocus variant and devised a sliding-window approach so that all variants are examined. On this basis we developed the gene selection tool (GST), which applies proportion tests to WES data for linkage analysis. The tool is implemented in R and has high sensitivity. We tested it on SNVs from a family with hereditary spastic paraplegia, and the gene known to cause the disease appeared in the resulting list of significant genes. The list also contained other genes that might be associated with the disease.
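To make the idea of a sliding window combined with a proportion test concrete, here is a minimal Python sketch. It is not the GST implementation (which is written in R and is not shown in this abstract). The function names, the 0/1 carrier encoding, the window size, and the pooled two-proportion z-test are all assumptions chosen for illustration, and the genotypes are toy values, not data from the study.

```python
import math


def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-sided two-proportion z-test. Returns (z, p_value).

    Assumption: a plain z-test stands in for whatever proportion test
    the actual tool uses.
    """
    p_pool = (x1 + x2) / (n1 + n2)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    if se == 0:
        # Both groups are all-0 or all-1: no evidence of a difference.
        return 0.0, 1.0
    z = (x1 / n1 - x2 / n2) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


def sliding_window_scan(affected, unaffected, window=3, step=1):
    """Scan ordered variants with a fixed-size window.

    affected / unaffected: one list per individual, holding 0/1 carrier
    flags for the same ordered variants (hypothetical encoding).
    Returns a list of (start, end, z, p_value) tuples, one per window.
    """
    n_variants = len(affected[0])
    results = []
    for start in range(0, n_variants - window + 1, step):
        end = start + window
        # Count carried alleles inside the window for each group.
        x1 = sum(sum(ind[start:end]) for ind in affected)
        x2 = sum(sum(ind[start:end]) for ind in unaffected)
        n1 = len(affected) * window
        n2 = len(unaffected) * window
        z, p_value = two_proportion_z(x1, n1, x2, n2)
        results.append((start, end, z, p_value))
    return results


# Toy family: two affected and two unaffected members, five variants.
affected = [[1, 1, 1, 0, 0], [1, 1, 1, 0, 1]]
unaffected = [[0, 0, 0, 0, 1], [0, 0, 1, 0, 0]]
scan = sliding_window_scan(affected, unaffected, window=3)
best = min(scan, key=lambda r: r[3])  # window with the smallest p-value
```

In this toy input the first window, where both affected members carry all three variants, is the one that stands out. A real analysis would additionally need to map windows to genes and to correct for testing many windows, neither of which is attempted here.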