Choosing SNPs using feature selection.

Authors

Phuong Tu Minh, Lin Zhen, Altman Russ B

Affiliation

Department of Information Technology, Post & Telecom. Institute of Technology, Hanoi, Vietnam.

Publication information

Proc IEEE Comput Syst Bioinform Conf. 2005:301-9. doi: 10.1109/csb.2005.22.

DOI:10.1109/csb.2005.22

PMID:16447987

Abstract

A major challenge for genome-wide disease association studies is the high cost of genotyping large numbers of single nucleotide polymorphisms (SNPs). The correlations between SNPs, however, make it possible to select a parsimonious set of informative SNPs, known as "tagging" SNPs, that captures most of the variation in a population. Considerable research interest has recently focused on developing methods for finding such SNPs. In this paper, we present an efficient method for finding tagging SNPs. The method does not involve a computationally intensive search over SNP subsets; instead, it discards redundant SNPs using a feature selection algorithm. In contrast to most existing methods, the method presented here is not limited to correlations between SNPs in local groups. By using correlations that occur across different chromosomal regions, the method can reduce the number of globally redundant SNPs. Experimental results show that our method selects fewer tagging SNPs than block-based methods.
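The abstract does not name the feature selection algorithm used, so the following is only a minimal sketch of the general idea of discarding redundant SNPs by correlation, not the authors' method. It assumes genotypes coded as 0/1/2 allele counts, uses squared Pearson correlation as the redundancy measure, and uses a hypothetical cutoff `r2_threshold=0.8`; the function name `select_tag_snps` is likewise illustrative. Every SNP is compared against all SNPs kept so far, regardless of position, which mirrors the abstract's point about using correlations beyond local groups.

```python
import numpy as np


def select_tag_snps(genotypes, r2_threshold=0.8):
    """Greedy redundancy filter (illustrative, not the paper's algorithm).

    genotypes: (n_samples, n_snps) array of 0/1/2 allele counts.
    A SNP is kept only if its squared Pearson correlation with every
    already-kept SNP is below r2_threshold.
    Returns the kept column indices in their original order.
    """
    g = np.asarray(genotypes, dtype=float)
    std = g.std(axis=0)
    # Monomorphic SNPs (zero variance) carry no information; skip them.
    candidates = np.flatnonzero(std > 0)
    # Standardize the remaining columns, then form all pairwise r^2 values.
    z = (g[:, candidates] - g[:, candidates].mean(axis=0)) / std[candidates]
    r2 = (z.T @ z / g.shape[0]) ** 2

    kept = []  # positions within `candidates`
    for j in range(len(candidates)):
        # Compare against every kept SNP, not only neighbouring ones.
        if all(r2[j, k] < r2_threshold for k in kept):
            kept.append(j)
    return [int(candidates[j]) for j in kept]


# Toy data: SNP 1 duplicates SNP 0, SNP 2 is its mirror image (2 - SNP 0),
# and SNP 3 is only moderately correlated with SNP 0 (r^2 = 0.5625).
genotypes = np.array([
    [0, 0, 2, 1],
    [1, 1, 1, 0],
    [2, 2, 0, 2],
    [0, 0, 2, 0],
    [1, 1, 1, 1],
    [2, 2, 0, 2],
])
print(select_tag_snps(genotypes))  # [0, 3]
```

In this toy example, SNPs 1 and 2 are dropped because SNP 0 already predicts them perfectly, while SNP 3 is retained. The full pairwise matrix grows quadratically with the number of SNPs, so a genome-scale run would need a more economical comparison strategy than this sketch uses.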

Similar articles

Choosing SNPs using feature selection.

Proc IEEE Comput Syst Bioinform Conf. 2005:301-9. doi: 10.1109/csb.2005.22.

Choosing SNPs using feature selection.

J Bioinform Comput Biol. 2006 Apr;4(2):241-57. doi: 10.1142/s0219720006001941.

Informative SNP selection methods based on SNP prediction.

IEEE Trans Nanobioscience. 2007 Mar;6(1):60-7. doi: 10.1109/tnb.2007.891901.

An efficient comprehensive search algorithm for tagSNP selection using linkage disequilibrium criteria.

Bioinformatics. 2006 Jan 15;22(2):220-5. doi: 10.1093/bioinformatics/bti762. Epub 2005 Nov 3.

Effective algorithms for tag SNP selection.

J Bioinform Comput Biol. 2005 Oct;3(5):1089-106. doi: 10.1142/s0219720005001521.

BNTagger: improved tagging SNP selection using Bayesian networks.

Bioinformatics. 2006 Jul 15;22(14):e211-9. doi: 10.1093/bioinformatics/btl233.

FastTagger: an efficient algorithm for genome-wide tag SNP selection using multi-marker linkage disequilibrium.

BMC Bioinformatics. 2010 Jan 29;11:66. doi: 10.1186/1471-2105-11-66.

Snagger: a user-friendly program for incorporating additional information for tagSNP selection.

BMC Bioinformatics. 2008 Mar 27;9:174. doi: 10.1186/1471-2105-9-174.

LdCompare: rapid computation of single- and multiple-marker r2 and genetic coverage.

Bioinformatics. 2007 Jan 15;23(2):252-4. doi: 10.1093/bioinformatics/btl574. Epub 2006 Dec 5.

Effective selection of informative SNPs and classification on the HapMap genotype data.

BMC Bioinformatics. 2007 Dec 20;8:484. doi: 10.1186/1471-2105-8-484.

Cited by

Fair molecular feature selection unveils universally tumor lineage-informative methylation sites in colorectal cancer.

Bioinformatics. 2025 Jul 1;41(Supplement_1):i150-i159. doi: 10.1093/bioinformatics/btaf237.

Placenta Accreta Spectrum and Hysterectomy Prediction Using MRI Radiomic Features.

Proc SPIE Int Soc Opt Eng. 2022 Feb-Mar;12033. doi: 10.1117/12.2611587. Epub 2022 Apr 4.

Detecting Aggressive Papillary Thyroid Carcinoma Using Hyperspectral Imaging and Radiomic Features.

Proc SPIE Int Soc Opt Eng. 2022 Feb-Mar;12033. doi: 10.1117/12.2611842. Epub 2022 Apr 4.

Utilising Flow Aggregation to Classify Benign Imitating Attacks.

Sensors (Basel). 2021 Mar 4;21(5):1761. doi: 10.3390/s21051761.

Feature Selection Stability and Accuracy of Prediction Models for Genomic Prediction of Residual Feed Intake in Pigs Using Machine Learning.

Front Genet. 2021 Feb 22;12:611506. doi: 10.3389/fgene.2021.611506. eCollection 2021.

A data driven methodology for social science research with left-behind children as a case study.

PLoS One. 2020 Nov 20;15(11):e0242483. doi: 10.1371/journal.pone.0242483. eCollection 2020.

WERFE: A Gene Selection Algorithm Based on Recursive Feature Elimination and Ensemble Strategy.

Front Bioeng Biotechnol. 2020 May 28;8:496. doi: 10.3389/fbioe.2020.00496. eCollection 2020.

Big data in IBD: big progress for clinical practice.

Gut. 2020 Aug;69(8):1520-1532. doi: 10.1136/gutjnl-2019-320065. Epub 2020 Feb 28.

CT radiomics may predict the grade of pancreatic neuroendocrine tumors: a multicenter study.

Eur Radiol. 2019 Dec;29(12):6880-6890. doi: 10.1007/s00330-019-06176-x. Epub 2019 Jun 21.

Application of a spatially-weighted Relief algorithm for ranking genetic predictors of disease.

BioData Min. 2012 Dec 3;5(1):20. doi: 10.1186/1756-0381-5-20.
