Hubley Robert M, Zitzler Eckart, Roach Jared C
Institute for Systems Biology, Seattle, WA, USA.
BMC Bioinformatics. 2003 Jul 23;4:30. doi: 10.1186/1471-2105-4-30.
Large databases of single nucleotide polymorphisms (SNPs) are available for use in genomics studies. Typically, investigators must choose a subset of SNPs from these databases to employ in their studies. The choice of subset is influenced by many factors, including estimated or known reliability of the SNP, biochemical factors, intellectual property, cost, and effectiveness of the subset for mapping genes or identifying disease loci. We present an evolutionary algorithm for multiobjective SNP selection.
We implemented a modified version of the Strength-Pareto Evolutionary Algorithm (SPEA2) in Java. Our implementation, Multiobjective Analyzer for Genetic Marker Acquisition (MAGMA), approximates the set of optimal trade-off solutions for large problems in minutes. This set supports the design of large studies, including those oriented towards disease identification, genetic mapping, population studies, and haplotype-block elucidation.
Evolutionary algorithms are particularly suited to optimization problems that involve multiple objectives and a complex search space to which exact methods such as exhaustive enumeration cannot be applied. They also remain flexible when the problem formulation evolves or changes. Results are produced as a trade-off front, allowing the user to make informed decisions when prioritizing factors. MAGMA is open source and available at http://snp-magma.sourceforge.net. Evolutionary algorithms are well suited for many other applications in genomics.
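The trade-off front mentioned above rests on Pareto dominance: one candidate SNP subset dominates another if it is no worse in every objective and strictly better in at least one. The following is a minimal Java sketch of that filtering step only, not MAGMA's code and not the full SPEA2 procedure. The class name, the method names, and the two objectives (cost and mean gap between markers, both minimized) are hypothetical and chosen purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class ParetoFrontSketch {

    /**
     * Returns true if a dominates b: a is no worse than b in every
     * objective and strictly better in at least one. All objectives
     * are assumed to be minimized (an assumption of this sketch).
     */
    public static boolean dominates(double[] a, double[] b) {
        boolean strictlyBetter = false;
        for (int k = 0; k < a.length; k++) {
            if (a[k] > b[k]) {
                return false;          // worse in objective k
            }
            if (a[k] < b[k]) {
                strictlyBetter = true; // better in objective k
            }
        }
        return strictlyBetter;
    }

    /** Returns the indices of candidates that no other candidate dominates. */
    public static List<Integer> nonDominated(double[][] objectives) {
        List<Integer> front = new ArrayList<>();
        for (int i = 0; i < objectives.length; i++) {
            boolean dominated = false;
            for (int j = 0; j < objectives.length && !dominated; j++) {
                if (j != i && dominates(objectives[j], objectives[i])) {
                    dominated = true;
                }
            }
            if (!dominated) {
                front.add(i);
            }
        }
        return front;
    }

    public static void main(String[] args) {
        // Hypothetical candidate SNP subsets, scored as
        // {cost, mean gap between markers}; both are minimized.
        double[][] candidates = {
            {10.0, 5.0},
            {12.0, 3.0},
            {11.0, 6.0}, // dominated by candidate 0
            {20.0, 1.0}
        };
        System.out.println(nonDominated(candidates)); // prints [0, 1, 3]
    }
}
```

The pairwise comparison is quadratic in the number of candidates, which is acceptable for a sketch. The three surviving candidates are the trade-off front: none can be improved in one objective without giving ground in the other, so the choice among them is left to the investigator.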