Chicco Davide
Princess Margaret Cancer Centre, PMCR Tower 11-401, 101 College Street, Toronto, Ontario, M5G 1L7 Canada.
BioData Min. 2017 Dec 8;10:35. doi: 10.1186/s13040-017-0155-3. eCollection 2017.
Machine learning has become a pivotal tool for many projects in computational biology, bioinformatics, and health informatics. Nevertheless, beginners and biomedical researchers often lack the experience needed to run a data mining project effectively, and may therefore follow incorrect practices that lead to common mistakes or over-optimistic results. In this review, we present ten quick tips for taking advantage of machine learning in any computational biology context while avoiding common errors that we have observed hundreds of times across multiple bioinformatics projects. We believe these ten suggestions can greatly help any machine learning practitioner carry out a successful project in computational biology and related sciences.