Oneto Luca, Chicco Davide
Dipartimento di Informatica Bioingegneria Robotica e Ingegneria dei Sistemi, Università di Genova, Genoa, Italy.
Dipartimento di Informatica Sistemistica e Comunicazione, Università di Milano-Bicocca, Milan, Italy.
PLoS Comput Biol. 2025 Jan 9;21(1):e1012711. doi: 10.1371/journal.pcbi.1012711. eCollection 2025 Jan.
Machine learning has become a powerful tool for computational analysis in the biomedical sciences, with its effectiveness significantly enhanced by integrating domain-specific knowledge. This integration has given rise to informed machine learning, in contrast to studies that lack domain knowledge and treat all variables equally (uninformed machine learning). While the application of informed machine learning to bioinformatics and health informatics datasets has become more seamless, the likelihood of errors has also increased. To address this drawback, we present eight guidelines outlining best practices for employing informed machine learning methods in biomedical sciences. These quick tips offer recommendations on various aspects of informed machine learning analysis, aiming to assist researchers in generating more robust, explainable, and dependable results. Although we originally crafted these eight simple suggestions for novices, we believe they are relevant for expert computational researchers as well.