A transfer learning approach based on random forest with application to breast cancer prediction in underrepresented populations.

Author information

Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA.

Publication information

Pac Symp Biocomput. 2023;28:186-197.

Abstract

Recent large-scale biobanks have collected high-quality, data-rich samples, but participants from minority and disadvantaged groups remain underrepresented. This limits the use of biobank data for developing disease risk prediction models that generalize to diverse populations, and it may exacerbate existing health disparities. This study addresses this challenge by proposing a transfer learning framework based on random forest models (TransRF). TransRF can incorporate risk prediction models trained in a source population to improve prediction performance in a target underrepresented population with limited sample size. TransRF is an ensemble of multiple transfer learning approaches, each covering a particular type of similarity between the source and target populations, and the ensemble is shown to be robust and applicable across a broad spectrum of scenarios. Using extensive simulation studies, we demonstrate the superior performance of TransRF compared with several benchmark approaches across different data-generating mechanisms. We illustrate the feasibility of TransRF by applying it to build breast cancer risk assessment models for African-ancestry women and South Asian women, respectively, using UK Biobank data.
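The abstract gives no algorithmic details, so the following is only a minimal sketch of the general idea, not the authors' TransRF procedure. It assumes scikit-learn, simulated toy data, and three hypothetical candidate models (a source-only forest, a target-only forest, and a target forest that takes the source model's predicted risk as an extra feature). These are combined by a logistic-regression stacker fitted on held-out target data. All names, sample sizes, and hyperparameters are illustrative assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

def simulate(n, shift):
    """Toy binary-outcome data; `shift` perturbs the coefficients (assumption)."""
    X = rng.normal(size=(n, 5))
    beta = np.array([1.0, -1.0, 0.5, 0.0, 0.0]) + shift
    p = 1.0 / (1.0 + np.exp(-(X @ beta)))
    return X, rng.binomial(1, p)

# Large source population, small target population (sizes are illustrative).
X_src, y_src = simulate(5000, shift=0.0)
X_tgt, y_tgt = simulate(300, shift=0.2)

# Hold out part of the target sample to fit the ensemble weights.
X_tr, X_val, y_tr, y_val = train_test_split(
    X_tgt, y_tgt, test_size=0.5, random_state=0, stratify=y_tgt
)

def make_rf():
    return RandomForestClassifier(
        n_estimators=200, min_samples_leaf=5, random_state=0
    )

# Candidate 1: forest trained on the source population only.
rf_src = make_rf().fit(X_src, y_src)

# Candidate 2: forest trained on the target training split only.
rf_tgt = make_rf().fit(X_tr, y_tr)

def augment(X):
    """Append the source model's predicted risk as an extra feature."""
    return np.column_stack([X, rf_src.predict_proba(X)[:, 1]])

# Candidate 3: target forest that can use the source-predicted risk.
rf_aug = make_rf().fit(augment(X_tr), y_tr)

def base_preds(X):
    """Predicted risks from the three candidates, one column each."""
    return np.column_stack([
        rf_src.predict_proba(X)[:, 1],
        rf_tgt.predict_proba(X)[:, 1],
        rf_aug.predict_proba(augment(X))[:, 1],
    ])

# Ensemble step: learn how to weight the candidates on held-out target data.
stacker = LogisticRegression().fit(base_preds(X_val), y_val)

def predict_risk(X):
    """Ensemble risk prediction for new target-population samples."""
    return stacker.predict_proba(base_preds(X))[:, 1]
```

Fitting the stacker on a held-out target split, rather than on the data used to train the target forests, avoids rewarding candidates that merely overfit the small target sample. With very small targets, cross-validated predictions would be a reasonable alternative.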

