
Synthetic data for design and evaluation of binary classifiers in the context of Bayesian transfer learning.

Authors

Maddouri Omar, Qian Xiaoning, Alexander Francis J, Dougherty Edward R, Yoon Byung-Jun

Affiliations

Department of Electrical and Computer Engineering, Texas A&M University, College Station TX 77843, USA.

Computational Science Initiative, Brookhaven National Laboratory, Upton NY 11973, USA.

Publication information

Data Brief. 2022 Apr 2;42:108113. doi: 10.1016/j.dib.2022.108113. eCollection 2022 Jun.

DOI:10.1016/j.dib.2022.108113
PMID:35434232
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9011006/
Abstract

Transfer learning (TL) techniques can enable effective learning in data scarce domains by allowing one to re-purpose data or scientific knowledge available in relevant source domains for predictive tasks in a target domain of interest. In this Data in Brief article, we present a synthetic dataset for binary classification in the context of Bayesian transfer learning, which can be used for the design and evaluation of TL-based classifiers. For this purpose, we consider numerous combinations of classification settings, based on which we simulate a diverse set of feature-label distributions with varying learning complexity. For each set of model parameters, we provide a pair of target and source datasets that have been jointly sampled from the underlying feature-label distributions in the target and source domains, respectively. For both target and source domains, the data in a given class and domain are normally distributed, where the distributions across domains are related to each other through a joint prior. To ensure the consistency of the classification complexity across the provided datasets, we have controlled the Bayes error such that it is maintained within a range of predefined values that mimic realistic classification scenarios across different relatedness levels. The provided datasets may serve as useful resources for designing and benchmarking transfer learning schemes for binary classification as well as the estimation of classification error.
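The sampling scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' generator: the abstract does not specify the form of the joint prior or how the Bayes error is controlled, so the correlated-Gaussian prior on class means, the identity covariance, the rejection step, and every function and parameter name below (`draw_means`, `sample_pair`, `rho`, `spread`, `delta`, `err_range`) are assumptions made purely for illustration. The one standard fact used is that, for two equally likely Gaussian classes sharing a covariance matrix, the Bayes error equals Φ(−Δ/2), where Δ is the Mahalanobis distance between the class means.

```python
import numpy as np
from math import erf, sqrt


def std_normal_cdf(x):
    """Standard normal CDF, written with math.erf to avoid extra dependencies."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def bayes_error_equal_cov(mu0, mu1, cov):
    """Bayes error of two equally likely Gaussian classes with a shared covariance."""
    diff = np.asarray(mu1, dtype=float) - np.asarray(mu0, dtype=float)
    maha = sqrt(float(diff @ np.linalg.solve(cov, diff)))  # Mahalanobis distance
    return std_normal_cdf(-maha / 2.0)


def draw_means(rng, d, rho, spread, delta):
    """ASSUMED joint prior: per class, the target and source mean deviations
    are standard normal vectors with correlation rho (the relatedness level)."""
    means = {"target": [], "source": []}
    for sign in (-1.0, 1.0):  # class 0, then class 1
        center = np.zeros(d)
        center[0] = sign * delta / 2.0
        z_t = rng.standard_normal(d)
        z_s = rho * z_t + sqrt(1.0 - rho**2) * rng.standard_normal(d)
        means["target"].append(center + spread * z_t)
        means["source"].append(center + spread * z_s)
    return means


def sample_pair(rng, d=2, n_target=10, n_source=200, rho=0.8,
                spread=0.5, delta=2.0, err_range=(0.1, 0.3), max_tries=1000):
    """Draw one (target, source) dataset pair whose target-domain Bayes error
    lies inside err_range. Rejection sampling is an ASSUMED control mechanism."""
    cov = np.eye(d)  # assumed identity covariance in both domains
    for _ in range(max_tries):
        means = draw_means(rng, d, rho, spread, delta)
        err = bayes_error_equal_cov(means["target"][0], means["target"][1], cov)
        if err_range[0] <= err <= err_range[1]:
            break
    else:
        raise RuntimeError("no parameter draw met the Bayes error range")

    data = {}
    for domain, n in (("target", n_target), ("source", n_source)):
        # n samples per class, class 0 rows first
        X = np.vstack([rng.multivariate_normal(m, cov, size=n)
                       for m in means[domain]])
        y = np.repeat([0, 1], n)
        data[domain] = (X, y)
    return data, err


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data, err = sample_pair(rng)
    print("target shape:", data["target"][0].shape)
    print("source shape:", data["source"][0].shape)
    print("target Bayes error:", round(err, 4))
```

In this sketch a small target sample is paired with a much larger source sample, mirroring the data-scarce setting the abstract describes. Raising `rho` toward 1 makes the source means track the target means more closely, which is one simple way to vary relatedness between domains.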

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ede3/9011006/018ca4e3e707/gr1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ede3/9011006/e4ed7b2bf762/gr2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ede3/9011006/139da91bcbd6/gr3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ede3/9011006/f24a88338797/gr4.jpg

Similar articles

1
Synthetic data for design and evaluation of binary classifiers in the context of Bayesian transfer learning.
Data Brief. 2022 Apr 2;42:108113. doi: 10.1016/j.dib.2022.108113. eCollection 2022 Jun.
2
Robust importance sampling for error estimation in the context of optimal Bayesian transfer learning.
Patterns (N Y). 2022 Jan 25;3(3):100428. doi: 10.1016/j.patter.2021.100428. eCollection 2022 Mar 11.
3
Optimal Bayesian Transfer Learning for Count Data.
IEEE/ACM Trans Comput Biol Bioinform. 2021 Mar-Apr;18(2):644-655. doi: 10.1109/TCBB.2019.2920981. Epub 2021 Apr 8.
4
Regularized Bayesian transfer learning for population-level etiological distributions.
Biostatistics. 2021 Oct 13;22(4):836-857. doi: 10.1093/biostatistics/kxaa001.
5
Multi-Source Deep Transfer Neural Network Algorithm.
Sensors (Basel). 2019 Sep 16;19(18):3992. doi: 10.3390/s19183992.
6
A brain-like classification method for computed tomography images based on adaptive feature matching dual-source domain heterogeneous transfer learning.
Front Hum Neurosci. 2022 Oct 11;16:1019564. doi: 10.3389/fnhum.2022.1019564. eCollection 2022.
7
Seizure Classification From EEG Signals Using an Online Selective Transfer TSK Fuzzy Classifier With Joint Distribution Adaption and Manifold Regularization.
Front Neurosci. 2020 Jun 11;14:496. doi: 10.3389/fnins.2020.00496. eCollection 2020.
8
A Bayesian network classification methodology for gene expression data.
J Comput Biol. 2004;11(4):581-615. doi: 10.1089/cmb.2004.11.581.
9
Evaluation of Classifier Performance for Multiclass Phenotype Discrimination in Untargeted Metabolomics.
Metabolites. 2017 Jun 21;7(2):30. doi: 10.3390/metabo7020030.
10
A transfer learning model with multi-source domains for biomedical event trigger extraction.
BMC Genomics. 2021 Jan 7;22(1):31. doi: 10.1186/s12864-020-07315-1.

References cited in this article

1
Robust importance sampling for error estimation in the context of optimal Bayesian transfer learning.
Patterns (N Y). 2022 Jan 25;3(3):100428. doi: 10.1016/j.patter.2021.100428. eCollection 2022 Mar 11.