
Data mining and machine learning approaches for prediction modelling of disease vectors: Epidemic disease prediction modelling.

Authors

Fusco Terence, Bi Yaxin, Wang Haiying, Browne Fiona

Affiliation

Faculty of Computing and Engineering, University of Ulster, Newtownabbey, UK.

Publication information

Int J Mach Learn Cybern. 2020;11(6):1159-1178. doi: 10.1007/s13042-019-01029-x. Epub 2019 Nov 18.

DOI:10.1007/s13042-019-01029-x
PMID:33727985
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7224118/
Abstract

This research presents viable solutions for prediction modelling of disease based on vector density. Novel training models proposed in this work aim to address various aspects of interest in the artificial intelligence applications domain. Topics discussed include data imputation, semi-supervised labelling and synthetic instance simulation when using sparse training data. Innovative semi-supervised ensemble learning paradigms are proposed focusing on labelling threshold selection and stringency of classification confidence levels. A regression-correlation combination (RCC) data imputation method is also introduced for handling of partially complete training data. Results presented in this work show data imputation precision improvement over benchmark value replacement using proposed RCC on 70% of test cases. Proposed novel incremental transductive models such as ITSVM have provided interesting findings based on threshold constraints outperforming standard SVM application on 21% of test cases and can be applied with alternative environment-based epidemic disease domains. The proposed incremental transductive ensemble approach model enables the combination of complementary algorithms to provide labelling for unlabelled vector density instances. Liberal (LTA) and strict training approaches provided varied results with LTA outperforming Stacking ensemble on 29.1% of test cases. Proposed novel synthetic minority over-sampling technique (SMOTE) equilibrium approach has yielded subtle classification performance increases which can be further interrogated to assess classification performance and efficiency relationships with synthetic instance generation.
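
The role of the labelling threshold in incremental semi-supervised training can be illustrated with a small sketch. This is not the authors' ITSVM or ensemble method: a nearest-centroid classifier stands in for the SVM, the confidence score is an assumed softmax over negative distances, and the function names, threshold values, and toy data are all hypothetical. It only shows the general idea that a strict threshold leaves ambiguous instances unlabelled while a liberal one absorbs them into the training set.

```python
import math

def centroids(X, y):
    """Mean feature vector per class label."""
    sums, counts = {}, {}
    for xi, yi in zip(X, y):
        acc = sums.setdefault(yi, [0.0] * len(xi))
        for j, v in enumerate(xi):
            acc[j] += v
        counts[yi] = counts.get(yi, 0) + 1
    return {c: [v / counts[c] for v in s] for c, s in sums.items()}

def predict_with_confidence(cents, x):
    """Return (label, confidence); confidence is an assumed
    softmax over negative Euclidean distances to each centroid."""
    weights = {c: math.exp(-math.dist(x, m)) for c, m in cents.items()}
    total = sum(weights.values())
    label = max(weights, key=weights.get)
    return label, weights[label] / total

def self_train(X_lab, y_lab, X_unlab, threshold=0.9, max_rounds=10):
    """Incrementally label unlabelled instances whose predicted
    confidence meets the threshold, refitting after each round."""
    X_lab, y_lab = list(X_lab), list(y_lab)
    pool = list(X_unlab)
    for _ in range(max_rounds):
        cents = centroids(X_lab, y_lab)
        accepted, remaining = [], []
        for x in pool:
            label, conf = predict_with_confidence(cents, x)
            if conf >= threshold:
                accepted.append((x, label))
            else:
                remaining.append(x)
        if not accepted:  # nothing confident enough: stop early
            break
        for x, label in accepted:
            X_lab.append(x)
            y_lab.append(label)
        pool = remaining
    return X_lab, y_lab, pool

# Toy data (hypothetical): two labelled points, three unlabelled ones.
X_lab = [[0.0, 0.0], [10.0, 10.0]]
y_lab = ["low", "high"]
X_unlab = [[0.5, 0.2], [9.5, 9.8], [4.8, 4.8]]

_, strict_labels, strict_pool = self_train(X_lab, y_lab, X_unlab, threshold=0.9)
_, liberal_labels, liberal_pool = self_train(X_lab, y_lab, X_unlab, threshold=0.6)
print(len(strict_pool), len(liberal_pool))
```

In this toy run the point near the class boundary stays unlabelled under the strict threshold and is labelled under the liberal one, which is the trade-off between labelling coverage and labelling reliability that the abstract refers to.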
