
A comparative study of feature selection and feature extraction methods for financial distress identification.

Authors

Kuizinienė Dovilė, Savickas Paulius, Kunickaitė Rimantė, Juozaitienė Rūta, Damaševičius Robertas, Maskeliūnas Rytis, Krilavičius Tomas

Affiliations

Department of Applied Informatics, Vytautas Magnus University, Kaunas, Lithuania.

Silesian University of Technology, Gliwice, Poland.

Publication information

PeerJ Comput Sci. 2024 Apr 30;10:e1956. doi: 10.7717/peerj-cs.1956. eCollection 2024.

DOI:10.7717/peerj-cs.1956
PMID:38855232
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11157601/
Abstract

Financial distress identification remains an essential topic in the scientific literature due to its importance for society and the economy. Advances in information technology and the growing volume of stored data have extended financial distress analysis beyond financial statements and their indicators (ratios). The feature space can be expanded by incorporating new categories of feature data, such as macroeconomic, sector, social, board, management, and judicial incident data. However, the increased dimensionality results in sparse data and overfitted models. This study proposes a new approach for efficient financial distress classification assessment by combining dimensionality reduction and machine learning techniques. The proposed framework aims to identify a subset of features that minimizes the loss function describing financial distress in an enterprise. During the study, 15 dimensionality reduction techniques with different numbers of features and 17 machine-learning models were compared. Overall, 1,432 experiments were performed using Lithuanian enterprise data covering the period from 2015 to 2022. The results showed that the artificial neural network (ANN) model with 30 ranked features, identified using the Random Forest mean decreasing Gini (RF_MDG) feature selection technique, achieved the highest AUC score. Moreover, this study introduces a novel approach for feature extraction, which could improve financial distress classification models.
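The two quantities named in the abstract, Gini-based feature ranking and the AUC score, can be illustrated with a minimal pure-Python sketch. This is not the authors' pipeline: it ranks features by the Gini impurity decrease of a single best threshold split (a one-split simplification of what a Random Forest averages over many trees), and the feature names and toy values are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of binary labels (0 or 1)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2.0 * p * (1.0 - p)


def gini_decrease(values, labels, threshold):
    """Impurity decrease obtained by splitting on values <= threshold."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
    return gini(labels) - weighted


def best_gini_decrease(values, labels):
    """Largest impurity decrease over all candidate thresholds of one feature."""
    return max(gini_decrease(values, labels, t) for t in set(values))


def rank_features(columns, labels, k):
    """Names of the k features with the largest single-split Gini decrease."""
    ranked = sorted(
        columns,
        key=lambda name: best_gini_decrease(columns[name], labels),
        reverse=True,
    )
    return ranked[:k]


def auc(scores, labels):
    """AUC as the probability that a positive case outranks a negative one."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 marks a distressed enterprise, 0 a healthy one.
labels = [0, 0, 0, 1, 1, 1]
columns = {
    "debt_ratio": [0.1, 0.2, 0.3, 0.7, 0.8, 0.9],
    "noise": [0.5, 0.9, 0.1, 0.6, 0.2, 0.8],
    "staff_change": [0.2, 0.4, 0.6, 0.5, 0.7, 0.9],
}

selected = rank_features(columns, labels, k=2)
print(selected)
print(auc(columns["staff_change"], labels))
```

In this toy setting the ranking keeps the two features whose best split separates the classes most cleanly and drops the noise column. A full Random Forest would average such decreases over many randomized trees, and the selected columns would then be passed to a classifier such as an ANN.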

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0064/11157601/cd4008a35214/peerj-cs-10-1956-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0064/11157601/aaa7a9b704ba/peerj-cs-10-1956-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0064/11157601/9404e2758a8b/peerj-cs-10-1956-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0064/11157601/a44cf1a98027/peerj-cs-10-1956-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0064/11157601/d9b79f25e12d/peerj-cs-10-1956-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0064/11157601/e08af276d61c/peerj-cs-10-1956-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0064/11157601/966a156fe7b9/peerj-cs-10-1956-g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0064/11157601/bb660b72673d/peerj-cs-10-1956-g008.jpg
