

Comparative analysis of BERT and FastText representations on crowdfunding campaign success prediction.

Author

Gunduz Hakan

Affiliation

Software Engineering Department, Kocaeli University, Kocaeli, Marmara, Turkey.

Publication

PeerJ Comput Sci. 2024 Sep 11;10:e2316. doi: 10.7717/peerj-cs.2316. eCollection 2024.

DOI: 10.7717/peerj-cs.2316
PMID: 39314718
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11419673/
Abstract

Crowdfunding has become a popular financing method, attracting investors, businesses, and entrepreneurs. However, many campaigns fail to secure funding, making it crucial to reduce participation risks using artificial intelligence (AI). This study investigates the effectiveness of advanced AI techniques in predicting the success of crowdfunding campaigns on Kickstarter by analyzing campaign blurbs. We compare the performance of two widely used text representation models, bidirectional encoder representations from transformers (BERT) and FastText, in conjunction with long-short term memory (LSTM) and gradient boosting machine (GBM) classifiers. Our analysis involves preprocessing campaign blurbs, extracting features using BERT and FastText, and evaluating the predictive performance of these features with LSTM and GBM models. All experimental results show that BERT representations significantly outperform FastText, with the highest accuracy of 0.745 achieved using a fine-tuned BERT model combined with LSTM. These findings highlight the importance of using deep contextual embeddings and the benefits of fine-tuning pre-trained models for domain-specific applications. The results are benchmarked against existing methods, demonstrating the superiority of our approach. This study provides valuable insights for improving predictive models in the crowdfunding domain, offering practical implications for campaign creators and investors.
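The pipeline the abstract describes — encode each campaign blurb as a fixed-length vector, then train a classifier on those vectors — can be sketched for the GBM branch as follows. This is a minimal illustration, not the paper's code: random vectors stand in for the actual BERT (768-dim) or FastText embeddings, the labels are synthetic (1 = funded, 0 = failed), and the accuracy printed here will hover near chance rather than the paper's 0.745.

```python
# Sketch of the classification stage on stand-in embeddings.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_campaigns, dim = 400, 768               # dim=768 mimics BERT's hidden size
X = rng.normal(size=(n_campaigns, dim))   # stand-in for blurb embeddings
y = rng.integers(0, 2, size=n_campaigns)  # synthetic funded/failed labels

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

# GBM classifier, as in the paper's BERT/FastText + GBM comparison
gbm = GradientBoostingClassifier(n_estimators=100, random_state=0)
gbm.fit(X_tr, y_tr)
acc = accuracy_score(y_te, gbm.predict(X_te))
print(f"held-out accuracy: {acc:.3f}")
```

Swapping in real features would only change how `X` is built (e.g., mean-pooled BERT token embeddings per blurb); the training and evaluation steps stay the same, which is what makes the BERT-vs-FastText comparison clean.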


Figures (g001–g008):
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/65a5/11419673/ffba27967cb2/peerj-cs-10-2316-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/65a5/11419673/18e754a6c549/peerj-cs-10-2316-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/65a5/11419673/cbac96d474b0/peerj-cs-10-2316-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/65a5/11419673/416a94b44ecd/peerj-cs-10-2316-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/65a5/11419673/81ec21084324/peerj-cs-10-2316-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/65a5/11419673/43faf8ac75d0/peerj-cs-10-2316-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/65a5/11419673/f88b3c04d210/peerj-cs-10-2316-g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/65a5/11419673/4cac276fc9af/peerj-cs-10-2316-g008.jpg

Similar articles

1. Comparative analysis of BERT and FastText representations on crowdfunding campaign success prediction.
   PeerJ Comput Sci. 2024 Sep 11;10:e2316. doi: 10.7717/peerj-cs.2316. eCollection 2024.
2. Fine-Tuning Large Language Models to Enhance Programmatic Assessment in Graduate Medical Education.
   J Educ Perioper Med. 2024 Sep 30;26(3):E729. doi: 10.46374/VolXXVI_Issue3_Moore. eCollection 2024 Jul-Sep.
3. Extracting comprehensive clinical information for breast cancer using deep learning methods.
   Int J Med Inform. 2019 Dec;132:103985. doi: 10.1016/j.ijmedinf.2019.103985. Epub 2019 Oct 2.
4. A deep learning approach in predicting products' sentiment ratings: a comparative analysis.
   J Supercomput. 2022;78(5):7206-7226. doi: 10.1007/s11227-021-04169-6. Epub 2021 Nov 5.
5. BERT-based Ranking for Biomedical Entity Normalization.
   AMIA Jt Summits Transl Sci Proc. 2020 May 30;2020:269-277. eCollection 2020.
6. A comparative study on deep learning models for text classification of unstructured medical notes with various levels of class imbalance.
   BMC Med Res Methodol. 2022 Jul 2;22(1):181. doi: 10.1186/s12874-022-01665-y.
7. Med-BERT: pretrained contextualized embeddings on large-scale structured electronic health records for disease prediction.
   NPJ Digit Med. 2021 May 20;4(1):86. doi: 10.1038/s41746-021-00455-y.
8. Modified Bidirectional Encoder Representations From Transformers Extractive Summarization Model for Hospital Information Systems Based on Character-Level Tokens (AlphaBERT): Development and Performance Evaluation.
   JMIR Med Inform. 2020 Apr 29;8(4):e17787. doi: 10.2196/17787.
9. Multi-class sentiment analysis of urdu text using multilingual BERT.
   Sci Rep. 2022 Mar 31;12(1):5436. doi: 10.1038/s41598-022-09381-9.
10. A novel deep learning approach to extract Chinese clinical entities for lung cancer screening and staging.
   BMC Med Inform Decis Mak. 2021 Jul 30;21(Suppl 2):214. doi: 10.1186/s12911-021-01575-x.
