Jiang Shufan, Cormier Stéphane, Angarita Rafael, Rousseaux Francis
CReSTIC, Université de Reims Champagne Ardenne, Reims, France.
LISITE, Institut Supérieur d'Electronique de Paris, Paris, France.
Front Artif Intell. 2023 Feb 21;6:1072329. doi: 10.3389/frai.2023.1072329. eCollection 2023.
The Bidirectional Encoder Representations from Transformers (BERT) architecture offers a cutting-edge approach to Natural Language Processing. It involves two steps: 1) pre-training a language model to extract contextualized features and 2) fine-tuning for specific downstream tasks. Although pre-trained language models (PLMs) have been successful in various text-mining applications, challenges remain, particularly in areas with limited labeled data, such as plant health hazard detection from individuals' observations. To address this challenge, we propose to combine GAN-BERT, a model that extends the fine-tuning process with unlabeled data through a Generative Adversarial Network (GAN), with ChouBERT, a domain-specific PLM. Our results show that GAN-BERT outperforms traditional fine-tuning in multiple text classification tasks. In this paper, we examine the impact of further pre-training on the GAN-BERT model. We experiment with different hyperparameters to determine the best combination of models and fine-tuning parameters. Our findings suggest that combining GAN and ChouBERT can enhance the generalizability of the text classifier but may also lead to increased instability during training. Finally, we provide recommendations to mitigate these instabilities.
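To make the mechanism the abstract describes concrete, below is a minimal sketch of GAN-BERT-style semi-supervised fine-tuning: a generator produces fake [CLS]-like representations, and the discriminator classifies representations into the k real classes plus one "fake" class, so unlabeled examples contribute to training through the real-vs-fake objective. This is an illustration under our own assumptions (PyTorch, layer sizes, noise dimension, loss weighting), not the authors' exact implementation or the ChouBERT configuration.

```python
# Sketch of GAN-BERT-style semi-supervised fine-tuning (illustrative only).
import torch
import torch.nn as nn
import torch.nn.functional as F

HIDDEN = 768      # size of the [CLS] representation from the PLM (e.g., ChouBERT)
NOISE = 100       # generator noise dimension (hypothetical choice)
NUM_LABELS = 2    # k real classes; the discriminator adds one extra "fake" class

class Generator(nn.Module):
    """Maps random noise to fake [CLS]-like representations."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(NOISE, HIDDEN), nn.LeakyReLU(0.2),
            nn.Linear(HIDDEN, HIDDEN),
        )

    def forward(self, z):
        return self.net(z)

class Discriminator(nn.Module):
    """Classifies a representation into k real classes + 1 fake class."""
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(HIDDEN, HIDDEN), nn.LeakyReLU(0.2))
        self.head = nn.Linear(HIDDEN, NUM_LABELS + 1)

    def forward(self, h):
        feats = self.body(h)
        return feats, self.head(feats)

def discriminator_loss(logits_labeled, labels, logits_unlabeled, logits_fake):
    # Supervised term: labeled real examples must receive their true class.
    sup = F.cross_entropy(logits_labeled[:, :NUM_LABELS], labels)
    # Unsupervised term: real (unlabeled) data should NOT land in the fake class...
    p_real = 1.0 - F.softmax(logits_unlabeled, dim=-1)[:, -1]
    unsup_real = -torch.log(p_real + 1e-8).mean()
    # ...and generated data SHOULD land in the fake class.
    p_fake = F.softmax(logits_fake, dim=-1)[:, -1]
    unsup_fake = -torch.log(p_fake + 1e-8).mean()
    return sup + unsup_real + unsup_fake

def generator_loss(logits_fake, feats_fake, feats_real):
    # Fool the discriminator: fake representations should look real...
    p_real = 1.0 - F.softmax(logits_fake, dim=-1)[:, -1]
    fool = -torch.log(p_real + 1e-8).mean()
    # ...plus feature matching on discriminator features, a common
    # stabilizer for adversarial training (relevant to the instabilities
    # the abstract mentions).
    match = torch.mean((feats_real.detach().mean(0) - feats_fake.mean(0)) ** 2)
    return fool + match
```

In each training step, the [CLS] outputs of the PLM for labeled and unlabeled batches would be passed through the discriminator alongside a batch of generated representations, the two losses computed as above, and the PLM, discriminator, and generator updated with separate optimizers.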