Research of text paraphrase generation based on self-contrastive learning.

Authors

Yuan Ling, Yu Hai Ping, Ren Junlin, Sun Ping

Affiliations

School of Computing Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei, People's Republic of China.

Wuhan Vocational College of Software and Engineering (Wuhan Open University), Wuhan, Hubei, People's Republic of China.

Publication information

PLoS One. 2025 Sep 2;20(9):e0327613. doi: 10.1371/journal.pone.0327613. eCollection 2025.

DOI:10.1371/journal.pone.0327613
PMID:40892948
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12404545/
Abstract

The goal of this study is to improve the quality and diversity of text paraphrase generation, a critical task in Natural Language Generation (NLG) that requires producing semantically equivalent sentences with varied structures and expressions. Existing approaches often fail to generate paraphrases that are both high-quality and diverse, limiting their applicability in tasks such as machine translation, dialogue systems, and automated content rewriting. To address this gap, we introduce two self-contrastive learning models designed to enhance paraphrase generation: the Contrastive Generative Adversarial Network (ContraGAN) for supervised learning and the Contrastive Model with Metrics (ContraMetrics) for unsupervised learning. ContraGAN leverages a learnable discriminator within an adversarial framework to refine the quality of generated paraphrases, while ContraMetrics incorporates multi-metric filtering and keyword-guided prompts to improve unsupervised generation diversity. Experiments on benchmark datasets demonstrate that both models achieve significant improvements over state-of-the-art methods. ContraGAN enhances semantic fidelity with a 0.46 gain in BERTScore and improves fluency with a 1.57 reduction in perplexity. In addition, ContraMetrics achieves gains of 0.37 and 3.34 in iBLEU and P-BLEU, respectively, reflecting greater diversity and lexical richness. These results validate the effectiveness of our models in addressing key challenges in paraphrase generation, offering practical solutions for diverse NLG applications.
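The abstract reports gains in iBLEU, a metric that rewards overlap with a reference paraphrase while penalizing copying from the source sentence. As a rough illustration only, the sketch below uses the commonly cited definition, iBLEU = α·BLEU(candidate, reference) − (1 − α)·BLEU(candidate, source). The simplified single-reference BLEU with add-one smoothing, the function names, and the value `alpha=0.8` are all assumptions made for this example; the abstract does not state the paper's exact evaluation settings.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=4):
    """Simplified sentence-level BLEU: one reference, add-one smoothing.

    This is an illustrative stand-in, not a standard toolkit implementation.
    """
    cand, ref = candidate.split(), reference.split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts, ref_counts = ngrams(cand, n), ngrams(ref, n)
        # Clipped overlap: each candidate n-gram counts at most as often
        # as it appears in the reference.
        overlap = sum((cand_counts & ref_counts).values())
        total = sum(cand_counts.values())
        log_precisions.append(math.log((overlap + 1) / (total + 1)))
    # Brevity penalty for candidates shorter than the reference.
    brevity = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return brevity * math.exp(sum(log_precisions) / max_n)


def ibleu(candidate, reference, source, alpha=0.8):
    """iBLEU: reward similarity to the reference, penalize copying the source.

    alpha=0.8 is an assumed value chosen for illustration.
    """
    return alpha * bleu(candidate, reference) - (1 - alpha) * bleu(candidate, source)


if __name__ == "__main__":
    source = "how can i learn to play the guitar quickly"
    reference = "what is the fastest way to pick up guitar"
    copied = source  # a degenerate "paraphrase" that just repeats the input
    print("copy of source :", round(ibleu(copied, reference, source), 3))
    print("matches target :", round(ibleu(reference, reference, source), 3))
```

Under this definition, a candidate that merely repeats the source scores lower than one that matches the reference, which is why iBLEU is often read as a joint measure of fidelity and diversity.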

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8b5d/12404545/27860d92f6ab/pone.0327613.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8b5d/12404545/9678937b1a66/pone.0327613.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8b5d/12404545/0ca0d8042c05/pone.0327613.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8b5d/12404545/3794b15f11f9/pone.0327613.g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8b5d/12404545/0ddd4ba32f46/pone.0327613.g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8b5d/12404545/65921be886b3/pone.0327613.g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8b5d/12404545/4e7085d74719/pone.0327613.g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8b5d/12404545/80f97e37a4f1/pone.0327613.g008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8b5d/12404545/83d56382f6d2/pone.0327613.g009.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8b5d/12404545/b8d4bd7842ed/pone.0327613.g010.jpg
