Corcoran Caroline, Bennett Rachel, Miller Vandana, Krebs Fred, Dampier Will
Laboratory for Translational Plasma Research, Department of Microbiology and Immunology, Drexel College of Medicine, Philadelphia, PA 19102 USA.
Department of Biology, Drexel University, Philadelphia, PA 19102 USA.
IEEE Trans Radiat Plasma Med Sci. 2025 Mar;9(3):388-394. doi: 10.1109/trpms.2024.3447551. Epub 2024 Aug 21.
Non-thermal plasma, cold plasma, and atmospheric-pressure plasma are a few of the terms used to describe the plasma used in plasma medicine research. The resulting ambiguity hampers literature searches, confuses discussion, and complicates collaboration. To assess the full breadth of this problem, we designed a natural language processing (NLP) model that surveyed approximately 15,000 papers returned by the query "plasma medicine" and indexed in PubMed between 2020 and 2022. Our NLP model was constructed and executed using the Hugging Face transformers API and the PubMed BERT pretrained model. We used this model to determine the prevalence of each term and to assess its utility for searching literature relevant to plasma medicine. The effectiveness of each term was measured by precision (the ability to discriminate between relevant and irrelevant literature) and recall (the ability to retrieve relevant literature). Each term was given a combined effectiveness score from 0 to 1 (1 = ideal effectiveness) that accounts for precision, recall, sample size, and model confidence. Of the twelve commonly used terms analyzed, none received a combined effectiveness score over 0.025. We conclude that there is no universal term for "plasma" that provides a satisfactory representation of the literature. These results highlight the need for standardization of nomenclature in plasma medicine.
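To make the two per-term measures concrete, the following is a minimal sketch of how precision and recall could be computed for a single search term. It is not the authors' pipeline: the function name `term_precision_recall`, the substring-matching rule, and the four-paper toy corpus with hand-assigned relevance labels are all assumptions made for illustration. In the study, relevance was judged by the NLP model. The combined effectiveness score is not reproduced here because the abstract does not give its formula.

```python
def term_precision_recall(papers, term):
    """Compute precision and recall of a search term.

    papers: list of (text, is_relevant) pairs (hypothetical toy format).
    term:   search phrase, matched as a case-insensitive substring.
    """
    term = term.lower()
    # Relevance flags of every paper the term retrieves.
    retrieved = [relevant for text, relevant in papers if term in text.lower()]
    true_positives = sum(retrieved)
    total_relevant = sum(1 for _, relevant in papers if relevant)

    # Precision: share of retrieved papers that are relevant.
    precision = true_positives / len(retrieved) if retrieved else 0.0
    # Recall: share of relevant papers that were retrieved.
    recall = true_positives / total_relevant if total_relevant else 0.0
    return precision, recall


# Toy corpus with hand-assigned labels (illustrative only, not study data).
toy_corpus = [
    ("Cold plasma treatment of chronic wounds", True),
    ("Cold plasma sterilization of food packaging", False),
    ("Non-thermal plasma induces immunogenic cell death", True),
    ("Blood plasma biomarkers in sepsis", False),
]

precision, recall = term_precision_recall(toy_corpus, "cold plasma")
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy corpus, "cold plasma" retrieves one relevant and one irrelevant paper and misses one relevant paper, so a term can score poorly on both measures at once, which is the kind of ambiguity the abstract describes.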