Du Yongping, Yan Jingya, Lu Yuxuan, Zhao Yiliang, Jin Xingnan
IEEE/ACM Trans Comput Biol Bioinform. 2023 Mar-Apr;20(2):1114-1124. doi: 10.1109/TCBB.2022.3171388. Epub 2023 Apr 3.
Biomedical question answering aims to extract the answer to a given question from a biomedical context. Because annotation in a specialized domain requires expert knowledge, large-scale datasets for domain-specific question answering are difficult to build. Existing methods are limited by the lack of training data, so they perform worse than in open-domain settings, and their performance degrades further on adversarial samples. We address these issues in three ways. First, we adopt data augmentation strategies to improve model training: sliding window, summarization, and round-trip translation. Second, we propose a model weighting strategy for the final answer prediction in the biomedical domain, which combines the strengths of two models: QANet, an open-domain model, and BioBERT, which is pre-trained on biomedical domain data. Finally, we apply adversarial training to reinforce the robustness of the model. We evaluate our approach on the public biomedical dataset collected from PubMed and provided by the BioASQ challenge. The results show that performance improves significantly over each single model and over other models that participated in the BioASQ challenge. The model learns richer semantic representations from data augmentation and adversarial samples, which helps it solve more complex question answering problems in the biomedical domain.
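As a rough illustration of two ideas named in the abstract, the sketch below splits a long context into overlapping windows and combines answer-span scores from two models with a single weight. The abstract does not specify how the windows are sized or how the two models are weighted, so the function names (`sliding_windows`, `combine_span_scores`), the `size`, `stride`, and `weight_a` parameters, and the linear weighting itself are all assumptions made for illustration, not the authors' method.

```python
from typing import Dict, List, Tuple

Span = Tuple[int, int]  # (start_token, end_token), inclusive; hypothetical span format


def sliding_windows(tokens: List[str], size: int, stride: int) -> List[Tuple[int, List[str]]]:
    """Split a token list into overlapping windows.

    Returns (offset, window) pairs so spans found in a window can be
    mapped back to positions in the full context.
    """
    if size <= 0 or stride <= 0:
        raise ValueError("size and stride must be positive")
    windows = []
    start = 0
    while True:
        windows.append((start, tokens[start:start + size]))
        if start + size >= len(tokens):
            break  # this window already reaches the end of the context
        start += stride
    return windows


def combine_span_scores(
    scores_a: Dict[Span, float],
    scores_b: Dict[Span, float],
    weight_a: float,
) -> Span:
    """Return the span with the highest weighted sum of two models' scores.

    Assumption: a simple linear interpolation; a span missing from one
    model's candidates contributes 0.0 for that model.
    """
    if not 0.0 <= weight_a <= 1.0:
        raise ValueError("weight_a must be in [0, 1]")
    candidates = sorted(set(scores_a) | set(scores_b))  # sorted for deterministic ties
    if not candidates:
        raise ValueError("no candidate spans")

    def combined(span: Span) -> float:
        return weight_a * scores_a.get(span, 0.0) + (1.0 - weight_a) * scores_b.get(span, 0.0)

    return max(candidates, key=combined)
```

In this sketch, `scores_a` and `scores_b` could stand in for the span scores produced by the open-domain and biomedical models respectively, and `weight_a` would be tuned on held-out data; that tuning procedure is likewise an assumption rather than something the abstract describes.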