2024 | Original Paper | Book Chapter

CMed-GPT: Prompt Tuning for Entity-Aware Chinese Medical Dialogue Generation

Authors: Zhijie Qu, Juan Li, Zerui Ma, Jianqiang Li

Published in: Advances in Knowledge Discovery and Data Mining

Publisher: Springer Nature Singapore


Abstract

Medical dialogue generation relies on natural language generation techniques to enable online medical consultations. The recent, widespread adoption of large-scale models in natural language processing has driven rapid progress in this area. However, existing medical dialogue models are mostly BERT-based and pre-trained on English corpora; high-performing models for Chinese medical dialogue generation are still lacking. To address this problem, this paper proposes CMed-GPT, a GPT language model pre-trained on Chinese medical-domain text. The model is available in two versions, base and large, with perplexity values of 8.64 and 8.01, respectively. In addition, we incorporate lexical and entity embeddings into the dialogue text in a uniform manner to meet the requirements of downstream dialogue generation tasks. By applying both fine-tuning and p-tuning to CMed-GPT, we lowered the perplexity from 8.44 to 7.35. This study not only confirms the strong performance of CMed-GPT in generating Chinese biomedical text but also demonstrates the advantages of p-tuning with prefix prompts over traditional fine-tuning. Furthermore, we validate the importance of incorporating external information in medical dialogue generation, which improves the quality of the generated dialogue.
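The p-tuning setup mentioned in the abstract can be illustrated with a short sketch. CMed-GPT itself is not publicly released, so the snippet below uses the generic Hugging Face `gpt2` checkpoint as a stand-in, and the prompt length is an illustrative placeholder rather than the paper's configuration: the backbone is frozen, and only a short sequence of continuous prefix-prompt vectors, prepended to the token embeddings, is trained.

```python
# Minimal p-tuning sketch (assumptions: "gpt2" stands in for CMed-GPT,
# prompt_len=16 is illustrative; neither reflects the paper's setup).
import torch
import torch.nn as nn
from transformers import GPT2LMHeadModel

class PTunedLM(nn.Module):
    def __init__(self, model_name="gpt2", prompt_len=16):
        super().__init__()
        self.lm = GPT2LMHeadModel.from_pretrained(model_name)
        for p in self.lm.parameters():
            p.requires_grad = False                  # freeze the backbone
        hidden = self.lm.config.n_embd
        # continuous prefix prompt: the only trainable parameters
        self.prompt = nn.Parameter(torch.randn(prompt_len, hidden) * 0.02)

    def forward(self, input_ids, labels=None):
        tok = self.lm.transformer.wte(input_ids)               # (B, T, H)
        prefix = self.prompt.unsqueeze(0).expand(input_ids.size(0), -1, -1)
        embeds = torch.cat([prefix, tok], dim=1)               # (B, P+T, H)
        if labels is not None:
            # prompt positions carry no targets; -100 is ignored by the loss
            pad = labels.new_full((labels.size(0), self.prompt.size(0)), -100)
            labels = torch.cat([pad, labels], dim=1)
        return self.lm(inputs_embeds=embeds, labels=labels)
```

The perplexity figures quoted above are the exponential of the mean per-token negative log-likelihood, i.e. `torch.exp(model(ids, labels=ids).loss)` for the mean cross-entropy loss the model returns. The paper's entity-aware inputs would additionally fold lexical and entity embeddings into the token embeddings before the forward pass; that step is omitted in this sketch.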


Metadata
Title
CMed-GPT: Prompt Tuning for Entity-Aware Chinese Medical Dialogue Generation
Authors
Zhijie Qu
Juan Li
Zerui Ma
Jianqiang Li
Copyright Year
2024
Publisher
Springer Nature Singapore
DOI
https://doi.org/10.1007/978-981-97-2253-2_7
