2024 | Original Paper | Book Chapter

CMed-GPT: Prompt Tuning for Entity-Aware Chinese Medical Dialogue Generation

Authors: Zhijie Qu, Juan Li, Zerui Ma, Jianqiang Li

Published in: Advances in Knowledge Discovery and Data Mining

Publisher: Springer Nature Singapore


Abstract

Medical dialogue generation relies on natural language generation techniques to enable online medical consultations. The recent, widespread adoption of large-scale models in natural language processing has driven rapid progress in this area. However, existing medical dialogue models are mostly BERT-based and pre-trained on English corpora; high-performing models for Chinese medical dialogue generation are still lacking. To address this problem, this paper proposes CMed-GPT, a GPT language model pre-trained on Chinese medical-domain text. The model is available in two versions, base and large, with perplexity values of 8.64 and 8.01, respectively. In addition, we incorporate lexical and entity embeddings into the dialogue text in a uniform manner to meet the requirements of downstream dialogue generation tasks. By applying both fine-tuning and p-tuning to CMed-GPT, we lowered the perplexity from 8.44 to 7.35. This study not only confirms the strong performance of CMed-GPT in generating Chinese biomedical text but also demonstrates the advantages of p-tuning with prefix prompts over traditional fine-tuning. Furthermore, we validate the importance of incorporating external information in medical dialogue generation, which improves the quality of the generated dialogue.
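The p-tuning setup mentioned in the abstract can be illustrated with a short sketch. CMed-GPT itself is not publicly released, so the snippet below uses the generic Hugging Face `gpt2` checkpoint as a stand-in, and the prompt length is an illustrative placeholder rather than the paper's configuration: the backbone is frozen, and only a short sequence of continuous prefix-prompt vectors, prepended to the token embeddings, is trained.

```python
# Minimal p-tuning sketch (assumptions: "gpt2" stands in for CMed-GPT,
# prompt_len=16 is illustrative; neither reflects the paper's setup).
import torch
import torch.nn as nn
from transformers import GPT2LMHeadModel

class PTunedLM(nn.Module):
    def __init__(self, model_name="gpt2", prompt_len=16):
        super().__init__()
        self.lm = GPT2LMHeadModel.from_pretrained(model_name)
        for p in self.lm.parameters():
            p.requires_grad = False                  # freeze the backbone
        hidden = self.lm.config.n_embd
        # continuous prefix prompt: the only trainable parameters
        self.prompt = nn.Parameter(torch.randn(prompt_len, hidden) * 0.02)

    def forward(self, input_ids, labels=None):
        tok = self.lm.transformer.wte(input_ids)               # (B, T, H)
        prefix = self.prompt.unsqueeze(0).expand(input_ids.size(0), -1, -1)
        embeds = torch.cat([prefix, tok], dim=1)               # (B, P+T, H)
        if labels is not None:
            # prompt positions carry no targets; -100 is ignored by the loss
            pad = labels.new_full((labels.size(0), self.prompt.size(0)), -100)
            labels = torch.cat([pad, labels], dim=1)
        return self.lm(inputs_embeds=embeds, labels=labels)
```

The perplexity figures quoted above are the exponential of the mean per-token negative log-likelihood, i.e. `torch.exp(model(ids, labels=ids).loss)` for the mean cross-entropy loss the model returns. The paper's entity-aware inputs would additionally fold lexical and entity embeddings into the token embeddings before the forward pass; that step is omitted in this sketch.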


Metadata
Title
CMed-GPT: Prompt Tuning for Entity-Aware Chinese Medical Dialogue Generation
Authors
Zhijie Qu
Juan Li
Zerui Ma
Jianqiang Li
Copyright Year
2024
Publisher
Springer Nature Singapore
DOI
https://doi.org/10.1007/978-981-97-2253-2_7
