
2024 | Original Paper | Book Chapter

DALLMi: Domain Adaption for LLM-Based Multi-label Classifier

Authors: Miruna Bețianu, Abele Mălan, Marco Aldinucci, Robert Birke, Lydia Chen

Published in: Advances in Knowledge Discovery and Data Mining

Publisher: Springer Nature Singapore


Abstract

Large language models (LLMs) increasingly serve as the backbone for classifying text associated with distinct domains and, simultaneously, several labels (classes). When encountering domain shifts, e.g., a classifier of movie reviews moving from IMDb to Rotten Tomatoes, adapting such an LLM-based multi-label classifier is challenging due to incomplete label sets at the target domain and daunting training overhead. Existing domain adaptation methods address either image multi-label classifiers or text binary classifiers. In this paper, we design DALLMi (Domain Adaptation Large Language Model interpolator), a first-of-its-kind semi-supervised domain adaptation method for LLM-based text models, specifically BERT. The core of DALLMi is the novel variation loss and MixUp regularization, which jointly leverage the limited positively labeled text, the large quantity of unlabeled text, and, importantly, their interpolation from the BERT word embeddings. DALLMi also introduces a label-balanced sampling strategy to overcome the imbalance between labeled and unlabeled data. We evaluate DALLMi against partially-supervised and unsupervised approaches on three datasets under different scenarios of label availability for the target domain. Our results show that DALLMi achieves higher mAP than unsupervised and partially-supervised approaches by 19.9% and 52.2%, respectively.
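To make the interpolation idea concrete, the sketch below shows generic MixUp (Zhang et al., ICLR 2018) applied to embedding vectors with multi-label targets. This is a minimal illustration of the underlying technique, not the DALLMi implementation: the function name, dimensions, and label vectors are illustrative assumptions, and real usage would interpolate BERT hidden representations inside a training loop.

```python
import numpy as np

def mixup_embeddings(emb_a, emb_b, label_a, label_b, alpha=0.4, rng=None):
    """Generic MixUp: convex combination of two inputs and their labels.

    Here applied to (toy) word-embedding vectors, loosely illustrating how
    interpolating a scarce positively labeled sample with an unlabeled one
    yields additional, softly labeled training points.
    """
    rng = rng or np.random.default_rng(0)
    # Mixing coefficient drawn from a Beta(alpha, alpha) distribution,
    # as in the original MixUp formulation.
    lam = rng.beta(alpha, alpha)
    mixed_emb = lam * emb_a + (1 - lam) * emb_b
    mixed_label = lam * label_a + (1 - lam) * label_b
    return mixed_emb, mixed_label, lam

# Toy example: a positively labeled sample mixed with an unlabeled one
# (labels for two classes; the unlabeled sample contributes zeros).
emb_a = np.array([1.0, 0.0, 0.0, 1.0])
emb_b = np.array([0.0, 1.0, 1.0, 0.0])
y_a = np.array([1.0, 0.0])  # positive for class 0
y_b = np.array([0.0, 0.0])  # unlabeled sample
mixed_emb, mixed_y, lam = mixup_embeddings(emb_a, emb_b, y_a, y_b)
```

The mixed label is soft (a fraction `lam` of the positive label survives), which is what lets the loss extract signal from unlabeled target-domain text.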


Metadata
Title
DALLMi: Domain Adaption for LLM-Based Multi-label Classifier
Authors
Miruna Bețianu
Abele Mălan
Marco Aldinucci
Robert Birke
Lydia Chen
Copyright year
2024
Publisher
Springer Nature Singapore
DOI
https://doi.org/10.1007/978-981-97-2259-4_21
