
2024 | Original Paper | Book Chapter

DALLMi: Domain Adaption for LLM-Based Multi-label Classifier

Authors: Miruna Bețianu, Abele Mălan, Marco Aldinucci, Robert Birke, Lydia Chen

Published in: Advances in Knowledge Discovery and Data Mining

Publisher: Springer Nature Singapore


Abstract

Large language models (LLMs) increasingly serve as the backbone for classifying text associated with distinct domains and, simultaneously, several labels (classes). When encountering domain shifts, e.g., a classifier of movie reviews moving from IMDb to Rotten Tomatoes, adapting such an LLM-based multi-label classifier is challenging due to incomplete label sets at the target domain and daunting training overhead. Existing domain adaptation methods address either image multi-label classifiers or text binary classifiers. In this paper, we design DALLMi (Domain Adaptation Large Language Model interpolator), a first-of-its-kind semi-supervised domain adaptation method for LLM-based text models, specifically BERT. The core of DALLMi is the novel variation loss and MixUp regularization, which jointly leverage the limited positively labeled text, the large quantity of unlabeled text, and, importantly, their interpolation from the BERT word embeddings. DALLMi also introduces a label-balanced sampling strategy to overcome the imbalance between labeled and unlabeled data. We evaluate DALLMi against partially-supervised and unsupervised approaches on three datasets under different scenarios of label availability for the target domain. Our results show that DALLMi achieves higher mAP than unsupervised and partially-supervised approaches by 19.9% and 52.2%, respectively.
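To make the interpolation idea concrete, the sketch below shows generic MixUp (Zhang et al., ICLR 2018) applied to embedding vectors with multi-label targets. This is a minimal illustration of the underlying technique, not the DALLMi implementation: the function name, dimensions, and label vectors are illustrative assumptions, and real usage would interpolate BERT hidden representations inside a training loop.

```python
import numpy as np

def mixup_embeddings(emb_a, emb_b, label_a, label_b, alpha=0.4, rng=None):
    """Generic MixUp: convex combination of two inputs and their labels.

    Here applied to (toy) word-embedding vectors, loosely illustrating how
    interpolating a scarce positively labeled sample with an unlabeled one
    yields additional, softly labeled training points.
    """
    rng = rng or np.random.default_rng(0)
    # Mixing coefficient drawn from a Beta(alpha, alpha) distribution,
    # as in the original MixUp formulation.
    lam = rng.beta(alpha, alpha)
    mixed_emb = lam * emb_a + (1 - lam) * emb_b
    mixed_label = lam * label_a + (1 - lam) * label_b
    return mixed_emb, mixed_label, lam

# Toy example: a positively labeled sample mixed with an unlabeled one
# (labels for two classes; the unlabeled sample contributes zeros).
emb_a = np.array([1.0, 0.0, 0.0, 1.0])
emb_b = np.array([0.0, 1.0, 1.0, 0.0])
y_a = np.array([1.0, 0.0])  # positive for class 0
y_b = np.array([0.0, 0.0])  # unlabeled sample
mixed_emb, mixed_y, lam = mixup_embeddings(emb_a, emb_b, y_a, y_b)
```

The mixed label is soft (a fraction `lam` of the positive label survives), which is what lets the loss extract signal from unlabeled target-domain text.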


Metadata
Title
DALLMi: Domain Adaption for LLM-Based Multi-label Classifier
Authors
Miruna Bețianu
Abele Mălan
Marco Aldinucci
Robert Birke
Lydia Chen
Copyright year
2024
Publisher
Springer Nature Singapore
DOI
https://doi.org/10.1007/978-981-97-2259-4_21
