
2024 | OriginalPaper | Book chapter

Classification of Code-Mixed Tamil Text Using Deep Learning Algorithms

Written by: R. Theninpan, P. Valarmathi

Published in: Computational Sciences and Sustainable Technologies

Publisher: Springer Nature Switzerland

Abstract

Natural Language Processing (NLP) is a broad field with applications in many domains. Its goal is to achieve human-like language processing for a variety of tasks and applications. The internet is full of textual data in many different languages. Although many comments posted in public online spaces are positive, a significant portion are toxic. Comments must first be separated into toxic and non-toxic before the different levels of toxicity can be classified. Doing so can reduce unintentional prejudice towards particular individuals or entities and lessen negativity on social media. Our primary goal is to identify, categorize, and analyze the toxicity that now plagues social media platforms. This study focuses on classifying code-mixed Tamil text using deep learning algorithms. Tamil poses several obstacles for this NLP task. Because Tamil’s grammatical structure and language-specific features are unique and complex, it is hard to build a model that performs consistently on arbitrary Tamil data. The agglutinative nature of Tamil is a major problem for tasks such as classification, because context is distorted when a single word is split into its constituent morphemes. Many studies have applied deep learning algorithms to code-mixed text in other languages; this paper evaluates the effectiveness of XLNet and Bi-LSTM on code-mixed Tamil.

Metadata
Title
Classification of Code-Mixed Tamil Text Using Deep Learning Algorithms
Written by
R. Theninpan
P. Valarmathi
Copyright year
2024
DOI
https://doi.org/10.1007/978-3-031-50993-3_23
