Identifying Kannada and English Code Switch Text
Abstract
With the widespread use of social media and smart devices, users routinely post comments and reviews about events. These comments may be monolingual or bilingual, that is, code-switched (CS). Code-switched text is now common on social media alongside monolingual text, and identifying it is an important step for emotion detection and sentiment analysis. In this paper, we focus on the problem of identifying Kannada-English code-switched text using supervised classification techniques. We apply Logistic Regression (LR), Support Vector Machine (SVM), Random Forest (RF), Neural Network (NN), and Naïve Bayes (NB) classifiers. The experimental results show that Naïve Bayes and Logistic Regression are more accurate than the other classifiers, and that SVM is the least accurate on our data set.
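As a rough illustration of the kind of supervised approach the abstract describes, the sketch below tags each word as Kannada or English with a small Naïve Bayes classifier and flags a sentence as code-switched when both labels occur. The abstract does not state the features, the classification granularity, or the data used, so everything here is an assumption: the word-level tagging, the character-bigram features, the names `NaiveBayesWordTagger` and `is_code_switched`, and the toy romanized word lists, which are hypothetical and not drawn from the paper's data set.

```python
import math
import re
from collections import Counter

# Hypothetical toy lexicon (romanized Kannada vs. English words).
# Illustrative only; this is NOT the data set used in the paper.
TRAIN = {
    "kn": ["tumba", "chennagide", "nanage", "beku",
           "illa", "hogu", "maneyalli", "oota"],
    "en": ["movie", "very", "good", "the",
           "was", "phone", "battery", "nice"],
}


def char_bigrams(word):
    """Return the character bigrams of a word, with ^ and $ as boundaries."""
    padded = f"^{word.lower()}$"
    return [padded[i:i + 2] for i in range(len(padded) - 1)]


class NaiveBayesWordTagger:
    """Multinomial Naive Bayes over character bigrams, add-one smoothing."""

    def fit(self, words_by_label):
        total_words = sum(len(ws) for ws in words_by_label.values())
        self.log_prior = {}
        self.counts = {}
        for label, words in words_by_label.items():
            self.log_prior[label] = math.log(len(words) / total_words)
            self.counts[label] = Counter(
                g for w in words for g in char_bigrams(w)
            )
        # Number of distinct bigrams seen in training, used for smoothing.
        self.vocab_size = len(set().union(*self.counts.values()))
        return self

    def log_score(self, word, label):
        """Log prior plus smoothed log-likelihood of the word's bigrams."""
        counts = self.counts[label]
        denom = sum(counts.values()) + self.vocab_size
        return self.log_prior[label] + sum(
            math.log((counts[g] + 1) / denom) for g in char_bigrams(word)
        )

    def predict(self, word):
        """Return the label with the highest log score for this word."""
        return max(self.counts, key=lambda label: self.log_score(word, label))


def is_code_switched(sentence, tagger):
    """Flag a sentence as code-switched if its words get more than one label."""
    tokens = re.findall(r"[A-Za-z]+", sentence)
    return len({tagger.predict(t) for t in tokens}) > 1


tagger = NaiveBayesWordTagger().fit(TRAIN)
for text in ["movie tumba chennagide", "the movie was good"]:
    print(text, "->", is_code_switched(text, tagger))
```

With a lexicon this small the tagger is only reliable on words that resemble its training items. A realistic system would train on an annotated corpus and could swap in any of the other classifiers named in the abstract.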
Published in:
Proceedings of the Seventh International Conference on Business Analytics and Intelligence, December 2019, IISc, India.