Voice of Customer in Auto Industry


It is common practice in any industry to ask customers to review products and the associated services. Sometimes reviews come in the form of discussions on forums or web portals. For popular products, the number of customer reviews can run into the hundreds or thousands, which makes it difficult for a company to evaluate them and gauge overall feedback and customer satisfaction. Identifying customer requirements by analyzing opinions, comments and reviews has therefore been a prime focus of many industries. Prior work in this area includes sentiment analysis using product review data [8], opinion mining and sentiment analysis [9], a product weakness finder [10], sentiment analysis as a multi-faceted problem [11], and comparative experiments on sentiment classification for online product reviews [12]; these studies form the basis of our work. Providing recommendations to the car industry based on insights mined from a web portal makes our project one of a kind, as the car industry has so far not been covered by any of the available studies. This project focuses on the Indian auto industry, which can benefit immensely from identifying the features that customers do or do not expect in a vehicle. The project aims to provide such recommendations by analyzing the structured and unstructured text data available on the Team BHP portal. This would help car manufacturers design future products better and upgrade existing products to meet customers' needs.

We start by extracting text data from TeamBhp.com as is. We then clean the data by removing punctuation and stop words. The cleaned text is tokenized by splitting it into individual words, lemmatization and stemming are performed on the tokens, and the results are visualized in a word cloud and a word frequency table. We then convert the data into a bag-of-words (BOW) representation and move on to experimental modeling with latent Dirichlet allocation (LDA) to build a better model. Finally, we extract the review sentiments using a TextBlob-based unsupervised algorithm and score the data for customer likes and dislikes.
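The cleaning, tokenization, word-frequency and bag-of-words steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the stop-word list, the two sample reviews and the function names `clean_and_tokenize` and `bag_of_words` are hypothetical and chosen only for illustration. Only the Python standard library is used, so lemmatization, stemming, LDA and TextBlob scoring are left out.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (assumption for illustration only).
STOP_WORDS = {"the", "is", "a", "an", "and", "but", "of", "in", "it", "to"}


def clean_and_tokenize(text):
    """Lowercase the text, replace punctuation with spaces, split into
    words and drop stop words."""
    text = text.lower()
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    return [tok for tok in text.split() if tok not in STOP_WORDS]


def bag_of_words(docs):
    """Return (vocabulary, vectors), where each vector holds the count of
    every vocabulary word in the corresponding document."""
    tokenized = [clean_and_tokenize(doc) for doc in docs]
    vocabulary = sorted({tok for toks in tokenized for tok in toks})
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append([counts[word] for word in vocabulary])
    return vocabulary, vectors


# Made-up sample reviews, not taken from the portal.
reviews = [
    "The mileage is great, and the ride is smooth!",
    "Great engine, but the mileage is poor.",
]

vocabulary, vectors = bag_of_words(reviews)

# Corpus-wide word frequency table, the kind of data a word cloud is drawn from.
word_frequency = Counter(tok for r in reviews for tok in clean_and_tokenize(r))
```

In this sketch each review becomes a row of counts over a shared, alphabetically sorted vocabulary. A document-term matrix of this shape is the usual input to a topic model such as LDA.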

Published in:

Proceedings of the Seventh International Conference on Business Analytics and Intelligence, December 2019, IISc, India.


Shewatha Arul

Sudeep Matthew

Anand Limbare

Saumyadip Sarkar

Sneha Tiwari
