An Innovative Hybrid Recommendation System Integrating Content-Based
Abstract:
The rapid growth in data generated by modern technology has ushered in the era of big data. This volume of data has, however, created the problem of information overload. On the positive side, the same data can be put to use in building efficient recommendation systems. As technology advances and the volume of data generated each day keeps rising, the demand for more accurate and efficient recommendation systems becomes more pressing. A high-performing recommendation system improves the quality of search: it predicts the preferences and interests of users and surfaces the movies most relevant to each of them.
In this research paper, we propose a novel hybrid movie recommendation system that combines content-based and collaborative-based filtering techniques (Li & Han, 2021) while integrating Term Frequency-Inverse Document Frequency (TF-IDF) (Hu et al., 2016) and cosine similarity measures. Content-based recommendation systems offer recommendations based on the user’s past consumption and interaction history. In contrast, collaborative filtering recommends by matching similarities among users and among movies: it looks at the characteristics of the users and of the movies they have previously watched or searched for, and then computes a measure of similarity between users. The main objective of this research is to enhance the accuracy and effectiveness of recommendation systems by overcoming the limitations of the individual approaches and addressing the notorious cold-start problem.
The proposed hybrid recommendation system uses content-based filtering to analyze user preferences and item characteristics, drawing on the user’s past consumption history. TF-IDF is then used to weigh the importance of terms in movie descriptions, capturing semantic information on each item (in our case, each movie) more effectively. In addition, the system incorporates collaborative-based filtering to include user-item interaction patterns and user similarities. This is achieved by creating two matrices. First, we construct the User-Item rating matrix: a matrix of the movies each viewer has rated, liked, or searched. Second, we construct a User-User Item matrix by determining movies that are similar to the movies the user has rated, liked, or searched before. The next step is to compute cosine similarity scores to establish correlations (relationships) between the entities of each matrix individually. This cosine similarity metric quantifies the likeness between users and between items, which enables more precise and personal recommendations. The final step is to rank candidates by these correlations and recommend those with the highest scores.
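The pipeline described above can be illustrated with a minimal pure-Python sketch. It is not the implementation used in this work: the whitespace tokenizer, the smoothed IDF variant, the similarity-weighted scoring rule, and every name in it (`tfidf_vectors`, `cosine_similarity`, `recommend_for_user`, the sample titles and users) are assumptions chosen for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse {term: weight} vector per document."""
    # Assumption: plain lowercase whitespace tokenization.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            # Term frequency times a smoothed IDF (one common variant).
            term: (count / total) * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(a, b):
    """Cosine similarity between two sparse {key: value} vectors."""
    dot = sum(weight * b.get(key, 0.0) for key, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend_for_user(user, ratings, top_n=3):
    """Rank unseen items by the similarity-weighted ratings of other users."""
    weighted_sum = {}
    similarity_sum = {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine_similarity(ratings[user], other_ratings)
        if sim <= 0:
            continue  # users with no overlap contribute nothing
        for item, rating in other_ratings.items():
            if item in ratings[user]:
                continue  # skip items the user has already rated
            weighted_sum[item] = weighted_sum.get(item, 0.0) + sim * rating
            similarity_sum[item] = similarity_sum.get(item, 0.0) + sim
    ranked = sorted(
        weighted_sum,
        key=lambda item: weighted_sum[item] / similarity_sum[item],
        reverse=True,
    )
    return ranked[:top_n]


# Hypothetical toy data: movie descriptions and a user-item rating matrix.
descriptions = [
    "space horror crew alien ship",
    "space survival astronaut orbit",
    "romance love letters summer",
]
ratings = {
    "ana": {"Alien": 5, "Gravity": 4},
    "ben": {"Alien": 5, "Gravity": 5, "Moon": 4},
    "cat": {"Notebook": 5, "Titanic": 4},
}

vectors = tfidf_vectors(descriptions)
print(cosine_similarity(vectors[0], vectors[1]))  # content-based item similarity
print(recommend_for_user("ana", ratings))         # collaborative recommendation
```

The same `cosine_similarity` function serves both halves of the sketch: applied to TF-IDF vectors it compares movies by description, and applied to rows of the rating matrix it compares users by rating behavior.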
To tackle the cold-start problem, in which historical interaction data is lacking for new users or items, we propose a hybrid cold-start strategy. The recommendation system combines content-based and collaborative-based techniques to generate targeted preliminary recommendations for cold-start users or items. As interactions accumulate, the recommendations are gradually personalized using collaborative filtering insights.
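One way such a gradual shift could be realized is sketched below. The abstract does not specify the blending rule, so the weighting formula n / (n + k), the constant `k`, and the function names `collaborative_weight` and `hybrid_scores` are all hypothetical.

```python
def collaborative_weight(n_interactions, k=10):
    """Weight given to collaborative scores; grows from 0 toward 1.

    Assumption: weight = n / (n + k), where k is a hypothetical constant
    controlling how quickly collaborative filtering takes over.
    """
    if n_interactions < 0:
        raise ValueError("n_interactions must be non-negative")
    if k <= 0:
        raise ValueError("k must be positive")
    return n_interactions / (n_interactions + k)


def hybrid_scores(content_scores, collab_scores, n_interactions, k=10):
    """Blend per-item content-based and collaborative scores for one user."""
    alpha = collaborative_weight(n_interactions, k)
    items = set(content_scores) | set(collab_scores)
    return {
        # Items missing from one source are treated as scoring 0.0 there.
        item: (1 - alpha) * content_scores.get(item, 0.0)
        + alpha * collab_scores.get(item, 0.0)
        for item in items
    }


# A brand-new user (0 interactions) is scored purely on content.
print(hybrid_scores({"A": 1.0}, {"A": 0.0, "B": 1.0}, n_interactions=0))
```

Under this rule a user with no history receives purely content-based scores, and the collaborative share reaches one half once the user has `k` interactions.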
The effectiveness and performance of the proposed hybrid recommendation system are evaluated on a real-world dataset, the MovieLens dataset, and compared against traditional content-based and collaborative-based approaches. Experimental results demonstrate that our hybrid system achieves superior recommendation accuracy, offering users more personalized and relevant suggestions while effectively mitigating the cold-start problem.
By combining content-based and collaborative-based techniques with TF-IDF and cosine similarity, our research contributes to the development of more sophisticated and user-centric recommendation systems. Because it targets the cold-start problem directly, our hybrid strategy is designed so that new users and items receive relevant suggestions, promoting user engagement and satisfaction.
In conclusion, our research demonstrates the effectiveness of a hybrid approach that combines content-based and collaborative-based filtering techniques, incorporating TF-IDF and cosine similarity measures. This novel recommendation system represents a significant advancement in the field of movie recommendations, providing users with enhanced movie suggestions and overcoming challenges posed by the cold-start problem. As the era of big data continues to evolve, the development of innovative recommendation systems will play a crucial role in making the most of the enormous amount of information available and improving the overall user experience in a personalized manner.
Keywords: Movie Recommendation System, Content-Based Filtering, Collaborative Filtering, Hybrid Recommendation System, TF-IDF, Cosine Similarity, Cold-Start Problem, Correlation.
Conference Name: 10th International Conference on Business Analytics and Intelligence (2023- ICBAI)