Development of Analytical Data Mart and Data Pipeline for Recruitment Analytics



Business research is producing new knowledge at an astounding rate, yet it remains fragmented and heterogeneous. This makes it challenging to stay up to date, remain at the forefront of research, and interpret findings in context within a particular area of business research. Because of this, the literature review is more crucial than ever as a research strategy. Traditional literature reviews often lack thoroughness and rigour and are conducted ad hoc rather than according to a predetermined methodology. As a result, the validity and reliability of such reviews may be questioned. Although Systematic Literature Reviews (SLRs) are valuable for pinpointing research needs across disciplines, conducting an SLR manually is a challenging, multi-stage, and time-consuming process. The primary goal of this research is to take reference papers, summarize them against existing research papers in the same area, and generate a summary of the text that avoids duplication. Text was taken from different papers, and regular expressions were then used to clean the collected data of unnecessary words and punctuation marks. Different NLP techniques (TextRank, LexRank, LSA, Luhn, KL-Sum, BERT, GPT-2) were applied to the cleaned text for summarization. Of these techniques, GPT-2 and BERT were observed to give the best results. On a selected topic, different research papers were explored. The literature reviews of these papers were extracted and summarized as a reference for new researchers in that field. In this way, the model reduces the time researchers spend on review and gives them an overview of the research conducted on the topic in recent years.
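The clean-then-summarize flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the regular expressions, the stop-word list, and the function names `clean_text`, `split_sentences`, and `summarize` are all assumptions made for illustration, and the scoring is a simple word-frequency heuristic in the spirit of frequency-based extractive methods, standing in for the listed techniques (TextRank, LexRank, LSA, Luhn, KL-Sum, BERT, GPT-2).

```python
import re
from collections import Counter
from typing import List

# Hypothetical stop-word list, kept tiny for illustration only.
STOP_WORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "it",
              "for", "on", "are", "that", "this", "with", "as", "by"}

def clean_text(raw: str) -> str:
    """Strip numeric citation markers and stray symbols, then collapse whitespace."""
    text = re.sub(r"\[\d+(?:\s*,\s*\d+)*\]", " ", raw)      # e.g. [3] or [1, 2]
    text = re.sub(r"[^A-Za-z0-9.,;:!?'\-\s]", " ", text)    # keep a basic character set
    return re.sub(r"\s+", " ", text).strip()

def split_sentences(text: str) -> List[str]:
    """Naive sentence splitter: break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def _tokens(text: str) -> List[str]:
    """Lowercased word tokens with stop words removed."""
    return [w for w in re.findall(r"[a-z0-9']+", text.lower()) if w not in STOP_WORDS]

def summarize(raw: str, n_sentences: int = 2) -> str:
    """Return the n highest-scoring sentences, in their original order."""
    sentences = split_sentences(clean_text(raw))
    freq = Counter(_tokens(" ".join(sentences)))

    def score(sentence: str) -> float:
        toks = _tokens(sentence)
        # Average corpus frequency of the sentence's content words.
        return sum(freq[t] for t in toks) / len(toks) if toks else 0.0

    top = sorted(range(len(sentences)),
                 key=lambda i: score(sentences[i]),
                 reverse=True)[:n_sentences]
    return " ".join(sentences[i] for i in sorted(top))
```

Calling `summarize(text, n_sentences=3)` on the concatenated literature-review text of several papers would return the three sentences whose content words occur most often across that text. A transformer-based summarizer such as BERT or GPT-2 would replace only the scoring step; the regex cleaning stage would stay the same.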


Keywords: Text Summarization, Transformer, Research Paper, Extractive Summarization, NLP


Journal Name: Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering


Ashish Chandra Jha

Sanjeev Kumar Jha

J B Simha
