A Hybrid Resume Parser and Matcher using RegEx and NER

Abstract: In a competitive job market, recruiters must identify suitable candidates within a very large pool of applicants. This research applies machine learning and deep learning techniques to speed up candidate selection. Our approach scores each resume using information extracted from it and presents the resumes in descending order of relevance to the job description, so that recruiters can prioritize whom to contact first. Two primary tasks are undertaken: information extraction and resume scoring. Information extraction uses a hybrid approach that combines rule-based methods with Named Entity Recognition (NER), a Natural Language Processing (NLP) technique. Key attributes such as Name, Phone number, Email ID, University, Experience, Skills, and Organization are extracted using a combination of RegEx and spaCy’s pre-trained transformer model. To calculate the score, we compute the cosine similarity between the candidate’s resume and the job description; this metric serves as a measure of the candidate’s suitability for the specified job role. We use Sentence Bidirectional Encoder Representations from Transformers (SBERT) to vectorize both the resume and the job description. The system achieves a parsing accuracy of 70%. In summary, our research offers an automated solution that makes candidate selection more efficient by ranking applicants according to their suitability, saving time for the organization and its recruiters.
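
The two tasks named in the abstract, rule-based extraction and cosine-similarity scoring, can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several things here are assumptions made for illustration: the regular expressions, the function names, and the use of simple word-count vectors in place of SBERT embeddings. The NER step (spaCy) is omitted entirely, so only the email and phone fields are extracted.

```python
import math
import re
from collections import Counter

# Hypothetical patterns for two of the rule-based fields; real resumes
# would likely need broader patterns than these.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{8,}\d")


def extract_contacts(text):
    """Return the first email and phone number found in text, or None."""
    email = EMAIL_RE.search(text)
    phone = PHONE_RE.search(text)
    return {
        "email": email.group(0) if email else None,
        "phone": phone.group(0) if phone else None,
    }


def bag_of_words(text):
    """Word-count vector; a stand-in for an SBERT embedding (assumption)."""
    return Counter(re.findall(r"[a-z0-9+#]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_resumes(job_description, resumes):
    """Score each resume against the job description, highest score first."""
    jd_vector = bag_of_words(job_description)
    scored = [
        (name, cosine_similarity(bag_of_words(text), jd_vector))
        for name, text in resumes.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    sample = {
        "candidate_a": "Python NLP engineer, email a@example.com, phone +91 98765 43210",
        "candidate_b": "Accountant with tax experience",
    }
    print(extract_contacts(sample["candidate_a"]))
    print(rank_resumes("Python NLP engineer", sample))
```

Replacing `bag_of_words` with a sentence-embedding model would bring the sketch closer to the described approach, since the ranking step depends only on having one vector per document.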

Keywords: Hybrid Resume Parsing, Text mining, Information extraction, Text Matching, Scoring, Named Entity Recognition

Published in: 2023 International Conference on Advances in Computation, Communication and Information Technology (ICAICCIT)

AUTHORS

Gurushanth Murthy G R


Dr. J B Simha


Professor and Chief Mentor – AI and CTO, ABIBA Systems

Dr. Rashmi Agarwal


Associate Professor
