
Development of a Resume Scoring Algorithm

Industry Professional Services

Specialization Or Business Function Human Resources (Job Applicant Scoring)

Technical Function Analytics (Predictive Modeling, Natural Language Processing)

Technology & Tools

WORK IN PROGRESS

Project Description

Background

We provide eRecruitment technology that our clients use to manage the workflow of recruiting new hires, including posting vacancies, providing online application forms, integrating recruitment tests and communicating with candidates.

For graduate hiring we have developed a structured online form that captures applicants’ biographical details, such as education scores (GPA, SAT, ACT, GMAT), work experience history, and leadership and other achievements. Our clients’ HR managers and recruiters use this information to shortlist a subset of applicants for interview.

The leadership and achievements section of the application form asks applicants to give details of up to ten important achievements or leadership experiences gained through their studies, work experience or extra-curricular activities. For each entry, candidates select one of nineteen categories (e.g. national sports award, leadership position in university society), enter the name of the award or position as free text, enter the organisation/society name in a separate free-text box, and select other details such as university name and country from prepopulated lists.

We have created a target list of achievements and leadership positions that recruiters consider prestigious or important. The free-text achievement data is pre-processed by regex matching against this large target set of academic, sporting and leadership achievements that are considered desirable in an applicant. Each matched achievement is given a descriptive meta-tag, for example “<sports award>”. In addition, some specific achievements are tagged with an occurrence category (pre-university, undergraduate studies, etc.), university name and/or country name.
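As an illustration of the regex pre-processing described above, here is a minimal Python sketch. The patterns, the tag names other than “<sports award>”, and the function name `tag_achievement` are hypothetical placeholders, not our actual target list.

```python
import re

# Hypothetical target list of (meta-tag, pattern) pairs. The real patterns
# would come from the recruiter-curated target set, which is much larger.
TARGET_PATTERNS = [
    ("<sports award>",
     re.compile(r"\b(varsity|all[- ]american|national champion)\b", re.IGNORECASE)),
    ("<society leadership>",
     re.compile(r"\b(president|chair|captain|treasurer)\b", re.IGNORECASE)),
    ("<academic award>",
     re.compile(r"\b(dean'?s list|valedictorian|summa cum laude)\b", re.IGNORECASE)),
]


def tag_achievement(free_text):
    """Return every meta-tag whose pattern matches the free-text entry."""
    return [tag for tag, pattern in TARGET_PATTERNS if pattern.search(free_text)]


# One entry can receive several tags; unmatched entries receive none.
rowing_tags = tag_achievement("Captain of the varsity rowing team")
deans_tags = tag_achievement("Dean's List 2014")
untagged = tag_achievement("Volunteer at local food bank")
```

Returning a list rather than a single tag keeps entries that match more than one target category, which may matter when weightings are later assigned per tag.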

 

In summary the data for each candidate usually includes the following:

Education scores:  University grade point average (GPA) for current course, pre-university national scores for the SAT or ACT tests.  

Work experience information: Each employment or work experience entry usually includes employer name, job title, start and end dates, duration, and a self-selected category (internship, work placement, permanent job, temporary job, etc.).

Leadership and achievements:  free text matched against a target list of achievements.

The data can be provided in csv or xlsx format, containing a mix of numerical and text data for approx 50,000 applicants.

 

 

Goals

 We need assistance in developing the following:

An algorithm that can be used to predict the likelihood of an applicant being offered an interview and being offered a job.

A scoring or weighting system that can rank candidates in order of likely interview success.

The development of weightings for particular types of achievements as flagged by the meta-tags e.g. university sports awards, national community service awards.

Recommendations on longer term big data approaches to this problem including skills, applications and technologies.

One potential complexity is that candidates do not always provide all relevant information. For example, education test scores such as the SAT may be missing, or the candidate may have studied at a non-US university that reports grades in a format other than GPA. Ideally, a solution should account for missing data, for example by not penalising applicants with an international education or qualifications.
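One possible way to avoid penalising missing scores, sketched below under stated assumptions, is to standardise each education measure among the applicants who reported it and then average only the measures each applicant actually has. The field names (`gpa`, `sat`, `act`) and function names are hypothetical, and this is only one of several options (imputation or a model that handles missing values natively are others).

```python
from statistics import mean, pstdev


def zscores(values):
    """Standardise non-missing values; keep None where data is missing."""
    present = [v for v in values if v is not None]
    if not present:
        return [None] * len(values)
    mu, sigma = mean(present), pstdev(present)
    return [
        None if v is None else ((v - mu) / sigma if sigma else 0.0)
        for v in values
    ]


def education_scores(applicants, fields=("gpa", "sat", "act")):
    """Average each applicant's available z-scores; None if nothing reported."""
    columns = {f: zscores([a.get(f) for a in applicants]) for f in fields}
    scores = []
    for i in range(len(applicants)):
        available = [columns[f][i] for f in fields if columns[f][i] is not None]
        scores.append(mean(available) if available else None)
    return scores


# Hypothetical applicants: the second has no SAT, the fourth reported nothing.
applicants = [
    {"gpa": 3.0, "sat": 1200},
    {"gpa": 4.0},
    {"gpa": 3.5, "sat": 1400},
    {},
]
scores = education_scores(applicants)
```

An applicant with no reported measures gets `None` rather than a low score, so the decision about how to rank them is left explicit instead of being hidden inside the arithmetic.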

Another complexity is that the algorithm should not adversely discriminate against applicants on the basis of gender or ethnicity.
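As a rough illustration of how adverse discrimination might be monitored, the sketch below compares shortlisting rates across groups and flags any group whose rate is below 80% of the highest rate. This "four-fifths" threshold is one commonly used screening heuristic, not a requirement stated in this brief; the group labels and data are invented.

```python
from collections import defaultdict


def selection_rates(records):
    """records: iterable of (group, shortlisted) pairs -> {group: rate}."""
    totals, selected = defaultdict(int), defaultdict(int)
    for group, shortlisted in records:
        totals[group] += 1
        selected[group] += int(shortlisted)
    return {g: selected[g] / totals[g] for g in totals}


def adverse_impact_ratios(records):
    """Each group's selection rate divided by the highest group's rate."""
    rates = selection_rates(records)
    best = max(rates.values())
    return {g: (r / best if best else 1.0) for g, r in rates.items()}


# Invented example: group A is shortlisted at 60%, group B at 30%.
records = (
    [("A", True)] * 6 + [("A", False)] * 4
    + [("B", True)] * 3 + [("B", False)] * 7
)
ratios = adverse_impact_ratios(records)
flagged = [g for g, r in ratios.items() if r < 0.8]
```

A check like this only detects disparities in outcomes; it does not explain them, so a flagged group would still need closer review of which inputs drive the difference.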

The broader aim of the project is to develop a transparent scoring system that can be used to rank and ultimately predict the employability of recent graduates.  A key feature is that the weightings or scores for individual achievements are transparent and can be displayed via our recruitment system to our clients’ recruitment managers.  Hence our approach of creating a list of targeted achievements. 
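To make the idea of transparent weightings concrete, here is a minimal sketch of an additive score in which every matched meta-tag contributes a visible weight. The weights, tag names (other than “<sports award>”) and function names are hypothetical; in practice the weights might be fitted from historical interview outcomes or set by recruiters.

```python
# Hypothetical per-tag weights, for illustration only.
TAG_WEIGHTS = {
    "<sports award>": 2.0,
    "<society leadership>": 3.0,
    "<academic award>": 2.5,
}


def score_candidate(tags, weights=TAG_WEIGHTS):
    """Return (total score, per-tag breakdown) so each weight can be displayed."""
    breakdown = [(tag, weights.get(tag, 0.0)) for tag in tags]
    return sum(w for _, w in breakdown), breakdown


def rank_candidates(candidates):
    """candidates: {candidate_id: [tags]} -> rows sorted by score, highest first."""
    scored = [(cid, *score_candidate(tags)) for cid, tags in candidates.items()]
    return sorted(scored, key=lambda row: row[1], reverse=True)


candidates = {
    "c1": ["<sports award>"],
    "c2": ["<society leadership>", "<academic award>"],
    "c3": [],
}
ranking = rank_candidates(candidates)
ranked_ids = [row[0] for row in ranking]
```

Because the score is a plain sum, the breakdown returned for each candidate is exactly what a recruitment manager would need to see to understand why one applicant ranks above another.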

However, we recognise that this may not be the best approach and would welcome advice on the infrastructure and applications we should use in the longer term, for example whether natural language processing and machine learning would be more appropriate. For such purposes we have a significantly larger dataset (approx. 10 million) of applicant CVs/resumes, in a less structured format, to analyse.

 

 

Deliverables

The ranking algorithm would need to be implemented within our proprietary system. The solution is likely to be written in Python or Perl. Alternatives may be acceptable subject to discussion with our Technical Director.

Milestones/deadline

We are looking for a working algorithm that we can implement during Q2 2016.

 

 

Please see attachment for structured achievement forms. 

 

 

Project Overview

  • Posted
    January 22, 2016
  • Preferred Location
    From anywhere
