{"id":1097,"date":"2019-02-15T10:31:58","date_gmt":"2019-02-15T10:31:58","guid":{"rendered":"http:\/\/kusuaks7\/?p=702"},"modified":"2023-08-08T11:53:08","modified_gmt":"2023-08-08T11:53:08","slug":"predictive-analytics-in-hr","status":"publish","type":"post","link":"https:\/\/www.experfy.com\/blog\/bigdata-cloud\/predictive-analytics-in-hr\/","title":{"rendered":"Predictive Analytics in HR"},"content":{"rendered":"<h3>Introduction<\/h3>\n<p>My 10+ years professional track record within the European Institutions have given me the possibility to participate in major projects as the eParliament Program, MEMMOIRE, and a number of inter-institutional projects. These\u00a0experiences helped me grow my skills in business process analysis, technical writing, and agile prototyping.<\/p>\n<p>Driven by my passion for Business Intelligence, Machine Learning, and Data Mining, I was mostly involved in projects where I was able to implement predictive analytics and build a number of custom-developed Business Intelligence solutions that are widely used within the European Institutions.<\/p>\n<p>Through these blog writings I hope to share my passion for Data Mining as well as some\u00a0business and technical insights I have gathered throughout my career.<\/p>\n<p>In this first article I will discuss the role of predictive analytics in Human Resources while explaining some of the basics of analytics and data mining. Human Resources is a field where predictive analytics can add enormous value.<\/p>\n<h3>Why HR Needs Analytics<\/h3>\n<p>All organisations hire employees and subcontractors. These employees work and evolve within a company, passing through different phases of their work and personal life. Employees have relationships and communication patterns and they\u00a0engage in different activities that we can use to learn about them.<\/p>\n<p>During the recruitment stage the candidate takes a psychoanalytic test. This test will try to describe the person\u2019s profile and how he or she will perform within the organisation.<\/p>\n<p>Once the candidate is hired, his\/her manager will prepare yearly or quarterly reports on the progress of that individual. This would show how well the employee performs in executing the tasks assigned to him. Ideally these reports would be compared to the initial profile to see\u00a0if expectations are met.<\/p>\n<p>As businesses evolve it is a challenge to keep employees motivated and prevent them from leaving. Employees need to be motivated and challenged as they go through different phases of their careers and personal life. The Hackett Group says the cost to replace someone is between 80 and 120 percent of an employee\u2019s fully loaded costs (in other words, total pay plus benefits and overheads). It is clear that talent is the biggest asset of a company and career guidance and talent retention are key values to business success.<\/p>\n<h3>An Overview\u00a0of the Methodology<\/h3>\n<p>Our approach is to pinpoint the profiles and the points in time in which employees have a tendency to lose motivation. We will try to find patterns in our data and draft a description of employees that need attention.<\/p>\n<p>Let\u2019s imagine we have a career guidance unit within our organization that helps employees during these phases of their career. Every meeting between the employee and the career guidance unit is well documented and encoded in an internal system. This gives us the possibility to craft a 360\u00b0 profile of a \u2018customer\u2019 of the unit. Once this 360\u00b0 profile is defined we can use it to find potential \u2018customers\u2019 and take necessary actions early on.<\/p>\n<p>Let\u2019s first start by a short description of the different types of statistics.<\/p>\n<p><strong>Descriptive<\/strong> statistics will show a snapshot at a given time period and ideally we can compare them\u00a0with a past situation to describe evolution.<\/p>\n<p>In <strong>diagnostic<\/strong> statistics we can search for relations in the data and\u00a0try to explain tendencies. Using obtained knowledge and existing data we can build a model and make <strong>predictions<\/strong>. We can predict the success rate of a candidate, the evolution of a customer, the amount of demand,\u00a0etc.<\/p>\n<p>Statistical reports are usually based on a static model created upon business request. Reports are generated on a regular basis to be presented to the parties concerned. New report creation is usually time and labor consuming.<\/p>\n<h3>Description of the Tools<\/h3>\n<p>One way of solving the static reports issue is using <strong>OLAP<\/strong> (online analytical processing) cubes. There are a number of open source projects implementing OLAP cubes, such as OLAP4J and Mondrian.<\/p>\n<p>Mondrian is an engine for analytics. It accepts analytical queries and converts them into relational queries, returning the data in a form that supports analytics. But for Mondrian to be useful to business users, it needs some sort of interface and application to run it. There are a number of common ways that Mondrian can be deployed. It can be set up in an application server to run on its own and provide services. This approach tends to not be very user-friendly because it doesn\u2019t have a nice graphical user interface. Mondrian can also be embedded in a standalone application.<\/p>\n<p>Mondrian works with multidimensional <strong>cubes<\/strong>. A cube is a collection of dimensions and measures, all centered on a <strong>fact table<\/strong>. A cube is the context for a report or interactive analysis. A fact table holds the transactions and the relations to the other tables. In our case a <strong>fact table<\/strong> represents the <strong>appointments<\/strong> and the employee\u00a0attributes such as <strong>age<\/strong>, <strong>gender<\/strong>, <strong>position<\/strong>, <strong>education,<\/strong> etc.<\/p>\n<p>Cubes are composed of a <strong>Schema<\/strong>, <strong>Measures<\/strong> and <strong>Dimensions<\/strong>. <strong>Measures<\/strong> are the values used for the analysis, such as quantity ordered, sales, inventory, number of website visits, etc. <strong>Dimensions<\/strong> are attributes about the data for analysis, such as time (Yearly, Quarterly, and Monthly), geography (Branches, Location), employee demographics (Age, Gender) and so on.<\/p>\n<p>Cubes tend to be a complete set of measures and attributes for doing an analysis on the set of events. For instance, if you\u2019re interested in meetings by employee attributes, you might want to look at the number of meetings (events or measures) by employee\u2019s age (attributes). A cube collects these things into one place, ready for analyzing and querying.<\/p>\n<p>The illustration bellow shows the use of OLAP system.<\/p>\n<p>&nbsp;<\/p>\n<h3>Solution<\/h3>\n<p>In order to create a 360\u00b0 profile of our &#8220;employee customers&#8221; we will identify the <strong>measures<\/strong> and the <strong>dimensions<\/strong> that correspond to our employees&#8217;\u00a0<strong>attributes<\/strong>. For that matter we need to find as much information as we can from every employee who\u00a0visited the career guidance unit.\u00a0 These tables will usually\u00a0hold a large amount of data and\u00a0tend to become computationally expensive. Furthermore, you risk encountering performance issues once working with real-life data.<\/p>\n<p>It is recommended that you create Materialized Views and refresh those using jobs on a regular basis. The Materialized View will represent the data as a flat table that we will use as a fact table.<\/p>\n<p>For our example we will create a fact table called <strong>employee<\/strong>. Attributes such as <strong>marital status<\/strong>, <strong>gender<\/strong>, <strong>management role<\/strong> will be used to create the dimensions of the cube. \u00a0In the real world you will need to identify the respective tables that hold the relative information and join them.<\/p>\n<p>&nbsp;<\/p>\n<p>Now that we have our fact table, let&#8217;s go ahead and create the cube holding the dimensions and measures.\u00a0 Our cube named HR will be constructed from\u00a0the Dimension <strong>Employee<\/strong> with the attributes : <strong>Manager Id, Employee Id, Store Id, Marital Status, Gender, Salary, Education Level,<\/strong> etc.<\/p>\n<p>The cube is created using XML mark-up and looks like the image below. The attributes of the dimension reference to the respective columns of the table employee:<\/p>\n<p>&nbsp;<\/p>\n<p>The attribute <strong>Manager Id<\/strong> maps to the employee table with the <strong>supervisor_id<\/strong> to find the Manager of the given Employee. As you can see, the name appearing on the report can be different from the column name. This gives you the liberty to provide names that are easily understood by the business users.<\/p>\n<p>Within the dimensions, data can be viewed by level, such as department, store, and time. Once deployed on the OLAP, application users can drag the dimensions and measures to the canvas. There\u2019s no need to understand the structure of the database or use a query language to do analysis.<\/p>\n<p>&nbsp;<\/p>\n<p>For now we will introduce the measure <strong>Count<\/strong>\u00a0which\u00a0represents the number of employees who\u00a0have been in touch with our career guidance unit. \u00a0The Count measure is created by an aggregator that will count the number of <strong><em>employee_id<\/em><\/strong> in our table. Since\u00a0<strong><em>employee_id<\/em><\/strong> is unique, we are confident that the measure will provide the correct number of employees.<\/p>\n<p>&nbsp;<\/p>\n<p>Now that we have our first <strong>Dimension<\/strong> and <strong>Measure<\/strong> let&#8217;s go ahead and see what we can achieve using OLAP.<\/p>\n<p>We would like to structure the data so that we can have an overview of the number of employees grouped by Gender and Positon.\u00a0Thus, we can answer questions like:<\/p>\n<ul>\n<li>How many male Store Managers have been in touch with the Carreer guidance unit?<\/li>\n<\/ul>\n<p>To achieve this we drag the Measure,\u00a0<strong>Count,<\/strong> into the canvas. We do the same for the Dimensions: <strong>Positions Title<\/strong> and <strong>Gender<\/strong>.<\/p>\n<p>&nbsp;<\/p>\n<p>The data is retrieved from the table Employee and grouped by the Attributes\u00a0<strong>Gender<\/strong> and <strong>Position Title<\/strong>. Now we have the number of Men and Women and their respective\u00a0roles within the organisation, all of whom\u00a0have visited our career guidance unit.<\/p>\n<p>&nbsp;<\/p>\n<p>As you can see it is really easy to handle your data and structure it as you find fit. Let\u2019s go ahead and add more attributes from the Dimension E<strong>mployee<\/strong> to the grid. We are going to drag the <strong>Education Level<\/strong> and the <strong>Marital Status<\/strong> (M=Married, S=Single) to the grid. The grid and chart generated gives a nice summary of the profile of the employees, so we can start to build a basic idea of the profile of our potential &#8220;customers.&#8221;\u00a0\u00a0The result can be seen in the image below.<\/p>\n<p>&nbsp;<\/p>\n<p>We can easily swap the axes to represent the same grid from another perspective.<\/p>\n<p>&nbsp;<\/p>\n<p>After adding the <strong>Pay Type<\/strong> and <strong>Management Role<\/strong> to the grid we can start to see some tendencies in our data. For instance there are twice as many female, married\u00a0VP Country Managers as their male counterparts\u00a0who\u00a0visited our career guidance unit. Thus, it&#8217;s obvious that\u00a0before making conclusions, one would need to account for this finding.<\/p>\n<p>&nbsp;<\/p>\n<p>At this point it would be interesting to know how tendencies evolve over\u00a0time. We will need to create the Dimension <strong>Time<\/strong>, which requires\u00a0us\u00a0to translate the Date field into Years, Quarters, Months and Days. We could create separate columns based on the meeting date.<\/p>\n<p><span style=\"color: #3366ff;\">\u00a0\u00a0\u00a0 <\/span><span style=\"color: #0000ff;\">TO_CHAR<\/span> (meeting.date, <span style=\"color: #ff0000;\">&#8216;YYYY&#8217;<\/span>) <span style=\"color: #0000ff;\">AS<\/span> year,<\/p>\n<p><span style=\"color: #0000ff;\">\u00a0\u00a0\u00a0 TO_CHAR<\/span> (meeting.date, <span style=\"color: #ff0000;\">&#8216;MM&#8217;<\/span>) <span style=\"color: #0000ff;\">AS<\/span> month,<\/p>\n<p><span style=\"color: #0000ff;\"> TO_CHAR<\/span> (meeting.date, <span style=\"color: #ff0000;\">&#8216;DD&#8217;<\/span>) <span style=\"color: #0000ff;\">AS<\/span> day,<\/p>\n<p><span style=\"color: #0000ff;\"> TO_CHAR<\/span> (meeting.date, <span style=\"color: #ff0000;\">&#8216;W&#8217;<\/span>) <span style=\"color: #0000ff;\">AS<\/span> week,<\/p>\n<p>Another way is to create a separate table holding the fields time dimension attributes (Year, Quarter, Month) and reference them using the &#8220;<span style=\"color: #008000;\">time_id<\/span>&#8220;.<\/p>\n<p><span style=\"color: #a52a2a;\"><strong>CREATE<\/strong><\/span> CACHED <span style=\"color: #a52a2a;\"><strong>TABLE<\/strong><\/span> PUBLIC.&#8221;<span style=\"color: #008000;\">time_by_day<\/span>&#8220;(<\/p>\n<p><span style=\"color: #008000;\">&#8220;time_id&#8221;<\/span> <span style=\"color: #0000ff;\"><strong>INT<\/strong><\/span> <span style=\"color: #a52a2a;\"><strong>NOT<\/strong><\/span> <span style=\"color: #0000ff;\"><strong>NULL<\/strong><\/span>,<\/p>\n<p><span style=\"color: #008000;\">&#8220;the_date&#8221;<\/span> DATETIME <span style=\"color: #a52a2a;\"><strong>DEFAULT<\/strong><\/span> <span style=\"color: #0000ff;\"><strong>NULL<\/strong><\/span>,<\/p>\n<p><span style=\"color: #008000;\">&#8220;the_day&#8221;<\/span> <span style=\"color: #a52a2a;\"><strong>VARCHAR<\/strong><\/span>(30) <span style=\"color: #a52a2a;\"><strong>DEFAULT<\/strong><\/span> <span style=\"color: #0000ff;\"><strong>NULL<\/strong><\/span>,<\/p>\n<p><span style=\"color: #008000;\">&#8220;the_month&#8221;<\/span> <span style=\"color: #a52a2a;\"><strong>VARCHAR<\/strong><\/span>(30) <span style=\"color: #a52a2a;\"><strong>DEFAULT<\/strong><\/span> <span style=\"color: #0000ff;\"><strong>NULL<\/strong><\/span>,<\/p>\n<p><span style=\"color: #008000;\">&#8220;the_year&#8221;<\/span> <span style=\"color: #0000ff;\"><strong>SMALLINT<\/strong><\/span><span style=\"color: #a52a2a;\"> <strong>DEFAULT<\/strong><\/span> <span style=\"color: #0000ff;\"><strong>NULL<\/strong><\/span>,<\/p>\n<p><span style=\"color: #008000;\">&#8220;day_of_month&#8221;<\/span> <span style=\"line-height: 20.7999992370605px; color: #0000ff;\"><strong>SMALLINT<\/strong><\/span><span style=\"line-height: 20.7999992370605px; color: #a52a2a;\">\u00a0<strong>DEFAULT<\/strong><\/span><span style=\"line-height: 20.7999992370605px;\">\u00a0<\/span><span style=\"line-height: 20.7999992370605px; color: #0000ff;\"><strong>NULL<\/strong><\/span>,<\/p>\n<p><span style=\"color: #008000;\">&#8220;week_of_year&#8221;<\/span> <span style=\"line-height: 20.7999992370605px; color: #0000ff;\"><strong>INT<\/strong><\/span><span style=\"line-height: 20.7999992370605px;\">\u00a0<\/span><strong style=\"color: #a52a2a; line-height: 20.7999992370605px;\">DEFAULT<\/strong><span style=\"line-height: 20.7999992370605px;\">\u00a0<\/span><span style=\"line-height: 20.7999992370605px; color: #0000ff;\"><strong>NULL<\/strong><\/span>,<\/p>\n<p><span style=\"color: #008000;\">&#8220;month_of_year&#8221;<\/span> <span style=\"line-height: 20.7999992370605px; color: #0000ff;\"><strong>SMALLINT<\/strong><\/span><span style=\"line-height: 20.7999992370605px; color: #a52a2a;\">\u00a0<strong>DEFAULT<\/strong><\/span><span style=\"line-height: 20.7999992370605px;\">\u00a0<\/span><span style=\"line-height: 20.7999992370605px; color: #0000ff;\"><strong>NULL<\/strong><\/span>,<\/p>\n<p><span style=\"color: #008000;\">&#8220;quarter&#8221;<\/span> <span style=\"line-height: 20.7999992370605px; color: #a52a2a;\"><strong>VARCHAR<\/strong><\/span><span style=\"line-height: 20.7999992370605px;\">(30)\u00a0<\/span><span style=\"line-height: 20.7999992370605px; color: #a52a2a;\"><strong>DEFAULT<\/strong><\/span><span style=\"line-height: 20.7999992370605px;\">\u00a0<\/span><span style=\"line-height: 20.7999992370605px; color: #0000ff;\"><strong>NULL<\/strong><\/span>,<\/p>\n<p><span style=\"color: #008000;\">&#8220;fiscal_period&#8221;<\/span> <span style=\"line-height: 20.7999992370605px; color: #a52a2a;\"><strong>VARCHAR<\/strong><\/span><span style=\"line-height: 20.7999992370605px;\">(30)\u00a0<\/span><span style=\"line-height: 20.7999992370605px; color: #a52a2a;\"><strong>DEFAULT<\/strong><\/span><span style=\"line-height: 20.7999992370605px;\">\u00a0<\/span><span style=\"line-height: 20.7999992370605px; color: #0000ff;\"><strong>NULL<\/strong><\/span>);<\/p>\n<p>&nbsp;<\/p>\n<p>We can now add the Dimension <b>Time<\/b>\u00a0in our HR Cube. We reference to the time_by_day table with type=\u2019TIME\u2019 and attributes Year, Quarter and so on.<\/p>\n<p>&nbsp;<\/p>\n<p>The Dimension <strong>Time<\/strong> is represented on the panel as such:<\/p>\n<p>&nbsp;<\/p>\n<p>Let\u2019s group the grid by year and quarters. For the sake of simplicity the image shows only Q1, Q2, and year 1998.<\/p>\n<p>&nbsp;<\/p>\n<h3>Conclusion<\/h3>\n<p>What have we achieved so far?<\/p>\n<p>We created the OLAP HR Cube that holds the dimensions and measures needed to create a 360\u00b0 profile of employees needing career guidance. We are now able to create reports and dashboards using drag and drop. We are able to structure the data and discover tendencies. We can generate reports on the fly that are easily readable by the business.\u00a0In the next blog post, I will explain how to make predictions based on the 360\u00b0 profile we just created.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>An introduction to the role of analytics in HR departments along with some examples of analytics and data mining in the area.&nbsp;<\/p>\n","protected":false},"author":17,"featured_media":23863,"comment_status":"open","ping_status":"open","sticky":false,"template":"single-post-2.php","format":"standard","meta":{"content-type":"","footnotes":""},"categories":[187],"tags":[153],"ppma_author":[2427],"class_list":["post-1097","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-bigdata-cloud","tag-hr-analytics"],"authors":[{"term_id":2427,"user_id":17,"is_guest":0,"slug":"ivan-jacobs","display_name":"Ivan Jacobs","avatar_url":"https:\/\/secure.gravatar.com\/avatar\/?s=96&d=mm&r=g","user_url":"","last_name":"Jacobs","first_name":"Ivan","job_title":"","description":"Ivan has over 15 years of experience in software development. He has been working with the European Commission and the European Parliament for the past 9 years and has extensive experience in the areas of Machine Learning, Oracle Data Mining Packages, Spark, Hadoop, Opencv, Openimaj, Spring source, Map Reduce, NoSQL and Relational Databases. He is based in Belgium."}],"_links":{"self":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/1097","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/users\/17"}],"replies":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/comments?post=1097"}],"version-history":[{"count":5,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/1097\/revisions"}],"predecessor-version":[{"id":30027,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/1097\/revisions\/30027"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/media\/23863"}],"wp:attachment":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/media?parent=1097"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/categories?post=1097"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/tags?post=1097"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/ppma_author?post=1097"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}