For those seeking to use machine learning to power valuable artificial intelligence solutions, an article by Debo Olaosebikan has nuggets of pure gold. Those who merely want to use the words Machine Learning and Artificial Intelligence to inflate their sad little wrinkled balloon can stop here.

The article is titled “Artificial Intelligence Predictions for the Year 2019.” This post draws attention to one of those predictions, one that highlights a critical obstacle to achieving liftoff for many machine learning efforts: the lack of the large numbers of labeled cases needed for effective algorithm training.

Objectively assessing talent to improve hiring and performance development outcomes drives my interest in deep learning analytics. Subjective hiring and development decisions still rule the day. You know the ones. Quick resume sorts and ten-minute phone interviews for screening. Warm-smile, gut-check interviews for hiring. Picking your friends or squeaky wheels for development.

Subjective talent decision making is going away for two reasons:

  1. High decision failure rates
  2. Avoidable labor costs for recruiters/HR, and high opportunity costs borne by hiring managers distracted from managing the business.


The failure rate for subjective hiring decisions hovers between 30% and 50% over the first 3 to 18 months, depending on the knowledge and skill level required to do the job. That’s a very long way from the hyped Six Sigma failure rate targeted for production processes. Purely objective talent decision guidance used to mean multi-item ability tests, personality tests, and work samples.

It turns out that it is pretty easy (and very profitable) to gin up a few test items on the back of a napkin, slap the simply-calculated test scores into a colorful report, and call it an ‘assessment experience.’ ‘Test’ is so last century. Say mostly nice things about everybody, phrased to tap into the latest management fad, and you have a winner — for the test owner and distributor. For the companies that buy them and the candidates who suffer the mind-numbing multiple choice experience (not to mention the near random accuracy) — not so much.



Tom Janz, HRExaminer.com Editorial Advisory Board Contributor

Creating and validating predictive assessments is a lot more work, so it happens much less often, but it does happen. Even then, valid comprehensive assessments with cognitive ability and personality factor components involve upwards of 100 test items taking 60 or more minutes to complete. While testing used to take place in proctored testing centers, times have changed. Now tests are mostly taken online in the comfort of your own cheating manual. Oops. There is a movement towards high-fidelity simulations with the engaging character of a game. Done well, that vastly reduces the risk of talented candidates dropping out in favor of employers who demand less of them. And the risk of respondent dropout may be less than feared, according to one large field research study. Still, traditional validated tests take time and money to develop. They require candidates to spend time away from doing something fun for each job opening they target. Game-like assessments are more engaging, but cost a lot more to develop and still take about the same amount of candidate time.

So where does this all lead, and how is it related to deep learning and labeled cases? Now the patient reader is rewarded with answers. I have previously addressed how assessment is headed towards a zero-length candidate experience. Others have too. It is a growing trend. One approach draws on visual simplicity — Traitify. Algorithms based on deep learning analytics applied to publicly available social media text (DeepSense) have delivered substantial correlations with widely used assessment tests. The first exploratory look at correlations between machine-originated scores and independent external estimates was based on only 10 persons rated by 3-4 people that knew them — far too small a test sample to draw reliable conclusions.

The next step along the research path boosted the test sample to 56 cases. Now it was possible to conclude with some confidence that social media analytics scores from DeepSense correlated with external judgments on the DISC factors in the .50s and .60s. Continuing along that path, I have now assembled a data set of 286 academics and professionals from my network who have been analyzed by DeepSense. I am currently completing a blind professional evaluation of the data set members on the seven performance factors, the big five personality factors, and four DISC factors. In addition, a subset of the 286 are being invited to provide their own self-evaluation on the big five personality factors via a quick survey.
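For readers curious what “correlated in the .50s and .60s” means mechanically, here is a minimal sketch of a Pearson correlation between machine-generated factor scores and independent external ratings. The scores below are made up for illustration; this is not the DeepSense data.

```python
def pearson_r(xs, ys):
    """Pearson correlation between two equal-length score lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Hypothetical scores for one DISC factor across a small sample:
# one list from the machine, one from external raters.
machine_scores = [62, 48, 75, 55, 80, 41, 67, 59]
rater_scores = [58, 50, 70, 60, 85, 38, 65, 62]

print(round(pearson_r(machine_scores, rater_scores), 2))
```

A value near 1.0 means the machine and the raters order people almost identically; the .50s and .60s reported above indicate a substantial, but far from perfect, agreement.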

With a data set of 286 (still hardly “big data”), it is possible to norm the machine-generated personal factor scores to yield a “PersonaPROFILE.” Ultimately, the profile can be summed up by adding the 16 components described above into a total score. Reviewing the factor-total correlations revealed that some of the components correlated negatively with the total. For example, Need for Autonomy correlated -.48 with the total. People too high on need for autonomy want to run their own show, making them a bad bet for fitting in within a corporate structure. Action Orientation correlated -.70 with the total. People too high on action orientation tend to go off and do whatever they want, seeking forgiveness later instead of permission in advance — again, lots of feather ruffling happening around them. Talent that gets things done inside of a corporate structure makes the time investment to inform and engage others so they feel involved instead of left out.

Similarly, the DISC scale of Dominance correlated -.82 with the total. People who are too dominant may show great aggressiveness and tenacity, needed for turnarounds and perhaps startups, but make a lot of waves that rock all boats inside of corporations. The solution to these ‘reversals’ is simply to ‘reverse score’ those factors before adding them into the total. The calculation of total score is based on positive and negative weights that optimize for Talent-Centric Leadership — the essence of the overall score.
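The reverse-scoring fix can be sketched in a few lines. The reversed factor names come from the article; the 0-100 norming, the equal unit weights (the actual total uses weights optimized for Talent-Centric Leadership), and the example scores are assumptions for illustration.

```python
# Factors that correlated negatively with the total, per the article.
REVERSED = {"Need for Autonomy", "Action Orientation", "Dominance"}

def total_score(factor_scores):
    """Sum normed (0-100) factor scores, reverse-scoring the 'reversals'."""
    total = 0
    for factor, score in factor_scores.items():
        if factor in REVERSED:
            score = 100 - score  # reverse score: high raw -> low contribution
        total += score
    return total

# Hypothetical normed scores for a four-factor profile.
profile = {
    "Need for Autonomy": 85,   # becomes 15 after reversal
    "Action Orientation": 70,  # becomes 30
    "Dominance": 90,           # becomes 10
    "Conscientiousness": 75,   # counts as-is
}
print(total_score(profile))  # 15 + 30 + 10 + 75 = 130
```

The point of the reversal is simply that a very dominant, very autonomous profile should pull the Talent-Centric Leadership total down, not up.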

Ranking the 286 members of the data set produced an interesting leaderboard, to put it mildly. Some of the members I don’t know personally, but know of. Here is a sampling. Remember, this is all based on information these people posted publicly. The number is the person’s rank out of 286.

  1. Maren Hogan, CEO of Red Branch Media
  2. Gerry Crispin, Principal and Founder, Career Crossroads
  3. Dr. Marshall Goldsmith, #1 Executive Coach and Leadership Thinker (he actually is)
  4. Lou Adler, CEO, Performance Based Hiring Systems
  5. John Sumser, Principal Analyst at HRExaminer
  6. Pat Sharp, Founder for Tournovate Big
  10. Andrew Gadomski, Managing Director, Aspen Analytics
  16. Nir Eyal, Author, “Hooked: How to build habit forming products”
  25. Laszlo Bock, CEO and Co-founder, Humu
  31. Daymond John, CEO, Fubu and The Shark Group
  37. Kevin Wheeler, Chairman, The Future of Talent Institute and CEO, Global Learning Resources

While tempted to show data set members from the bottom 10 percent to prove a point, I would accumulate too many lawyer jokes. Suffice it to say that in my case, a little digging confirmed that most deserve to be there, some in spite of much wealth and fame. And there are some ‘misses.’ Psychometric measurement is far from perfect, and analytics performed on scraped social media relies on the target person having social media activity to analyze. Some of the people at the bottom of the Talent-Centric Leadership list would score much higher on a Turnaround Leader or Big Ticket Sales Leader listing. Interestingly, the number two member, when told who was number one, said to me, “Well that makes sense.” People who score lower on the list devote more of their “first reaction” comments to how little they use social media. Maybe true, but they sure put a lot of fire into finding some other reason for their low standing.


I ranked 66th out of 286 and am reasonably OK with that, considering all the people I out-ranked. Since no one wants to stare at tables of numbers or read research articles, an infographic was created to color-code normative feedback on the 15 persona factors. Here’s my PersonaGRAM. The dark green area labeled Leading Strengths contains factors where I scored better than 80% of the norm group. The next shade of green down contains Solid Plus factors, where my percentile standing fell between 70 and 80. Next, a mere “Plus” falls between 55 and 70. The grey Neutral Zone covers 55 to 45. The Slight Concern zone covers from 44 to 36. Below the 36th percentile lands you in the deep orange ‘Concern’ zone. For people in great shape, it highlights their true relative strengths and areas of possible concern. For people who lack any ‘Leading Strengths,’ awash in orange and grey, it’s about as much fun as looking in the mirror when you are overweight. The sting of reason, the splash of tears. As the alternative, how is flattery and self-deception working for ya? After all, you probably don’t use social media that much, and some of it you have other people write anyway.
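The color-coded zones amount to a simple percentile banding. This sketch uses the band boundaries quoted above; how ties at the boundaries (55, 70, 80) are assigned is my assumption, since the stated ranges overlap slightly.

```python
def persona_zone(percentile):
    """Map a factor's norm-group percentile to its PersonaGRAM zone."""
    if percentile > 80:
        return "Leading Strengths"   # dark green: better than 80% of norm group
    if percentile >= 70:
        return "Solid Plus"          # next shade of green: 70-80
    if percentile > 55:
        return "Plus"                # 55-70
    if percentile >= 45:
        return "Neutral"             # grey zone: 45-55
    if percentile >= 36:
        return "Slight Concern"      # 36-44
    return "Concern"                 # deep orange: below 36

print(persona_zone(85))  # Leading Strengths
print(persona_zone(50))  # Neutral
print(persona_zone(30))  # Concern
```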

Stay tuned for ongoing research featuring much larger data sets trained on many more labeled cases and then applied to a wide variety of uses.

