Performance appraisals are a common feature of talent management programs within organisations, used to identify strengths and development areas and as inputs into decisions about pay, promotions and terminations. However, there is a wealth of management literature that criticises the ability of performance appraisals to be unbiased, fair and accurate.
One of the biggest challenges in making performance appraisals an effective part of the talent management process is eliminating, or at least minimising, the multiple biases that can affect them. Some of the biases that can be present include (but are certainly not limited to):
- Confirmation bias: raters look for and emphasise data that supports an existing opinion or idea
- Fundamental attribution error: too much emphasis is placed on personality-based explanations for behaviour, while contextual or situational explanations are under-emphasised
- Recency bias: raters give more weight to recent events
- Recall bias: performance is justified after the fact, and can be distorted by memory
- Halo and horns effects: the review is dominated by an overall positive or negative view of the employee.
A number of past studies have concluded that variance in performance ratings is more associated with the psychology and biases of the rater rather than the actual performance of the ratee (McGregor 1985; Scullen, Mount and Goff, 2000; Ng, Koh, Ang, Kennedy and Chan, 2011 as cited in Stepanovich 2013).
The challenge is to make the intellectual “real”. An immersion experience may provide raters with insights and the motivation to address bias, but eliminating bias is no simple task. It requires more than training those who conduct performance appraisals; it requires changing social and political contexts (Poon 2002 as cited in Stepanovich 2013).
This pedagogical exercise was intended to provide students with firsthand experience of the impact bias and chance can have on performance appraisals.
The exercise was based on simulating a performance appraisal process for five Analysts via a forced ranking. In each group, one student played the role of the Vice President, while other students played the Analysts under review. The VP was presented with a range of information gathered over a three-year period about each Analyst: appraisals from a range of supervisors, educational details, a note from HR and evaluation scores. All of the data was, of course, fictitious, and each Analyst was given the same evaluation score (5/10), although the data leading to that score differed for each Analyst. For example, some “supervisors” commented on behaviour during a previous rating period (representing halo or horns bias), attributed system failures to the Analyst (attribution error), or showed inconsistency in their ratings (rater variability).

The VP interviewed each of the Analysts and then decided which Analyst should be promoted (it was a given that one had to be promoted), which should be terminated (again, a given that one had to be terminated), and who should be given a salary increase. The exercise was conducted in a class of 28 students, so four groups of VPs and Analysts participated. All groups worked from the same Analyst profiles.
The core finding of the exercise was that, although all Analysts had the same performance score, the presence of bias and chance meant the appraisal outcomes varied between the groups. The VP's decision did not reflect an Analyst's capability; it reflected the VP's own biases, or their adoption of the supervisors' biased reports. At heart, the decision about who to promote, terminate or reward with a salary increase was prone to bias, and this was amplified across the groups in relation to gender: the Analyst ‘Mary’ was the most likely to be fired, due to the presence of gender bias and attribution error in the supervisors' ratings of her performance.
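The logic behind this finding can be sketched in a small simulation (the names, bias magnitudes and seed below are hypothetical, not taken from the actual exercise): every Analyst has the same true score, but each group's raters add their own bias, so the forced ranking, and therefore who is promoted and who is terminated, diverges between groups.

```python
import random

random.seed(42)  # fixed seed so the sketch is reproducible

TRUE_SCORE = 5  # every Analyst's "real" performance (5/10)
ANALYSTS = ["Mary", "Alex", "Sam", "Priya", "Tom"]  # hypothetical names
N_GROUPS = 4

for group in range(1, N_GROUPS + 1):
    # Each group's VP inherits a different bias per Analyst, standing in
    # for halo/horns effects, attribution error and rater variability.
    observed = {
        name: TRUE_SCORE + random.uniform(-2, 2)  # bias + chance
        for name in ANALYSTS
    }
    ranked = sorted(observed, key=observed.get, reverse=True)
    print(f"Group {group}: promote {ranked[0]}, terminate {ranked[-1]}")
```

Despite identical underlying performance, each run of the loop typically promotes and terminates different Analysts, which mirrors the between-group variation observed in class.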
Performance appraisals are likely to be inherently subject to any number and combination of biases, some harder to identify and control for than others. If not accounted for, these biases can significantly skew performance ratings and lead to less-than-ideal decisions about talent. A talented employee who merely appears to be underperforming could be lost, or a lacklustre employee could be promoted over someone with more potential. This is not to suggest that performance appraisals should be removed entirely as a performance management tool, nor is that likely to happen in reality. However, organisations have an obligation to themselves and to their talented employees to acknowledge the potential for bias and to put in place frameworks, tools and strategies to minimise and account for it. In doing so, they can have confidence in the ability of performance appraisals to provide accurate and reliable insights for talent management decisions.
Actions that organisations can take to minimise the impact of biases in performance appraisal processes include:
- Combining self, peer and superior evaluations to obtain a broader spectrum of data points
- Avoiding scheduling evaluation interviews one after the other to minimise contrast bias
- Encouraging more frequent feedback and utilising technologies that support regular tracking and journaling of performance throughout the year to avoid recall bias and halo/horn effects
- Requesting examples to back up ratings, and in particular examples within the rating period
- Calibrating ratings across raters to avoid the impact of an individual rater being more lenient/strict
- Seeking facts rather than inferences to substantiate ratings
- Developing guidelines for those involved in making performance decisions that identify and explain the different types of biases that can occur, and how they can be avoided.
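As an illustration of the calibration point above, one common approach is to standardise each rater's scores (a z-score within each rater) so that a lenient rater and a strict rater become directly comparable; the sketch below uses made-up raters and scores, not data from any real appraisal system.

```python
from statistics import mean, stdev

# Hypothetical raw scores: rater_A is strict, rater_B is lenient,
# yet each ranks their own ratees the same way internally.
raw = {
    "rater_A": {"Jo": 3, "Lee": 4, "Kim": 5},
    "rater_B": {"Ana": 7, "Ben": 8, "Cal": 9},
}

def calibrate(scores_by_rater):
    """Z-score each rater's scores so leniency/strictness cancels out."""
    calibrated = {}
    for rater, scores in scores_by_rater.items():
        m = mean(scores.values())
        s = stdev(scores.values())
        for ratee, score in scores.items():
            calibrated[ratee] = (score - m) / s
    return calibrated

print(calibrate(raw))
```

After calibration, the strict rater's lowest-ranked ratee and the lenient rater's lowest-ranked ratee receive the same standardised score, so a single lenient or strict rater no longer distorts comparisons across teams.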