On Individual Performance Metrics

My previous entry was all about how individual performance metrics in a collaborative, agile environment are misguided. But to keep things focused on reality, where demands for such metrics are sometimes unavoidable, I ended on a challenge for myself to come up with some individual, quantitative performance metrics that did not conflict with notions of self-organization, trust, collaboration, teamwork, and all that stuff that makes agile development teams happy and productive.

While driving home from work I came up with a couple of ideas.

To begin, I’ll jot down some attributes of what I would consider an acceptable individual, quantitatively measurable performance metric.

  1. [ ]  it is not adversely affected by taking time to help others
  2. [ ]  it is not adversely affected by doing high priority work over high complexity work
  3. [ ]  it does not discourage doing high complexity work for fear of failure
  4. [ ]  it makes individual praise sensible and individual punishment nonsensical
  5. [ ]  the closer to an ideal value it is, the more successful the team at delivering valuable work
  6. [ ]  it does not loom over people’s heads and lower morale

Let’s say that if an idea ticks all six of the boxen, then it’s fully acceptable. If it doesn’t, then it’s to be avoided. With that in mind, let’s look at a handful of “candidate” metrics.

Individual velocity
I ranted about this at length in my previous entry. This metric fails points 1, 2, 3, and 6, and makes a weak case for passing 5. It incentivizes selfishness and compromises team collaboration.

Defect count per person
Whether used to evaluate developers based on how few defects are discovered in their work or to evaluate testers based on how many defects they discover in others’ work, this metric fails points 3 and 6. In the case of evaluating testers for, effectively, reporting how crappy the developers’ code is, it also fails points 1 and 5.

Number of lines of code

Ok on to the ideas that I think might at least be conversation starters…

Contributions to an “agility index”
This one came to me while reflecting on a conversation featuring Ken Schwaber and the notion of having an Agility Index that would indicate how “agile” a company is. From what I could tell, it amounted to a checklist of agility-enabling practices that in turn yielded a score between 0 and 100. While I can’t seem to find this particular checklist on Scrum.org — presumably they’d want to sell its usage — I think a comparable checklist can be formulated by anyone with sufficient experience. The more items checked off, the higher the score on the “agility index.”
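To make the mechanics concrete, here is a minimal sketch of how such an index could be computed. The checklist items below are hypothetical examples of agility-enabling practices I made up for illustration, not Scrum.org’s actual list; the score is simply the fraction of practices in place, scaled to 0–100:

```python
# Hypothetical agility checklist; a real one would be longer and team-specific.
CHECKLIST = [
    "iterations are time-boxed",
    "a prioritized backlog exists",
    "retrospectives are held every iteration",
    "builds are automated and continuous",
    "working software is demoed to stakeholders",
]

def agility_index(practices_in_place):
    """Score from 0 to 100: percentage of checklist items the team has adopted."""
    adopted = sum(1 for item in CHECKLIST if item in practices_in_place)
    return round(100 * adopted / len(CHECKLIST))

print(agility_index({"iterations are time-boxed",
                     "retrospectives are held every iteration"}))  # → 40
```

Checking off more items raises the score, and nothing about the computation singles anyone out.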

Making progress towards a higher index benefits the entire team and should not compromise any other good practices. In other words, all six requirements mentioned above would be met. What’s left is figuring out how to make it a quantifiable individual measurement. Well, when it comes time for a performance evaluation, simply ask each developer to mark the “agility index” checklist items that he or she contributed to, with a short blurb specifying the nature of the contribution. The more items contributed to, the better. And the nature of the items should ensure that nothing beneficial got compromised.

Kudos received through team retrospectives
This one is a bit wacky, but I think it’s worth a go. Some variations on team retrospectives include “giving kudos” to fellow team members. These could be for help offered, or ideas presented, or the quality of some deliverable, or really anything at all that was appreciated by others on the team. By being all about collaboration and contribution to team success, this approach ticks all six of the above boxes. And if a process or project manager keeps track of the number of “kudos” each team member receives, that can later be turned into a performance indicator.
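If a manager did want to track this over time, the bookkeeping could be as simple as a running tally. A sketch, reusing the names from earlier in this entry with made-up kudos:

```python
from collections import Counter

# Kudos given in each retrospective: (giver, receiver, reason).
retro_1 = [("Mary", "Bob", "helped debug the flaky test suite"),
           ("Bob", "Vick", "great design suggestion"),
           ("Sam", "Bob", "patient code-review feedback")]
retro_2 = [("Vick", "Mary", "caught a nasty regression"),
           ("Mary", "Bob", "paired on the deployment scripts")]

def kudos_tally(*retros):
    """Count the kudos each team member received across retrospectives."""
    tally = Counter()
    for retro in retros:
        for _giver, receiver, _reason in retro:
            tally[receiver] += 1
    return tally

print(kudos_tally(retro_1, retro_2))  # Bob: 3, Vick: 1, Mary: 1
```

The numbers only ever go up for behavior teammates appreciated, so there is praise to hand out but nothing sensible to punish.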

So, that’s what I came up with on my drive home. There may be other things of this nature that focus much more on success than failure, and team success at that, all the while allowing for an individual perspective. And I think these are the kinds of metrics we should focus on if we absolutely have to. If it were up to me, I’d just focus on enabling team success.

Don’t Measure Individuals

I recently started looking at some project management software called AtTask, evaluating whether it is appropriate for agile development. While it seemed to be quite capable as a “waterfall” PM tool, I wasn’t thrilled by its take on “Agile” (yes, the AtTask people use that word as a proper noun, which annoys me, but I’ll write on that another time). I brought up AtTask’s inadequacies to the client, briefly mentioning my recommendation to go with Greenhopper… oh wait, Atlassian renamed it to JIRA Agile… (dang, they’re also using it as a proper noun).

Anyway, what project management tool ends up being used remains to be seen. But what the client impressed on me is that the software used should allow for measuring individual velocity. (By velocity I am referring to “points of complexity” delivered per iteration.)
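For clarity about the term: a team’s velocity is usually computed per iteration and often smoothed over the last few. A minimal sketch, with made-up numbers:

```python
def velocity(points_per_iteration, window=3):
    """Average story points the team completed over the last `window` iterations."""
    recent = points_per_iteration[-window:]
    return sum(recent) / len(recent)

# Points of complexity the whole team delivered in each past iteration.
history = [21, 25, 18, 24, 23]
print(velocity(history))  # → (18 + 24 + 23) / 3 ≈ 21.7
```

Note that the input is the team’s output per iteration. The whole dispute below is about whether those points can meaningfully be attributed to individuals.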

Individual velocity?

Folks, if someone says that they want to quantitatively measure individual performance in an environment that’s supporting teams, just say no. That idea is bad in so many ways that I will actually enumerate some of them.

1. It is a conceptual non-starter
What does individual velocity mean when you are talking about a software development team? In the environment in question, each deliverable will be handled by at least three people — a primary developer, a developer who provides peer review, and a dedicated tester. In many cases there will be even more team members involved. So whose “velocity” is at stake when a deliverable isn’t done at the end of an iteration? If a developer hands off a bunch of stuff to testers, does that developer have a high velocity? If those testers find a million defects, is that a low velocity for the developer and a high one for the testers? If a technical lead who is needed for peer review is at a conference for a few days and some deliverables don’t make progress, does the primary developer’s velocity take a hit? Nothing about this concept makes sense to me.

Software is a team effort. Isolating each team member’s individual effort in delivering points of complexity is a suspect task. Just as an example, something like sending an email can be easy enough to implement but a pain in the arse to write automated tests for. Does Bob the tester get more credit than Mary the developer? Perhaps Bob got a whole bunch of assistance from Vick the technical lead. Should Vick get in on some of that sweet velocity loot? There is a team velocity because the team as a unit delivers software.

2. It is impossible to actually do
Besides the troubles listed above, velocity, like all such metrics, can be gamed. And I want to be very clear when I say that it will be gamed, at the expense of the team and the product.

Suppose that not all work has “points of complexity” (sometimes only business deliverables get this attribute, while other technical and non-technical tasks do not; the latter are then seen as helping get the former “done”). Presumably, people who work on tasks that don’t have points won’t be looked at negatively when they don’t “take points across the wall.” So, let’s say Sam is a below-average developer who has trouble writing as much production-ready code as some of his teammates. Sam could just take on tasks that don’t have any points, so that as he finishes them at his comparatively slow pace, he doesn’t have to worry about being compared to his colleagues. Sam makes himself difficult to measure.

Alternately, Sam may avoid doing any work that doesn’t have story points (e.g. setting up continuous integration) so that he can maximize how many points he delivers, leaving the “non-measured” work to the rest of the team.

Or, even worse, Sam just decides to forego any notion of maintainability and cuts corners like he’s got a Super Star in Mario Kart. His points delivered for the iteration go up. But the technical debt incurred from his shenanigans raises the complexity of future work. Team velocity suffers down the road.

But come on, Sam wouldn’t do that. Sam is a good team member who knows what it means to write quality code. He’s just inexperienced and needs a bit more guidance than his peers. But when he asks for help, Jack and Satou tell him “ain’t nobody got time for that” because they’ve got their own points to worry about!

Satou would never say something like that, though. He’s from Rhode Island. Also, he would help Sam out, and in turn his own work would stall while Sam’s moved along. This might be good for the team (and the business), but it is at Satou’s expense. That is, unless Satou also “took some credit,” which would only be fair, right?

At this point, we’ve really lost all sense of a metric for individual performance. Who did “more” work, Satou or Sam? Who did more important work? What about Jack — if part of Jack’s value to the team is the knowledge he is able to share, then does his refusal to help Sam reflect on his performance and how does it weigh against the “points” he was able to deliver?

“Velocity” on an individual level is a metric that’s vulnerable to so many deliberate and non-deliberate breakages that it’s effectively void. And hopefully it’s clear that the actual numbers you’d come up with when trying to measure “individual velocity” in a real situation would be very hard to distinguish from some you’d get by rolling dice. There is nothing to ensure that they reflect anyone’s skill level or actual productivity. And there are still other reasons not to go down this dark path.

3. It shifts focus to all the wrong things
When a business asks for software, what is ultimately promised by the team tasked to deliver the software? That the right functionality and quality will be implemented for a reasonable cost? Or that each team member will perform adequately in accordance with some performance metric? I’m guessing the first one.

When a team is focused solely on writing valuable, high-quality software, they will assist each other as needed and avoid compromising their goal for no good reason.

But I submit that reward for visible high contribution and/or punishment for visible low contribution can be quite compelling reasons indeed. When one is incentivized to look better than one’s peers (or to at least not look worse), then a conflict of interest arises where the actual quality of the software competes with the perceived quality of one’s individual contribution.

4. It lowers morale
Individual performance metrics are stressors as well as a potential source of tension and discord among team members. Rather than emphasizing success and movement in a positive direction; rather than encouraging collaboration and teamwork; rather than fostering a feeling of joint ownership, they introduce the fear of punishment for failure; they discourage altruism, knowledge sharing, and generally working together; they incentivize people to mask their inexperience. They can single-handedly turn an otherwise positive experience into a negative one. Developers can become less happy. And when morale is low, so is productivity.

5. It is a net value loss
I suppose I should address the elephant in the room at this point, so here we go…

Why would anyone want individual performance metrics? Is it to give everyone cookies and donuts and Clif bars relative to how awesome they are? Probably not; it’s much more likely an attempt to target the underperformers. It’s a gathering of “objective evidence” that people that you already perceive to totally suck in fact do.

I have yet to see any other reason put forth that makes sense. If you want to reward people for good performance, nobody is going to challenge you for “proof.” If you want to manage resources in such a way that teams are balanced in skill and capability, you can do better than relying on fuzzy math.

So this endeavor adds rather little value and carries a rather high cost. As mentioned, people will game the system, focusing on perception at the expense of ultimate quality. There’s the problem of lower morale and in turn lower productivity. These result in higher costs for the business to get what it needs. And the supposed value-add? The “addition by subtraction” of removing an underperforming team member? It’s far from a guarantee, not least because the system is vulnerable to gaming from all angles.

So what happens is that the person you think totally sucks merely continues to totally suck, except now you’ve introduced a whole bunch more problems to worry about in the form of damaged team dynamics.

6. It goes against the principle of empowered, self-organizing teams
If a team is entrusted with delivering software then why should that team be burdened with a handicap like “individual velocity” just for the sake of gathering evidence against “bad” developers? Let the team figure out how to deliver the best software it can, let people collaborate as they see fit, and if the team decides a member is having a negative effect then trust the team to make that decision. (Naturally, asking for proof in the form of some numbers can take you down a very ugly path of infighting, subterfuge, and sabotage as people try to game a flawed system in conflicting ways. So don’t do it.) Find someone empowered to manage team personnel and remove the problem member if the team deems it necessary.

To conclude, individual performance metrics look to be a terribly unproductive endeavor at best and a highly damaging one at worst. Development teams, especially ones that have a good level of transparency built into their approach, already have no secrets about who’s good and who sucks. Efforts can be made to let team members help and improve each other and remove negative members if necessary, or efforts can be made to undermine what a productive dev team should be all about. Don’t fall into a trap of going for the latter.

As an afterword, suppose you absolutely have to obtain some “quantitative” measurement for individual performance reviews due to some stinky contract that was signed eons ago when software was written by fish. The challenge is to come up with metrics that do not compromise the principles of agility, trust, and self-organization that are worth so much to a dev shop — metrics that don’t introduce a conflict of interest. This is actually a bit of a puzzle and I will think on it some. I’ll post my thoughts in my next entry.