Monday, November 26, 2007

Should testers write "unit tests"?

"Some software is very difficult to test manually. In these cases, we are often forced into writing test programs."

A few weeks ago, one of the devs on my team went to the Patterns and Practices Summit where he heard James Newkirk say that testers should write unit tests. For some reason writing unit tests is a very seductive idea to me (because I enjoy it?), but I have a gut feel that testers shouldn't write them. I worry that it shifts a responsibility from the devs to the testers that the devs should reserve. Shouldn't the author of the code unit test it?

At PNSQC, Michael Bolton said that unit testing is "confirmatory testing". It is testing that looks to confirm that things work as the author intended and that everything is fine. A unit test author doesn't want to see things go wrong. They want to see things go right, keep that bar green! A tester would want to see a red bar. This positive attitude is fine, as long as those involved are aware of the dynamic.

In the meeting I spoke up and said I wasn't sure about testers writing unit tests. The dev manager then described what I would call a functional test and said he thought testers writing this type of test would be fine.

Right away, we run into terminology issues that make talking about "unit testing" harder than it might otherwise be.

What is the definition of a "unit test"?

* A test that verifies the behavior of some small part of the system?

* A developer or programmer test. Contrast with a customer test.

"In computer programming, unit testing is a procedure used to validate that individual units of source code are working properly. A unit is the smallest testable part of an application."

Tests that the devs write such that the "...units being tested are a consequence of the design of the software, rather than being a direct translation of the requirements."

Now I'm confused. Since I'm not totally sure what a unit test is, how do I know that testers shouldn't write them? Hmm.

Well, I think of classic unit tests: the author writes some code, ends up with some methods (and, in OO software, some classes), and then writes standard tests for them. Good solid tests like invalid values and boundary tests, that kind of thing. This seems like stuff the dev should write.
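To make that concrete, here is a sketch in Python's unittest of the kind of tests I mean. The parse_port function and its valid range are invented for illustration; the point is the shape of the tests: confirm the intended behavior, hit the boundaries, feed in invalid values.

```python
import unittest

# A hypothetical function the dev has just written (made up for this sketch).
def parse_port(text):
    """Parse a TCP port number from a string; raise ValueError if invalid."""
    value = int(text)  # raises ValueError on non-numeric input
    if not 1 <= value <= 65535:
        raise ValueError("port out of range: %d" % value)
    return value

class ParsePortTests(unittest.TestCase):
    # Confirmatory test: prove the intended behavior ("keep the bar green").
    def test_valid_port(self):
        self.assertEqual(parse_port("8080"), 8080)

    # Boundary tests: the edges of the valid range.
    def test_boundaries(self):
        self.assertEqual(parse_port("1"), 1)
        self.assertEqual(parse_port("65535"), 65535)

    # Invalid values: just outside the range, and non-numeric input.
    def test_invalid_values(self):
        for bad in ("0", "65536", "-1", "http"):
            with self.assertRaises(ValueError):
                parse_port(bad)

if __name__ == "__main__":
    unittest.main(exit=False)  # exit=False so a test runner wrapping this file keeps going
```

Notice these tests mirror the design of the code (one function, its signature, its error contract) rather than any customer-visible requirement, which is why they feel like the author's job.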

Stepping away from this, what about when you start combining subsystems and exercising larger functionality that a customer might recognize as at least a small part of what they expect the software to provide? That starts to seem like fair game for testers to write. Are these unit tests? I don't think so, at least not by what I take to be the current understanding of the term. Maybe where SDETs are common this boundary is better understood and nobody worries about it. Elsewhere, I think it is a gap that often gets missed.
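For contrast with the classic unit test, here is a sketch of the kind of larger, customer-recognizable test I mean. Cart and PercentDiscount are made-up stand-ins for two real subsystems; the test exercises the behavior a customer would describe ("my cart total with a discount"), not one method in isolation.

```python
# Two hypothetical subsystems (invented for this sketch).
class PercentDiscount:
    def __init__(self, percent):
        self.percent = percent

    def apply(self, amount):
        # Apply the discount and round to cents.
        return round(amount * (100 - self.percent) / 100, 2)

class Cart:
    def __init__(self, discount=None):
        self.items = []
        self.discount = discount

    def add(self, name, price, qty=1):
        self.items.append((name, price, qty))

    def total(self):
        subtotal = sum(price * qty for _, price, qty in self.items)
        return self.discount.apply(subtotal) if self.discount else subtotal

# A functional test: it crosses the Cart/PercentDiscount boundary and checks
# an outcome a customer would recognize.
def test_discounted_cart_total():
    cart = Cart(discount=PercentDiscount(10))
    cart.add("widget", 4.00, qty=2)   # 8.00
    cart.add("gadget", 12.00)         # subtotal 20.00
    assert cart.total() == 18.00      # 10% off 20.00

test_discounted_cart_total()
```

Whether a tester or a dev writes this, it sits above the level of any single unit, which is why I'd call it a functional test rather than a unit test.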

Monday, November 19, 2007

Code reviews vs. Testing

On the Coding Horror blog today Jeff reminded us about these statistics from an earlier post of his regarding the gains from code reviews:

But don't take my word for it. McConnell provides plenty of evidence for the efficacy of code reviews in Code Complete:


.. software testing alone has limited effectiveness -- the average defect detection rate is only 25 percent for unit testing, 35 percent for function testing, and 45 percent for integration testing. In contrast, the average effectiveness of design and code inspections are 55 and 60 percent. Case studies of review results have been impressive:


* In a software-maintenance organization, 55 percent of one-line maintenance changes were in error before code reviews were introduced. After reviews were introduced, only 2 percent of the changes were in error. When all changes were considered, 95 percent were correct the first time after reviews were introduced. Before reviews were introduced, under 20 percent were correct the first time.


* In a group of 11 programs developed by the same group of people, the first 5 were developed without reviews. The remaining 6 were developed with reviews. After all the programs were released to production, the first 5 had an average of 4.5 errors per 100 lines of code. The 6 that had been inspected had an average of only 0.82 errors per 100. Reviews cut the errors by over 80 percent.


* The Aetna Insurance Company found 82 percent of the errors in a program by using inspections and was able to decrease its development resources by 20 percent.


* IBM's 500,000 line Orbit project used 11 levels of inspections. It was delivered early and had only about 1 percent of the errors that would normally be expected.


* A study of an organization at AT&T with more than 200 people reported a 14 percent increase in productivity and a 90 percent decrease in defects after the organization introduced reviews.


* Jet Propulsion Laboratories estimates that it saves about $25,000 per inspection by finding and fixing defects at an early stage.


Every time I see these statistics I get a little nervous because it seems like someone is trying to make an argument that tries to diminish the contribution testers make to a software project.

"Darn, we don't do code reviews, so I guess we have to have testers..."

But then I remind myself the issues are deeper and things are never as easy as they sound.

Jeff's previous post and the current one make it sound like you just snap your fingers, start doing code reviews, and your problems are gone.

Problem is, reviewing code well is a hard skill to acquire. Instead of the 80/20 rule it seems like the 20/80 rule: you spend all your time (however much you have) on code reviews finding the easy 20% of the bugs while the other 80% lie dormant.

Part of what bugs me about the statistics is when they were taken: different times, teams, languages, and projects than yours. That doesn't wipe out the data, and I still believe it, but it doesn't translate straight across either.

They say it takes 10,000 hours to become an expert at any particular thing. That works out to three hours a day for 10 years. How long have you spent doing detailed, thoughtful, deep code reviews? Here is the start of the problem.

I believe in code reviews and was very enthusiastic about them on a previous job. I worked really hard and got our team to start doing them. Then, ran into the problem that no one had developed the skill to get anywhere near the results of the statistics. And no one was interested in working very hard to improve their skills. Our code review experiment died a quiet death.

I do like being reminded of these statistics. Which ends up reminding me code review is a hard skill to learn and we all probably need more practice.

Toward that end, Scott Hanselman has a series on quality code to read. Worth checking out as a start.

Monday, November 12, 2007

Can I see you do that?

I wouldn't have an interview process that didn't include an audition. I've done about ten tester interviews with auditions now, and they tell you a lot about a candidate. You really see how they think.

For the audition, choose a small program to test. You can write it yourself or use something already out there. It doesn't have to be elaborate. You might add a pretend spec or readme.txt alongside the program.

The most interesting thing is how many different approaches people take to testing the program. If you ever needed a rationale for having a team of testers, this is it. Get several people looking at the same software and you'll get some very different approaches. Some dig into the spec and take it as the only truth: what it explicitly states is law, and what it doesn't mention is forbidden or unimportant. Others dive in and start typing in values for the sides of triangles. Some try special characters. Everyone does a lot of boundary testing. Some have a UI testing eye and make notes of proper tab order or improper resizing. None have run all the tests Myers notes in his book.
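The audition program really can be tiny. A toy take on Myers' classic triangle problem, sketched below (this is my own illustration, not the exact program I use), already gives a candidate plenty of boundaries, invalid inputs, and special cases to probe:

```python
def classify_triangle(a, b, c):
    """Return 'equilateral', 'isosceles', 'scalene', or 'not a triangle'."""
    sides = sorted((a, b, c))
    # Reject non-positive sides and violations of the triangle inequality.
    if sides[0] <= 0 or sides[0] + sides[1] <= sides[2]:
        return "not a triangle"
    if a == b == c:
        return "equilateral"
    if a == b or b == c or a == c:
        return "isosceles"
    return "scalene"
```

Watching which of these cases a candidate thinks to try (zero, negatives, degenerate triangles like 1-2-3, non-numeric input the function doesn't even guard against) tells you far more than their resume does.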

One of the best parts is at the end, when I ask how much of the program they have tested on a scale of 0 to 100. Candidates have taken twenty to forty-five minutes to test the program, and I've gotten answers ranging from "10" to "90" (not correlated with longer or shorter testing times).

There is a lot of talk about interviewing and how imprecise it is. I don't subscribe to the "Blink interviewing hypothesis" entirely, having seen what an audition can show. While an audition won't tell you how hard the candidate will work, it does give you an idea of where they are at. Are they too trusting of authority, be it a spec or a developer? Do they run down all the leads they find while testing or stop at the first bug found?

At the PNSQC conference this year, Jon Bach gave a great presentation about traps testers fall into. In his presentation Jon lists the "Top Ten Tendencies that Trap Testers":

#10 Stakeholder Trust
#9 Compartmental Thinking
#8 Definition Faith
#7 Inattentional Blindness
#6 Dismissed Confusion
#5 Performance Paralysis
#4 Function Fanaticism
#3 Yourself, Untested
#2 Bad Oracles
#1 Premature Celebration

These are really good traps to consider when evaluating candidates or yourself.

Sunday, November 4, 2007

Test plans?

I'm still not sure how I feel about test plans. As with all documents that help you think, the thinking part is great. I'm not sure how much value the plan as a document has once you are over the first thinking hurdle.

Of course, you should keep it a living document, but at what point does the maintenance cost exceed the remaining value of the plan? It's great for getting you going, but at some point you know exactly what you need to do and the plan isn't necessary any longer. I've seen problems where this cutoff point wasn't respected and time went into the plan that would have been better spent testing.

I've had a lot of freedom in the test plans I've written. I've followed the IEEE template and I've made up numerous ones of my own. They all work if they get you to ask the right questions.

What exactly are we testing?
When do we expect the testing to happen?
What do we expect the testing to show us?
What are we going to consciously ignore because of project constraints?

Some of my development managers didn't care whether I wrote a plan as long as the right testing got done. Others seemed to want the plan more than the testing. The biggest mistakes I've made came when I didn't realize which kind of manager I was dealing with. To some development managers, testing is a mystery and a plan removes some of that mystery. To others it is a false security blanket they'll later turn into a hammer and beat you with when you "missed" a bug you should have "obviously" caught. No mention is made of the developer who coded the bug, of course.

Maybe I know a little better what I think of test plans after this post. They are complicated and can get you into trouble if you aren't careful. If they ask the right questions they earn their keep, but don't let them eat into time better spent testing your software.