Cyclomatic Complexity and Unit Testing
I follow both Andrew Binstock and Andy Glover from Stelligent and have much respect for them both and would like to add to a blog that was posted.
I like their approach to using metrics to figure out the number of unit tests needed for a method. Actually, metrics should be used to determine the minimal number of unit tests. Cyclomatic complexity (CC) can be used for this, but so can other metrics like Define-Use. For those who don't know what CC is, it is defined as the minimal number of linearly independent paths within a unit of code (like a method).
The blog above provides a correlation between the number of CC paths and the number of unit tests. I think any unit test is better than no unit test, but I believe their correlation only applies if you don't know what the CC paths are.
For full disclosure, I am a co-founder of a product company (www.codign.com). The products are Eclipse plug-ins that design unit tests based on Cyclomatic Complexity AND Define-Use. We have a new release coming out very soon, which takes a very interesting spin on commodity metrics, but I'll blog about that later :)
Having worked for Tom McCabe for many years (he came up with Cyclomatic Complexity), I understand the benefits and pitfalls to this metric. The calculation is very simple: CC equals the number of decisions + 1. The proof is a bit more complicated :) , but that's the gist of it.
So, for example, a method with 2 decisions has 4 total paths and 3 CC paths. Let's take this example:

The questions I hear are, what are the three CC paths? Shouldn't you try to test them all? Can you test them all?
To figure out what the three CC paths are, you have to define your basis path (or path #1). This is very much a judgment call - it can be the longest path, the shortest path, the happy path, and so on. For us at Codign, our Path 1 (aka our Basis Path) is the path that executes all the true branches in a method. From there, or from any basis path, you simply work your way down by flipping decisions.
With the example above, there are four TOTAL paths:
Path 1: TT (both if statements execute 'TRUE')
Path 2: TF (first if statement is true, second is false)
Path 3: FT (first if statement is false, second is true)
Path 4: FF (first if statement is false, second is false)
There are 2 sets of CC paths:
Set 1
Path 1: TT
Path 2: FT
Path 3: TF
Set 2
Path 1: FF
Path 2: TF
Path 3: FT
Look at the code again and you'll notice that Path 1 from Set 1 (TT) cannot be tested (it is unrealizable). But why not test the other two paths? Given this information, is the effort required to test the other remaining basis paths that difficult?
As an aside, ever measure branch coverage? The above example can be tested with two tests (TF, FT) which would produce 100% branch coverage.
My point is this ... don't take CC at face value. I don't agree that, for example, if you have 20 CC paths you should only write 10 unit tests. I believe that, if a CC path is testable, you should test it.
But like I said earlier, not all CC paths are realizable, so is there a better way to identify the minimal number of unit tests? Ones that are realizable and measurable? There is, but I will blog about that later. :)
I like their approach to using metrics to figure out the number of unit tests needed for a method. Actually, metrics should be used to determine the minimal number of unit tests. Cyclomatic complexity (CC) can be used for this, but so can other metrics like Define-Use. For those who don't know what CC is, it is defined as the minimal number of linearly independent paths within a unit of code (like a method).
The blog above provides a correlation between the number of CC paths and the number of unit tests. I think any unit test is better than no unit test, but I believe their correlation only applies if you don't know what the CC paths are.
For full disclosure, I am a co-founder of a product company (www.codign.com). The products are Eclipse plug-ins that design unit tests based on Cyclomatic Complexity AND Define-Use. We have a new release coming out very soon, which takes a very interesting spin on commodity metrics, but I'll blog about that later :)
Having worked for Tom McCabe for many years (he came up with Cyclomatic Complexity), I understand the benefits and pitfalls to this metric. The calculation is very simple: CC equals the number of decisions + 1. The proof is a bit more complicated :) , but that's the gist of it.
So, for example, a method with 2 decisions has 4 total paths and 3 CC paths. Let's take this example:

The questions I hear are, what are the three CC paths? Shouldn't you try to test them all? Can you test them all?
To figure out what the three CC paths are, you have to define your basis path (or path #1). This is very much a judgment call - it can be the longest path, the shortest path, the happy path, and so on. For us at Codign, our Path 1 (aka our Basis Path) is the path that executes all the true branches in a method. From there, or from any basis path, you simply work your way down by flipping decisions.
With the example above, there are four TOTAL paths:
Path 1: TT (both if statements execute 'TRUE')
Path 2: TF (first if statement is true, second is false)
Path 3: FT (first if statement is false, second is true)
Path 4: FF (first if statement is false, second is false)
There are 2 sets of CC paths:
Set 1
Path 1: TT
Path 2: FT
Path 3: TF
Set 2
Path 1: FF
Path 2: TF
Path 3: FT
Look at the code again and you'll notice that Path 1 from Set 1 (TT) cannot be tested (it is unrealizable). But why not test the other two paths? Given this information, is the effort required to test the other remaining basis paths that difficult?
As an aside, ever measure branch coverage? The above example can be tested with two tests (TF, FT) which would produce 100% branch coverage.
My point is this ... don't take CC at face value. I don't agree that, for example, if you have 20 CC paths you should only write 10 unit tests. I believe that, if a CC path is testable, you should test it.
But like I said earlier, not all CC paths are realizable, so is there a better way to identify the minimal number of unit tests? Ones that are realizable and measurable? There is, but I will blog about that later. :)
Labels: cyclomatic complexity, java, junit, unit testing

0 Comments:
Post a Comment
<< Home