Wednesday, September 11, 2013

Down the Rabbit Hole of Clustered Defects

This article grew out of my response to another blog entry I read recently concerning the Pesticide Paradox.

The Pesticide Paradox simply states that static test cases become stale and unproductive over time. The article above goes into tactical test methods for changing up the test cases to account for that effect and continue to find new defects. My contention in the comment was that the Pesticide Paradox can have some deep and subtle implications that should be considered when creating these new test case variations. Here, I elaborate on some of those considerations.

The Relationship of the Pesticide Paradox to Training

One may start correlating defects found with test cases and find that only new test cases produce defects. One reason for that could be training. In quick, session-based exploratory test sessions, the tester comes to rely on recognizing the defect patterns they are already familiar with from the documented test cases. Likewise, developers begin to identify the practices that produced those defects. This can be thought of as a form of cognitive bias, where the creation of new test cases serves to train the tester in different bug patterns. This continual establishment of new and different patterns is one reason that session-based testing is so effective when combined with a risk-based test area identification framework.

Possible Relationships Between Multiple Clusters

Each defect cluster may identify a specific flaw in the code development. If the coders didn't identify the root cause of the flaws and change their practices, then these clusters may occur many times in different places. Is there a pattern to those clusters? This can be thought of as a form of model-based testing, where we are looking for the underlying cause of the clusters. Superficially zeroing in on clusters without this overall view of the cluster patterns, or without modeling the underlying cause of the clusters, has limited value in improving test quality.
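One way to look for a pattern across clusters is to group defects by suspected root cause rather than by where they were found. The sketch below is only an illustration: the defect records, component names, and root-cause labels are all made up, and a real tracker would need someone to assign those labels first.

```python
from collections import defaultdict

# Hypothetical defect records: (defect id, component, suspected root cause).
DEFECTS = [
    ("D-1", "order_form", "missing length check"),
    ("D-2", "customer_form", "missing length check"),
    ("D-3", "order_form", "missing length check"),
    ("D-4", "report_export", "unhandled null"),
    ("D-5", "billing_api", "missing length check"),
]


def recurring_root_causes(defects, min_components=2):
    """Return root causes seen in at least `min_components` distinct components."""
    components_by_cause = defaultdict(set)
    for _defect_id, component, cause in defects:
        components_by_cause[cause].add(component)
    return {
        cause: sorted(components)
        for cause, components in components_by_cause.items()
        if len(components) >= min_components
    }


if __name__ == "__main__":
    for cause, components in recurring_root_causes(DEFECTS).items():
        print(f"{cause}: {', '.join(components)}")
```

A root cause that shows up in several unrelated components suggests a development practice that needs to change, not just a set of local fixes.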

Layering of Defect Clusters

The clustering may be layered in complexity. New test cases may extend the previous tests down one layer without addressing the root cause of the defects. Example: In one iteration, a tester submits a set of defects where the GUI doesn't report string overflows to the user. That is fixed with GUI field checks in all of the identified places. In the next iteration, a middle-tier defect is found when a long string is entered in a field that was not addressed. Then, in the following iteration, database errors are found in the log that are due to other fields that were not addressed. Instead, the system architecture as a whole should have been evaluated initially to identify the extent of the problem.
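Evaluating the architecture as a whole could start with something as simple as an inventory of which layer enforces which limit on each string field. The following is a rough sketch under assumed data: the field names, layer names, and limits are hypothetical, and `None` is used to mean that a layer has no check at all.

```python
# Hypothetical inventory: the length limit each layer enforces per field.
# None means the layer performs no length check for that field.
FIELD_LIMITS = {
    "customer_name": {"gui": 50, "middle_tier": 50, "database": 50},
    "street": {"gui": 100, "middle_tier": None, "database": 80},
    "notes": {"gui": None, "middle_tier": None, "database": 255},
}


def audit_field(limits):
    """Return a list of problems for one field's per-layer limits."""
    problems = [
        f"no check in {layer}" for layer, limit in limits.items() if limit is None
    ]
    enforced = {limit for limit in limits.values() if limit is not None}
    if len(enforced) > 1:
        problems.append(f"layers disagree on limit: {sorted(enforced)}")
    return problems


def audit_all(field_limits):
    """Map each problematic field to its list of problems."""
    report = {}
    for field, limits in field_limits.items():
        problems = audit_field(limits)
        if problems:
            report[field] = problems
    return report


if __name__ == "__main__":
    for field, problems in audit_all(FIELD_LIMITS).items():
        print(f"{field}: {'; '.join(problems)}")
```

Running an audit like this once, across every field and layer, exposes the full extent of the overflow problem in a single pass instead of one layer per iteration.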

These are just a few of the potential complexities that defect clustering can create. Actively questioning the underlying reasons for the clusters is always necessary to ensure that the cause of the defects is addressed.

Sunday, August 25, 2013

On the Trail of the White-Tailed Defect

Recently, I read an article about evolutionary biologist Dirk Semmann of the University of Göttingen in Germany. His research suggests that the white tails on rabbits are a defense mechanism. Simply put, a predator will focus on the bright tail instead of the animal. As a result, when the rabbit turns sharply, the predator momentarily loses sight of the rabbit, which is then better able to get away.

Now apply that concept to defects in complex systems. You (the predator) are on the hunt for defects (the rabbits) in the system to prevent release bugs (☼). Some of the defects are slow, some are fast, some are camouflaged, and some are very, very sneaky. This article deals with one of the most elusive defects in the wild - the White-Tailed Defect. This defect gets you to focus on an obvious trait that always appears to identify it. It is such an obvious target that you decide it is a perfect candidate for automation. You can then spend your limited time finding those pesky camouflaged defects you know are still hiding in the system.

Everything runs smoothly until release day, when the developers decide to make a small improvement in the application that should have no real impact. Then the White-Tailed Defect strikes and deftly slips by your well-designed defenses.

Here is one example of a White-Tailed Defect. You are testing for field overruns on the string fields and find that the developers have implemented a standardized GUI check that displays a warning dialog to the operator when the field is overrun. It is built into the GUI framework and reliably provides the same dialog when the check is run. You automate the check for the dialog and all is well. Then a field is modified by a coder who is new to the team and who forgets to add the GUI overflow check. It just so happens that your automation never checks the actual value stored in the database or checks the database error logs. It isn't until after release that the system error shows up in the customer database (☼).
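To make the gap in that automation concrete, here is a small sketch of a check that looks at both the dialog and the stored value. Everything in it is an assumption made for illustration: `FakeApp` is a stand-in for the system under test and not a real framework, the field names are invented, and the truncating insert only simulates a length-limited database column.

```python
import sqlite3

MAX_LEN = 10  # assumed column limit for this illustration


class FakeApp:
    """Hypothetical stand-in for the system under test."""

    def __init__(self, fields_with_gui_check):
        self.db = sqlite3.connect(":memory:")
        self.db.execute("CREATE TABLE form (field TEXT PRIMARY KEY, value TEXT)")
        self.fields_with_gui_check = set(fields_with_gui_check)

    def submit(self, field, value):
        """Return True if the overflow dialog was shown; otherwise store the value."""
        if field in self.fields_with_gui_check and len(value) > MAX_LEN:
            return True  # dialog shown, nothing is stored
        # Simulate a length-limited column that silently truncates.
        self.db.execute(
            "INSERT OR REPLACE INTO form VALUES (?, ?)", (field, value[:MAX_LEN])
        )
        return False

    def stored(self, field):
        """Return the stored value for a field, or None if nothing was stored."""
        row = self.db.execute(
            "SELECT value FROM form WHERE field = ?", (field,)
        ).fetchone()
        return row[0] if row else None


def check_overflow(app, field):
    """Layered check: the dialog must appear and no damaged value may be stored."""
    long_value = "x" * (MAX_LEN + 5)
    dialog_shown = app.submit(field, long_value)
    stored = app.stored(field)
    problems = []
    if not dialog_shown:
        problems.append("no overflow dialog")
    if stored is not None and stored != long_value:
        problems.append("stored value differs from input")
    return problems


if __name__ == "__main__":
    app = FakeApp(fields_with_gui_check=["name"])
    for field in ("name", "nickname"):
        print(field, check_overflow(app, field) or "ok")
```

In this sketch the `nickname` field plays the part of the field that the new coder modified. The stored-value check still reports a problem there even if the dialog check were dropped or stopped applying to that field.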

So, how do you hunt a White-Tailed Defect? The key is to understand your prey and adjust your tactics to compensate. The white-tail defense is a cognitive trick that plays on the mind's need to see patterns in things. By suddenly breaking out of that pattern, the defect can escape detection. Here are two tactics for making that less likely:

  • Hunt In Packs - You can fool some of the testers some of the time, but a well-coordinated team can catch most of the White-Tailed Defects out there. Communication, coordination, and open dialog on changing up your test approaches in a session-based format leave few places to hide.
  • Layer Your Attacks - The White-Tailed Defect can slip by at the GUI, or at the database, or at the unit level, but the trick it pulls is not as effective when facing multiple strategies. Mix up tours, steel threads, sessions, quick tests, and automated checks to place multiple barriers in the way of the defect.

Good hunting!