Wednesday, September 11, 2013

Down the Rabbit Hole of Clustered Defects

This article was spawned by my response to another blog entry I read recently concerning the Pesticide Paradox:

http://www.softwaretestingclub.com/profiles/blogs/defect-clustering-pesticide-paradox .

The Pesticide Paradox simply states that static test cases become stale and unproductive over time. The article above goes into tactical test methods for changing up the test cases to account for that effect and continue to find new defects. My contention in the comment was that the Pesticide Paradox can have some deep and subtle implications that should be considered when creating these new test case variations. Here, I elaborate on some of those considerations.

The Relationship of the Pesticide Paradox to Training

One may start correlating defects found with test cases and find that only new test cases produce defects. One reason for that could be training. In quick, session-based exploratory test sessions, the tester comes to rely on recognizing defect patterns that are already familiar from the documented test cases. Likewise, developers begin to identify the practices that produced those defects. This can be thought of as a form of cognitive bias, where the creation of new test cases serves to train the tester in different bug patterns. This continual establishment of new and different patterns is one reason that session-based testing is so effective when combined with a risk-based test area identification framework.

Possible Relationships Between Multiple Clusters

Each defect cluster may identify a specific flaw in the code development. If the coders didn't identify the root cause of the flaws and change their practices, then these clusters may occur many times in different places. Is there a pattern to those clusters? This can be thought of as a form of model-based testing, where we are looking for the underlying cause of the clusters. Superficially zeroing in on clusters without this overall view of the cluster patterns, or without modeling the underlying cause of the clusters, would have limited value in improving test quality.
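One way to look for a pattern across clusters is to group the defects by suspected root cause and see which causes show up in more than one place. The following is only a minimal sketch: the defect records, module names, and root cause labels are all made up for illustration, and in practice the root cause would come from the team's own triage.

```python
from collections import defaultdict

# Hypothetical defect records: (defect id, module where found, suspected root cause).
defects = [
    ("D-1", "order_form", "missing length check"),
    ("D-2", "order_form", "missing length check"),
    ("D-3", "customer_api", "missing length check"),
    ("D-4", "report_export", "timezone handling"),
    ("D-5", "billing_db", "missing length check"),
]


def clusters_by_root_cause(records):
    """Map each suspected root cause to the set of modules where it appeared."""
    modules = defaultdict(set)
    for _defect_id, module, root_cause in records:
        modules[root_cause].add(module)
    return dict(modules)


def recurring_causes(records, min_modules=2):
    """Return the root causes seen in at least `min_modules` different modules."""
    grouped = clusters_by_root_cause(records)
    return sorted(cause for cause, mods in grouped.items() if len(mods) >= min_modules)


print(recurring_causes(defects))
```

A cause that recurs across several modules is a hint that the cluster points at a development practice rather than at one piece of code.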

Layering of Defect Clusters

The clustering may be layered in complexity. New test cases may extend the previous tests down one layer without addressing the root cause of the defects. Example: In one iteration, a tester submits a set of defects where the GUI doesn't report string overflows to the user. That is fixed with GUI field checks in all of the identified places. In the next iteration, a middle-tier defect is found when a long string is entered in a field that was not addressed. Then, in the iteration after that, database errors are found in the log that are due to other fields that were not addressed. Instead, the system architecture as a whole should have been evaluated initially to identify the extent of the problem.

These are just a few of the potential complexities that defect clustering can create. Actively questioning the underlying reasons for the clusters is always necessary to ensure that the cause of the defects is addressed.

Sunday, August 25, 2013

On the Trail of the White-Tailed Defect

Recently, I read an article about evolutionary biologist Dirk Semmann of the University of Göttingen in Germany. His research suggests that the white tails on rabbits are a defense mechanism. Simply put, a predator will focus on the bright tail instead of the animal. As a result, when the rabbit turns sharply, the predator momentarily loses sight of the rabbit and it is better able to get away.

Now apply that concept to defects in complex systems. You (the predator) are on the hunt for defects (the rabbits) in the system to prevent release bugs (☼). Some of the defects are slow, some are fast, some are camouflaged, and some are very, very sneaky. This article deals with one of the most elusive defects in the wild - the White-Tailed Defect. This defect gets you to focus on an obvious trait that always appears to identify it. It is such an obvious target that you decide it is a perfect candidate for automation. You can then spend your limited time finding those pesky camouflaged defects you know are still hiding in the system.

Everything runs smoothly until release day, when the developers decide to make a small improvement in the application that should have no real impact. Then the White-Tail Defect strikes and deftly slips by your well-designed defenses.

Here is one example of a White-Tail Defect. You are testing for field overruns on the string fields and find that the developers have implemented a standardized GUI check that displays a dialog warning to the operator when the field is overrun. It is built into the GUI framework and reliably provides the same dialog when the check is run. You automate the check for the dialog and all is well. Then, a coder who is new to the team modifies a field and forgets to add the GUI overflow check. It just so happens that your automation never checks the actual value stored in the database or checks the database error logs. It isn't until after release that the system error shows up in the customer database (☼).
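To make the gap concrete, here is a rough sketch of an automated check that looks below the GUI as well as at the dialog. Everything in it is an assumption made for illustration: `FakeApp` is a hypothetical stand-in for the application, the `customer` table and the 10-character limit are invented, and the silent truncation is just one way an unchecked overflow might show up.

```python
import sqlite3

MAX_LEN = 10  # assumed column limit for this sketch


class FakeApp:
    """Hypothetical stand-in for the application under test."""

    def __init__(self, gui_check_enabled):
        self.gui_check_enabled = gui_check_enabled
        self.db = sqlite3.connect(":memory:")
        self.db.execute("CREATE TABLE customer (name TEXT)")

    def submit_name(self, value):
        """Submit a value; return True if the overflow warning dialog was shown."""
        if self.gui_check_enabled and len(value) > MAX_LEN:
            return True  # dialog shown, nothing is stored
        # Without the GUI check, the value is silently truncated on the way in.
        self.db.execute("INSERT INTO customer VALUES (?)", (value[:MAX_LEN],))
        return False

    def stored_names(self):
        return [row[0] for row in self.db.execute("SELECT name FROM customer")]


def layered_overflow_check(app, value):
    """Check the GUI layer and the database layer; return a list of findings."""
    findings = []
    dialog_shown = app.submit_name(value)
    if len(value) > MAX_LEN and not dialog_shown:
        findings.append("GUI: no overflow warning shown")
    if not dialog_shown:
        # The value was accepted, so what was stored must match what was entered.
        stored = app.stored_names()
        if not stored or stored[-1] != value:
            findings.append(f"DB: stored {stored[-1:]!r}, entered {value!r}")
    return findings


print(layered_overflow_check(FakeApp(gui_check_enabled=True), "x" * 25))
print(layered_overflow_check(FakeApp(gui_check_enabled=False), "x" * 25))
```

The point of the second check is that it does not depend on the white tail at all: whatever the GUI did or did not show, the stored value still has to match what was entered.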

So, how do you hunt a White-Tail Defect? The key is to understand your prey and adjust your tactics to compensate. The white-tail defense is a cognitive trick that plays on the mind's need to see patterns in things. By suddenly breaking out of that pattern, the defect can escape detection. Here are two tactics for making that less likely:

  • Hunt In Packs - You can fool some of the testers some of the time, but a well-coordinated team can catch most of the White-Tail Defects out there. Communication, coordination, and open dialog on changing up your test approaches in a session-based format leave few places to hide.
  • Layer Your Attacks - The White-Tail Defect can slip by at the GUI, or at the database, or at the unit level, but that trick it pulls is not as effective when facing multiple strategies. Mix up tours, steel threads, sessions, quick tests, and automated checks to place multiple barriers in the way of the defect.
Good hunting!

Saturday, October 8, 2011

Spinning Down

Ok, so we just got our latest release out the door - it's time to relax, right? Working as the single tester on a small product test team has some advantages, but at the end of the day the buck stops here. Stress leading up to a major release is manageable, but what do you do after the release is out the door?

That adrenaline keeps pumping for a while and it takes time to spin down. Here are some things I do to help manage the days after a release:

• Don't arrange a family commitment the weekend after a scheduled release. No matter how well you prepare, there may be a schedule delay. That is one pressure you can do without.

• If your spouse asks how the release is going, the answer is "Fine" - no matter what. You aren't lying, just reassuring. Being asked about stress at work is also one pressure you can do without.

• Plan on spinning down gradually. It is out the door, but you could still have last-minute install problems crop up.

• Keep the schedule sane in the week leading up to and following a release. Not getting enough sleep means making mistakes.

• Don't make formal arrangements for a "relaxing" activity. Step back into your normal routine. Plan nothing and let it happen.

• Don't try to force yourself back into a "normal" sleep pattern. Starting a week or so before a release, my body tends to wake up at 2:00 AM no matter what. That continues for a while after a release and I just don't worry about it too much.

• Mental and physical go hand in hand during stressful periods. With me, it's stress, coffee, and spicy food - pick any two out of three. Around a week or so before and after a release, it's usually time to go cold turkey on the caffeine.

Oh, last one ... write a blog. Sharing with others online is a good way to cut the stress. Hey, it really does help!

Tuesday, August 23, 2011

Using Automated Scripts for Test Workflow Automation

Test automation covers a wide range, from large, traditional test management suites to simple text editors. This discussion focuses on using scripted automation tools to support and improve the overall test workflow. This workflow support can be provided using scripted automation at various interfaces: the database, development environment module interfaces, or the GUI, to name a few.

Traditionally, scripted automation has been used to run checks to verify established application functionality (e.g. for regression checks). An alternative usage for automated scripts is to assist in executing the overall test workflow. Two methods of accomplishing this are presented here:
  • Performing smoke tests
  • Creating complex configurations to support test sessions

Smoke Tests
Smoke tests are a special type of test that does not belong in the category of regression testing. Regression tests are intended to thoroughly verify functions at a broad spectrum of interface points. The smoke test is intended to perform a specific function: to provide a minimum gateway for allowing development builds into the QA environment.

The smoke test performs a quick check of the overall application "happy paths" to identify major functional failures. Here a "major" failure is defined as one that prevents the testing of a significant portion of the application. Unlike regression checks, the smoke test is not intended to thoroughly verify any particular function. In fact, if properly designed, it should not be susceptible to minor failures at all. Instead, it should interact with a minimum of interface objects to limit the likelihood of a minor failure.

In addition, the time box limitation of a smoke test puts an emphasis on getting "the biggest bang for the buck". The smoke test should be continually tweaked to include as many major functions as possible and still complete within the designed time limit (typically 30 minutes to 1 hour), especially for GUI test automation.
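The time box idea can be sketched as a small runner that works through the happy-path checks in priority order and stops at the first major failure or when the time budget is used up. This is only an illustration under assumptions: the check names and the `run_smoke_test` function are hypothetical, and each check is assumed to be a callable that returns True when the major function it exercises is usable.

```python
import time


def run_smoke_test(checks, time_limit_s, clock=time.monotonic):
    """Run (name, check) pairs in order until one fails or the time box expires.

    The build is rejected only on a failed check. Checks that did not fit in
    the time box are listed in "not_run" so the suite can be trimmed.
    """
    start = clock()
    passed = []
    for index, (name, check) in enumerate(checks):
        remaining = [n for n, _ in checks[index:]]
        if clock() - start >= time_limit_s:
            return {"accepted": True, "failed": None, "passed": passed, "not_run": remaining}
        if not check():
            return {"accepted": False, "failed": name, "passed": passed, "not_run": remaining[1:]}
        passed.append(name)
    return {"accepted": True, "failed": None, "passed": passed, "not_run": []}


# Hypothetical happy-path checks, highest priority first.
example_checks = [
    ("login", lambda: True),
    ("create_order", lambda: True),
    ("print_report", lambda: True),
]

print(run_smoke_test(example_checks, time_limit_s=30 * 60))
```

A non-empty "not_run" list is the signal to tweak the suite: either the lower-priority checks come out, or the higher-priority ones need to get faster.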

Creating Complex Configurations to Support Test Sessions
When considered as a workflow framework, the scripted automation takes on the role of performing a set of tasks as opposed to verifying the functionality of the application. The size and complexity of the automated scripts can be critical, because the likelihood of a critical stoppage grows quickly with the size of the script. For example, a critical stoppage early in the processing of a large script would impact the entire flow. Dividing the overall test workflow into ten separate scripts may limit the impact of a critical stoppage in one of the scripts to 10% of the overall testing.
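A back-of-the-envelope model shows why script size matters. It assumes, purely for illustration, that every step in a script has the same small, independent chance of a critical stoppage; real scripts will not be that tidy, and the 1% figure below is made up.

```python
def stoppage_probability(steps, p_step):
    """Chance that a script of `steps` steps hits at least one critical stoppage."""
    return 1 - (1 - p_step) ** steps


def max_blocked_fraction(script_count):
    """Worst-case share of the workflow lost when one of several equal scripts stops."""
    return 1 / script_count


# Assumed 1% chance of a critical stoppage per step.
for steps in (10, 100, 1000):
    print(steps, round(stoppage_probability(steps, 0.01), 3))

print(max_blocked_fraction(10))
```

Under this model the chance of a clean run shrinks with every added step, while splitting the workflow into ten equal scripts caps the damage from any single stoppage at one tenth of the work.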

The size and placement of the scripts in the overall test process should be balanced between usability, run time, maintenance, and the frequency of script stoppages. In general, the scripted automation should be targeted for tests that have very complicated setup procedures or involve a large amount of redundant setup steps that would be a large burden on the tester if performed manually.

These are just two examples of using scripted automation to support test workflow. If implemented properly, the injection of small, well-designed scripts into the test process can provide a significant improvement in the overall test quality.

Sunday, August 21, 2011

Scripted Automation as a Magic Eight Ball

First, a little background. In American billiards, the game of "Eight Ball" is a game where the main goal is to sink the number "8" ball after you have sunk all of your others (either striped or solid). However, if you "scratch" (sink the cue ball) while trying to sink the eight ball, you lose the game. As a result, the game outcome is always in doubt.

A brilliant person once thought of the idea of creating a "Magic Eight Ball". This is a toy that looks like an eight ball, but has a flat side with a window in it. Inside is a fluid with a multi-sided object that has a phrase written on each side. You ask the Magic Eight Ball a question, shake it, then turn it over and see the displayed answer, which is always a vague answer like "It is possible" or "Who knows?".

So, how does this relate to scripted automation? A major problem with the larger test tool suites is that they are set up to relate script outcomes directly to requirements on a pass/fail basis. This leads to automated reports that provide coverage in terms of requirements passed, etc. This can be not only misleading, but dangerously so. It inevitably leads to a false sense of security or panic that degrades the credibility of the test team.

Scripted test tools do not test or verify functional requirements. Instead, they check specific parameters at specific interface points. When I report scripted results, I only report a "negative" result and always state what was expected and what was measured. That allows me to quickly validate the test failure before entering a bug.
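As a rough sketch of that reporting style, the following keeps every check as an expected/measured pair at a named interface point and reports only the mismatches. The `CheckResult` class, the interface point names, and the values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CheckResult:
    interface_point: str  # where the parameter was read (hypothetical names below)
    expected: object
    measured: object

    @property
    def negative(self):
        """True when the measured value does not match the expected value."""
        return self.expected != self.measured


def negative_report(results):
    """List only the negative results, each with what was expected and measured."""
    return [
        f"{r.interface_point}: expected {r.expected!r}, measured {r.measured!r}"
        for r in results
        if r.negative
    ]


results = [
    CheckResult("orders.total", 100, 100),
    CheckResult("login.title", "Welcome", "Welcom"),
]

for line in negative_report(results):
    print(line)
```

Each line of the report is something to validate by hand before a bug is entered, not a verdict on a requirement.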

If you have to associate tests directly to functional requirements, you can't say it "passed" and you definitely can't say it "failed" without a verification. Instead, it would be better to phrase it in less absolute terms. For passing tests, you could report that "The outcome is unclear" and for failing tests you could report "The signs are troubling". That way, no one jumps to conclusions and you end up performing exploratory test sessions to provide more specific outcomes. Add a little randomization and ... voila! The Magic Eight Ball test automation suite, which combines the simplicity of direct association to requirements with just enough uncertainty to ensure thorough testing.

I wonder if I could add a tarot card interface for projected test completion dates? Hmmm ...

What this Blog is About

I am a tester at heart and have switched careers in the past to whatever field was willing to pay me to do what I love doing.

As a software tester, I am learning the software testing profession by scouring the web for techniques and applying those to my job on a daily basis. One of my firm beliefs is that to learn, you have to teach. Web forums provide an excellent method for doing that.

Currently, I am a hardware tester learning the intricacies of the National Instruments TestStand test management framework interfacing with LabWindows/CVI.

This blog is intended to provide me a forum for displaying ideas without overrunning or dominating the posts on those other community web sites.

Enjoy!