Category Archives: Test Methods & Metrics

Computer Scientist D. Richard Kuhn provides some insights on how to become a software tester and shares his interest in combinatorial testing.

D. Richard Kuhn - Computer Scientist, National Institute of Standards & Technology

LogiGear: How did you get into software testing? What did you find interesting about it?

Mr. Kuhn: About 10 years ago Dolores Wallace and I were investigating the causes of software failures in medical devices, using 15 years of data from the FDA. About that time, Raghu Kacker, in NIST’s math division, introduced me to some work that his colleague at Bell Labs, Sid Dalal, had done on pairwise and interaction testing for software. The idea behind these methods is that some failures only occur as a result of interaction between some components. For example, a system may correctly produce error messages when file space is exhausted or user input exceeds some limit, but crashes only when these two conditions are both true at the same time. Pairwise testing has been used for a long time to catch this sort of problem.
I wondered if we could determine what proportion of the medical device failures were caused by the interaction of two or more parameters, and just how complex the interactions would be – in other words, how many failures were triggered by just one parameter value, and how many only happened when 2, 3, 4, etc. conditions were simultaneously true. We were surprised to find that no one had looked at this question empirically before. It turned out that all of the failures in the FDA reports appeared to involve four or fewer parameters. This was a very limited sample in one application domain, so I started looking at others and found a similar distribution. So far we have not seen a failure involving more than 6 parameters. This does not mean there aren’t any, but the evidence so far suggests that a small number of parameter values are involved in failures. The reason this is significant is that if we can test all 4-way to 6-way interactions, we are likely to catch almost all errors, although there are many caveats to that statement and we don’t want to oversell this.
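To make the interaction idea concrete, here is a minimal Python sketch (illustrative only, not NIST code) that checks whether a small test suite covers every 2-way combination of parameter values, echoing the file-space/user-input example above:

    from itertools import combinations, product

    # Hypothetical parameters, two values each.
    parameters = {
        "file_space": ["ok", "exhausted"],
        "user_input": ["normal", "over_limit"],
        "logging": ["on", "off"],
    }

    # Four tests suffice for pairwise coverage here, versus 2*2*2 = 8 exhaustive tests.
    tests = [
        {"file_space": "ok",        "user_input": "normal",     "logging": "on"},
        {"file_space": "ok",        "user_input": "over_limit", "logging": "off"},
        {"file_space": "exhausted", "user_input": "normal",     "logging": "off"},
        {"file_space": "exhausted", "user_input": "over_limit", "logging": "on"},
    ]

    def pairs_in(test):
        """All 2-way (parameter, value) combinations exercised by one test."""
        return set(combinations(sorted(test.items()), 2))

    covered = set().union(*(pairs_in(t) for t in tests))
    required = {((p1, v1), (p2, v2))
                for p1, p2 in combinations(sorted(parameters), 2)
                for v1, v2 in product(parameters[p1], parameters[p2])}
    print("uncovered 2-way combinations:", required - covered)  # empty set here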

LogiGear: You are working in a very special field of software testing, quite different from common practice areas such as unit testing, automation, and GUI testing. So what kind of work are you doing? How did you pick these specific topics? Can you help explain this type of testing to our readers?

Mr. Kuhn: NIST’s mission is to develop better measurement and test methods, so Raghu and I proposed an internal project to build on the empirical findings discussed earlier. The first problem that had to be solved was to find ways of efficiently generating tests that cover all t-way combinations, for t=2, 3, 4, etc. This is a hard mathematical problem that has been studied for a century or more. Jeff (Yu) Lei had developed a very efficient algorithm to cover 2-way combinations as part of his dissertation at NC State, so we talked to him about extending the algorithm for t-way combinations. He did this successfully and we worked with him to incorporate the algorithm into a practical tool for testers, which is now being used at more than 300 companies.
The tool, ACTS, makes it easy for testers to enter parameter names and the values for each; it then generates tests that cover 2-way through 6-way combinations. For realistic applications, this can produce thousands of tests in some cases, so we have also worked on ways to automate the oracle problem for testing – how to determine the expected results for a particular set of inputs. This allows testing to be (almost) fully automated, so running hundreds or even thousands of tests can be done at a reasonable cost.
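To give a feel for what such a tool does, here is a toy greedy generator for the 2-way case in Python. It is a sketch only: ACTS itself uses far more efficient algorithms and also handles constraints, mixed strengths, and much larger inputs.

    from itertools import combinations, product

    def pairwise_suite(parameters):
        """Greedily pick tests until every 2-way value combination is covered."""
        names = sorted(parameters)

        def pairs_of(test):
            return {((names[i], test[i]), (names[j], test[j]))
                    for i, j in combinations(range(len(names)), 2)}

        candidates = list(product(*(parameters[n] for n in names)))
        uncovered = set().union(*(pairs_of(t) for t in candidates))
        suite = []
        while uncovered:
            # Pick the candidate test that covers the most uncovered pairs.
            best = max(candidates, key=lambda t: len(pairs_of(t) & uncovered))
            suite.append(dict(zip(names, best)))
            uncovered -= pairs_of(best)
        return suite

    params = {"os": ["win", "mac", "linux"], "browser": ["chrome", "firefox"],
              "db": ["mysql", "postgres"], "net": ["wifi", "wired"]}
    for t in pairwise_suite(params):
        print(t)  # typically 6-7 tests instead of the 3*2*2*2 = 24 exhaustive ones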

LogiGear: How can a college student prepare to go into software testing and become really good at it? What should he/she look for in teachers, courses, and methods?

Mr. Kuhn: The biggest challenge in software assurance is how to do it economically. Assurance methods used for life-critical software, such as in aviation, are too expensive for most applications, yet every year we become more dependent on software working correctly. One of the best ways to reduce cost, as in other fields, is to bring more science and technology to bear on the problem, so courses that focus more on the science than on managing testing will be important. Testing is only one part of assurance. An important area of research is how to integrate formal methods for specification and proof with testing, including combinatorial interaction testing.

LogiGear: What sort of graduate programs? Also, in your opinion, what are some of the more interesting research questions people are asking now and what do you think they’ll be researching in, say, 5 years?

Mr. Kuhn: Graduate programs with an established program in software engineering will be good choices for work in software assurance. In addition to how to integrate formal methods with combinatorial testing, important questions include: how to order or prioritize tests to find faults more quickly, how to identify the particular combination(s) that caused a failure, how to extend these methods to very large problems, and above all, determining the effectiveness of these methods in the real world. It would be nice if these problems were solved quickly, but I’m sure they will still be important in 5 years. Combinatorial testing has grown very rapidly in the past 10 years, from around 4 or 5 papers a year in the 1990s through 2002, to more than 50 per year recently, so it seems to be attracting a good deal of attention from researchers.

LogiGear: Lastly, who do you consider to be some of the leaders in this field and what are they doing?

Mr. Kuhn: In addition to Jeff Lei (U. Texas, Arlington) and Jim Lawrence (George Mason U.), who have been part of our team since the beginning, Charles Colbourn, at ASU, is the expert on t-way covering arrays and algorithms. We’re working with Renee Bryce (Utah State) and Sreedevi Sampath (U. of Maryland Baltimore County), who are looking at test prioritization. Myra Cohen (U. Nebraska-Lincoln) has done a good deal of work on both the theory and application of these methods. There are many other leaders, including Sid Dalal, George Sherwood (formerly of AT&T, Bellcore, and SAIC), and Alan Hartman (IBM).

A lot of people are now working on getting these methods into practice. Among those we are working with are Jim Higdon at Eglin Air Force Base and Justin Hunter of Hexawise. Turning research ideas into something that works in the real world can be more challenging than the research in some ways.

LogiGear: Thank you, Mr. Kuhn

The REAL Costs/Benefits of Test Automation

Are you frustrated with vendors of test automation tools that do not tell you the whole story about what it takes to automate testing? Are you tired of trying to implement test automation without breaking the bank and without overloading yourself with work? I experienced firsthand why people find test automation difficult, and I developed useful ways to cut testing costs. We must focus on simple tools that produce results. Testing is like systems development: if you want quality results, start with quality requirements. You should not start with test automation; you should start with an organized approach to QA testing that will facilitate test automation. This paper explains how you can succeed when you address the REAL Costs/Benefits of Test Automation.

Testing, or test automation, is not rocket science. Some people make it more complicated than necessary. All you need to succeed is a good testing process that embodies a vision of what works and what does not work. As you select applications ripe for test automation you will find some that are better tested with manual approaches. Your overall approach must be consistent. You must avoid the cost of duplication. Most people are not even sure what they mean by test automation. What is test automation? Is test automation simply capture/replay processing? Is test automation programming in some script-like language?

Capture/replay works to fill time slots on a glut of TV channels, since all you change are the commercials to be aired. For software testing, think of the corollary of an old saying: “The more you want things to stay the same, the more things have to be changed”. If you want to replay tests, isn’t that because you want to review the effects of change in the application? You will soon discover that updating playback scripts is a maintenance nightmare, since each case is a unique recording.

Contrary to what some experts claim, test automation should not be program development. Linda Hayes called that “writing programs to test programs” and she rightly classified that idea as absurd. Some test automation product vendors want us to believe in a programming paradigm. Many experts lament that testers are too busy with manual testing to write test automation scripts. Some testers may not be qualified to write such test scripts. Is anyone ready to propose we ask development to double their workload? I didn’t think so.

To develop any software application, you have to start with fundamentals. Script writing is labor-intensive. It is difficult to maintain scripts. Generally, our initial testing is best done manually, so that we can stabilize application interfaces before we attempt test automation. If we use different OS/Browser combinations, we face the challenge of automating so many interfaces that it will seem a lot less work to just test combinations manually. Of course, the fact that so many tools are only concerned with GUI testing should be a concern to us. Don’t we test batch systems? Don’t we need to test individual layers in an N-tier server structure?

People may demonstrate “relative payback alternatives” of manual testing vs. automated test execution in terms of how many test cycles it takes for automation to pay off. What does that mean? Why not focus on the challenge of employing automated tools for any kind of testing? Why not engineer solutions that eliminate the supposed inherent duplication of first establishing manual test scripts and then repeating the effort to produce automated test scripts? Well, IBM has published study results that claim 75% of all testing is still done manually. That means no solution is complete unless it addresses this larger part of the testing needs. I will explain how I created a solution that dramatically reduces the effort of test script creation, and especially of test script maintenance, and that provides scripts for manual testing as well as for test automation.

Think about those analyses of break-even points of manual vs. automated testing that cite a number of automated test sessions after which automated testing becomes cheaper. They seldom account for the need to maintain both manual scripts (as confirmed by IBM) and automation scripts. Many break-even analyses do not tell you how to account for the costs of manual script production plus the incremental costs of added automation. What I will explain is how we can bring the cost curve down for manual testing by streamlining the scripting, and how some manual testing can be replaced by test automation (but not all of it, as some break-even analyses imply). The biggest obstacle in the way of progress is setting unrealistic expectations of what people may gain from a specific decision. Overselling the benefits of test automation does nobody any good.

I focus on creating test scripts and eliminating unnecessary duplication. Test automation should be viewed like any other business process automation. I want a solution that meets the needs of my business. If that solution must simplify manual script writing, then that is the focus of automation. I use a practical solution for creating and maintaining test scripts by separating the fixed and variable aspects of scripts. A small set of scripts may be combined with a table of alternate data values so you do not have to replicate scripts for individual combinations. You may add a second table to list test database access keys for individual test cases. The reason is that, over time, a test database tends to change. The indirect approach lets you access the right test database records, while you avoid test script changes due to test database changes.

You can use my solution manually, but then testers must consult 3 sources of data in order to execute a test: scripts + data values + profile definitions. I created an automated solution that merges these 3 sources dynamically just prior to a test execution, so that testers see a “virtual” instance that contains the current data values. It is possible to cut script creation costs by 50% using this strategy, and maintenance by at least 75%. Database changes just prior to the start of testing are no problem: you can update the profiles and generate a new set of scripts that same afternoon, rather than searching through a stack of test scripts to find all the instances that would have to be changed.
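To illustrate the separation, here is a Python sketch with made-up names (not the author’s actual Excel/VBA implementation) showing how the three sources merge into a concrete script instance:

    # The fixed part: a script template with placeholders.
    script = [
        ("open_account_screen", "{account}"),
        ("enter_amount",        "{amount}"),
        ("verify_message",      "{expected}"),
    ]

    # The variable part: one row of data values per test case,
    # referring to logical profiles rather than raw database keys.
    data_rows = [
        {"account": "RETAIL_CUSTOMER", "amount": "100.00", "expected": "OK"},
        {"account": "FROZEN_ACCOUNT",  "amount": "50.00",  "expected": "Account frozen"},
    ]

    # Profile definitions: map each profile to the current database key, so a
    # database refresh means updating this table, not hundreds of scripts.
    profiles = {"RETAIL_CUSTOMER": "ACCT-000123", "FROZEN_ACCOUNT": "ACCT-009877"}

    def expand(script, row, profiles):
        """Merge script + data + profiles into one 'virtual' script instance."""
        resolved = {k: profiles.get(v, v) for k, v in row.items()}
        return [(action, arg.format(**resolved)) for action, arg in script]

    for row in data_rows:
        for step in expand(script, row, profiles):
            print(step)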

I implemented the process used to support this “manual test automation” solution as a VBA macro within Excel™. I added data generation capabilities to that logic: to identify data by attributes, to identify pairs/triplets/quads for optimized combinations, to set mapping tables for equivalent values, and to define Governing Business Rules that determine which condition combinations are testable. I believe in fundamental testing: relating test cases and scripts to business requirements and functional specifications, and incorporating risk-based priority setting. These capabilities support complex script requirements with almost no extra effort on the part of the QA analyst.

By contrast, compute-bound data generation efforts are generally avoided in most manual script preparation initiatives. Shortcuts can seriously compromise the credibility of what is actually tested. With my tools I can account for what I test and why, which is an important automation benefit. I can provide reassurance that all that testing is based on fundamental requirements for the application, not based on what developers thought they had to build, which can be a major source of functionality bugs. Because manual scripting tends to be done in a hurry, QA analysts may look for the easiest source of requirements, so that tests are influenced by what the developers think they should be implementing.

Functional testing is not the only game in town. We need unit testing and integration testing, which may not be done properly due to the effort involved in producing the needed test cases. With a test framework (such as one modeled after JUnit), I can use my tools to produce test data using the same test cases that the application must be able to handle in black-box testing. I output data into “*.CSV” files that can be fed into a test framework to present alternative tests and to validate the results as well, so developers can stop bugs from infecting the code before it even reaches QA.
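As a sketch of that hand-off (the file name, columns, and unit under test are invented for illustration, and Python’s unittest stands in for a JUnit-style framework), generated data can be written to CSV and replayed by a developer-side harness:

    import csv
    import unittest

    rows = [
        {"account": "ACCT-000123", "amount": "100.00", "expected": "OK"},
        {"account": "ACCT-009877", "amount": "50.00",  "expected": "Account frozen"},
    ]
    with open("deposit_tests.csv", "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["account", "amount", "expected"])
        writer.writeheader()
        writer.writerows(rows)

    def deposit(account, amount):  # stand-in for the real unit under test
        return "Account frozen" if account == "ACCT-009877" else "OK"

    class DepositTests(unittest.TestCase):
        def test_all_rows(self):
            # Replay every generated row against the unit under test.
            with open("deposit_tests.csv", newline="") as f:
                for row in csv.DictReader(f):
                    with self.subTest(**row):
                        self.assertEqual(deposit(row["account"], row["amount"]),
                                         row["expected"])

    if __name__ == "__main__":
        unittest.main()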

I demonstrated test execution automation with Certify™ (by Worksoft) because the internal working tables from my tools are directly compatible with record-sets in Certify™. A large part of the test automation engineering was done with cut-and-paste operations. I wrote a simple, reusable driver script for Certify™ that mimics my table-driven script generation process. That driver script invokes simple task-oriented action scripts, such as specific screen dialogs, and Certify™ inserts the relevant data for each instance that must be tested. The Certify™ keyword-driven architecture allows me to produce automated script segments in hours; my data-driven architecture produces full execution automation within days of debugging the manual scripts. I use one common input, my source specifications, that I can update in order to regenerate scripts and record-sets in minutes. This all but eliminates script maintenance concerns and removes the logistics of managing manual test scripts in parallel with automated test scripts. I know that not everybody uses Certify™. Some people have already bought into more elaborate products (perhaps based on writing test dialogs the hard way in VBScript). Initial feedback was that my approach was fine for a select user community, but not for a majority of users of test automation software. Some skepticism is healthy; too much is paralyzing! I like to think of solutions that are easily understood by most testing analysts. I want quick payback from test automation.
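The driver and record-set idea described above can be sketched in a few lines of Python (the keywords and action scripts are hypothetical; this is not Worksoft’s API):

    # Simple task-oriented action scripts.
    def open_screen(name):    print(f"open screen {name}")
    def enter(field, value):  print(f"enter {value!r} into {field}")
    def verify(field, value): print(f"verify {field} == {value!r}")

    ACTIONS = {"open_screen": open_screen, "enter": enter, "verify": verify}

    # One record-set row per step: a keyword followed by its arguments.
    record_set = [
        ("open_screen", "Registration"),
        ("enter",  "first_name", "Pat"),
        ("enter",  "last_name",  "Jones"),
        ("verify", "status",     "Registered"),
    ]

    def driver(rows):
        """Dispatch each row to the matching action script with its data."""
        for keyword, *args in rows:
            ACTIONS[keyword](*args)

    driver(record_set)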

There is no benefit from products that end up as “shelfware” because they consume more resources than they return in benefits. Certify™ is good, but really no single product is sufficient. Look at complex IT solutions, with different products integrated into a complex production environment, and ask whether it is realistic to want a one-tool-solves-all solution. Look beyond individual tools and consider multiple tools. You will need more software and additional learning, which adds complexity when integrating those tools to meet specific testing challenges.

To support other test automation products I created a second tool. It uses the same data, with a generic keyword structure, to generate executable scripts for any procedural testing tool. This requires custom OPDEFs (operation definitions) that use keywords to select script code segments. My tool can insert appropriate data values to make each script instance functionally operational. It takes a little more effort than with Certify™: first you write the same simple task-oriented action scripts described above, and then you write reusable OPDEFs that convert keyword-based actions into script code that can be executed by the targeted test automation product. This way you can continue to use your existing test automation software, feeding it directly from the same inputs we use to generate manual scripts. When changes come in, you simply pull out the input workbook, make the changes, and regenerate your test collateral right down to your favorite test automation scripts, usually within the same day. If you need to work with multiple test automation tools, you can provide multiple OPDEF libraries and compile the appropriate scripts with the right set of definitions for each tool.
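A rough Python sketch of the OPDEF idea, using the same keyword rows as in the previous sketch (the target-tool script syntax below is invented for illustration):

    # OPDEFs: one code template per keyword, for a particular target tool.
    OPDEFS = {
        "open_screen": 'Window("{0}").Activate',
        "enter":       'Window.Field("{0}").Set "{1}"',
        "verify":      'AssertEqual Window.Field("{0}").Value, "{1}"',
    }

    def compile_rows(rows, opdefs):
        """Expand keyword rows into lines of executable target-tool script."""
        return [opdefs[keyword].format(*args) for keyword, *args in rows]

    rows = [("open_screen", "Registration"),
            ("enter",  "first_name", "Pat"),
            ("verify", "status",     "Registered")]
    print("\n".join(compile_rows(rows, OPDEFS)))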

In summary, my solution requires only one level of maintenance effort, which we minimize by using a 3-pronged approach of scripts + data + profiles. Any effort to produce the initial input data that ensures your testing is sufficiently thorough adds cost to script preparation; even so, with my approach the cost of initial scripting will often be reduced, and maintaining that script base is clearly much more efficient than doing it manually. Consider:

  • Scripts change because physical interfaces change. We can design scripts so that we use the fewest lines of unique code per GUI screen segment (or transaction file, etc.). Such changes are simple to implement with a tool like Certify™ that is able to “re-learn” a screen. We can eliminate duplication of low-level dialogs to dramatically lower testing costs. The risk of script changes is so common that projects wait as long as possible before they start scripting, but you cannot escape that risk in regression testing. Unless you avoid duplication of low-level scripts, maintenance efforts will be significant. With my reusable OPDEF architecture you cannot minimize scripting any further unless you use a keyword-driven architecture like the one in Certify™.
  • Data change when new functions support new (or additional) conditions or options. If data are entrenched in scripts it is difficult to find what you must change or add in the scripts to reflect new conditions. By keeping the data separate, it is easy to focus on data instances that are affected by application changes. We can update those data and regenerate scripts relatively quickly (typically in an hour or two after we receive the changes). This includes all the testing collateral necessary for the execution of test cases that reflect the new version of the application, which is a significant advantage.
  • Profiles change because databases are not static. We make test scripts adaptable by using “profile” references in the data and by providing separate “definition” data to map each profile to actual database keys. There are many aspects to this concept. As data become stale, they are edited or replaced, and you reflect that in the profile definitions. Since you can describe data attributes for a profile you keep a clear focus of what you are testing, while account references become confused when the underlying data are subject to change. The extra step of using a profile is a critical step that safeguards your investment in the test collateral by formalizing the design documentation.

These concepts are lacking in most test automation tools, which explains why so many test projects fail under the weight of duplicated effort. You need manual testing and test automation, and perhaps different test automation tools to test different aspects in different environments, plus transaction files and updates to database files. Unlike other approaches, I recognized the key issue as the creation and maintenance of test scripts, transactions, and test data.

Maintenance of test collateral is onerous. It can consume over 75% of the total cost of small upgrade projects. You think not? Ah, what probably happens is that your testers use only 10% or so of the scripts for regression testing, and they only update that 10%. As a result, the script base becomes corrupt after a few upgrade cycles. I recommend that you keep 100% of all scripts updated, even if you use only 10% of them in a given regression testing session. Test automation lets me regression test with 100% of all scripts each time. A full regression provides a better level of quality assurance than what you can achieve with a purely manual approach, and at a fraction of the cost.

Keep your efforts to code automation scripts to a dull roar. Avoid “writing programs to test programs,” as Linda Hayes cautioned. This is especially true if you need multiple tools to deal with different types of applications under test, because it would take an army of programmers with different skills to keep up with the scripting needs. Keep in mind the limitations of test automation when you deal with web-based applications that must be validated for compatibility with a myriad of customer configurations: you still need manual scripts. My solution uses Excel workbooks, IntelliData.xls and IntelliScript.xls, which automate the procedures described above. I have documentation and data-driven / keyword-driven training materials that I use in my business. Like many “simple” solutions, this did not evolve overnight: it is the result of many trial-and-error versions of code to solve specific problems. What I see in many testing tools is a constant effort to update the tools due to technological changes. What I see in most testing organizations is a desire to keep the application-specific test scripts as static as possible. Based on that premise, my tools bridge the gap between testing tools and business needs, and they satisfy the primary goals and objectives of test automation: to cut costs in manual testing as well as in test execution automation. I will be happy to provide you with additional information about my tools and my testing methodology.

Article from QA Vision.

Frits Bos (Frits@pm4hire.com) is a veteran of over 30 years in IT, and a witness to the tendency to reinvent the wheel. He provides contract services in Project Management, Business Analysis, QA, and BCP. He also develops training seminars, creates development and testing tools, and is an active contributor at the www.stickyminds.com site for QA testing professionals.

Professor Janzen provides some insight to LogiGear Magazine on how to become a software tester

David S. Janzen – Associate Professor, Computer Science Department, California Polytechnic State University, San Luis Obispo – homepage

LogiGear: How did you get into software testing and what do you find interesting about it?

Professor Janzen: The thing I enjoy most about computing is creating something that helps people. Since my first real job building telecommunications fraud detection systems, requirements and design have been the most fun for me. Then I heard about this thing called test-driven development, and something just clicked. Using unit test specifications to do design made sense to me — plus, you get this great side effect of all these automated tests that make refactoring and maintenance easier. I guess I got pulled into software testing by way of software design.

LogiGear: What kind of work are you doing and how did you pick those specific testing topics?

Professor Janzen: My PhD research focused on how test-driven development (TDD) affects the internal, or design quality of software. I did a bunch of experiments with students and software professionals that provided some evidence about the benefits of TDD. The experiments are pretty straightforward:

Get two groups of essentially equivalent programmers, ask them to complete the same tasks, but have one group use the approach you are studying while the other uses the “traditional” approach. Then collect metrics and surveys to see how the two approaches varied.

Having become convinced that TDD is a great way to design and build software, my recent efforts have moved toward incorporating TDD into computing education.

I think TDD is taking the same path that objects took in the early ’90s. Folks in the industry started adopting objects so the broader academic community took notice. However, they mostly considered object-oriented programming to be an advanced concept, so it started to appear in graduate and upper-level courses. As educators became more comfortable with the approach, objects eventually made it down to first-year courses. I think the same is happening with TDD, so I am building tools and doing experiments to demonstrate that TDD can be taught to beginning programmers with great success. I call the approach Test-Driven Learning (TDL).

My current project is a web-based development environment for beginning programmers that will incorporate TDL-inspired labs. It will soon be available at http://web-ide.org.

LogiGear: How can a college student prepare to go into software testing and become really good at it and what should he or she look for in teachers, courses, and methods?

Professor Janzen: Software testing is such a great field to study. My students who are good at software testing often get the best job offers. Most undergraduate computer science and software engineering programs don’t have separate courses in software testing. The best route is to do well in your core programming courses, and then take as many software engineering courses as you can. Many software engineering courses involve team projects. Volunteer to be a software tester or quality assurance manager. Or, take the role of system integrator or build manager.

These roles will give you exposure to many of the automation tools that software testers use, and will help you start thinking about how to break software and not just how to build it.

LogiGear: What sort of graduate programs should college graduates consider? Also, in your opinion, what are some of the more interesting research questions people are asking now and what do you think they’ll be researching in 5 years?

Professor Janzen: There are two routes you might consider in graduate school. If you want to be involved in software development in companies and lead software teams, look at masters in software engineering programs. Many of these programs cater to software professionals who are already working in industry, by offering courses in the evenings or on weekends. If you are interested in more cutting-edge research, such as building new software testing tools, or developing new software testing methods, consider going to a traditional Computer Science PhD program. There are lots of smart researchers working on really interesting testing topics.

In five years? Well, a lot of work seems to be focused on automatically generating automated tests, and also on tools for working with models. Look for some of the top software testing conferences and try to attend. Read software testing trade journals — there are plenty online, and many are still published in magazine format. Try to identify and follow interesting, cutting-edge active researchers – and maybe, if you’re lucky, you’ll get a chance to meet these pioneers in person, which can be an exciting and thought-provoking experience.

LogiGear: Lastly, who do you consider to be some of the leaders in this field and what are they doing in the field of software testing and development?

Professor Janzen: I am interested in some of the work being done by Tao Xie and Laurie Williams at North Carolina State University. Tao is doing a lot of work with automated software testing and data mining. Laurie has been working on static analysis tools, reliability, security, and mutation testing.

LogiGear: Thank you, Professor Janzen.

What is Test Automation Return On Investment?

What is the Automation ROI ticker?

The LogiGear Automation Return on Investment (ROI) ticker, the set of colored numbers that you see above the page, shows how much money we presumably save our customers over time by employing test automation as compared to doing those same tests manually, both at the design and execution level.

We’ve segmented this page into three sections: first, to clarify our assumptions; second, to provide our audience with some definitions; and finally, to offer an example of our approach. We hope this helps clarify how we view test automation ROI.

Definitions:

What is a test case and how does LogiGear define this? A test case is a self-contained experiment — a sequence of operational and verifying actions, with a specific set of data, towards the application under test — with a defined outcome.

At LogiGear we reflect test cases as a series of actions or lines of actions. This is quite different from what most testers are used to.

In our ticker, a New Test Case is a test case that is created once during the day by our test engineering teams. We capture this number in every 24-hour cycle.

In our ticker, a Modified Test Case is a test case that has already been created, using the above criteria, but has been modified to accommodate new changes in the application. This might be a new verification point or a new navigation method. We might change that test case several times during the day, and again we update this number in every 24-hour cycle.

How do we reflect our presumed savings?

The presumed Money Saved is calculated by comparing the total number of test cases that are developed and run automatically to the same number of tests that would be developed and run manually. As noted in our assumptions section, we make some assumptions in our calculation. One is the time it takes to develop and run our keyword-based automation vs. the time it takes to develop and run a manual test case. We also factor in low-cost offshore services, which lower not only the resource costs but also the implementation costs.

Here is our simple formula reflecting costs:
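In outline (a sketch of the comparison described above, applied to each day’s new and modified test cases):

    Presumed Savings = [(design time + execution time) done manually x US rate]
                       - [(design time + execution time) automated x offshore rate]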

As noted above, in our services organization we generally realize a five-fold improvement in test execution and at least a two-fold improvement in designing and updating new and modified tests through our keyword method.


Assumptions and Clarifications:

The ticker reflects test cases that are developed and modified daily by our offshore service teams in Vietnam. LogiGear has been helping companies with software testing and test automation for many years. Over those years, we’ve compiled the number of tests we do for our clients as a matter of pride and internal tracking. In that time-frame we have developed over 8 million tests on 3000+ projects (yes, that’s million with an M!).

On to our numbers…

I’m sure we can all agree that executing a test manually takes more time than executing it automated. What we tried to do is average out the time it would normally take to run through a standard manual test case (with the assumption that these tests are required to be executed for multiple iterations). We also looked at the time it would take to develop a manual test case, using the standard manual test case narrative approach.

LogiGear’s assumption is that it takes roughly 10 minutes to execute and document a standard test case manually and 6 minutes to write that same test. Again, these are averages – we’re sure some would agree and some would disagree with these figures.

In test automation, if your test cases are not running faster, then you are doing something wrong. Automation should allow you to execute your test cases at least 3 to 5 times faster than conventional manual testing or 2 minutes for each test in our assumptions. LogiGear also feels that our keyword test design approach, Action Based Testing, should allow testers to write and develop tests faster; this improvement in time should be at least 2 times faster, or 3 minutes for each test in our calculations.

Lastly, we factor in the cost of developing and running these test cases both manually and automatically. The rate we use to calculate is our cost to do this work offshore in our Vietnam Testing Facility, which provides low-cost, high-expertise testing services. The rate we use as a comparison, to reflect the savings, is the cost of doing this same work in the United States.

Here is a simple matrix of our data we used for our calculation:
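In code form, those assumptions combine into a small calculation. This Python sketch uses hypothetical hourly rates as placeholders, not our actual onshore and offshore figures:

    # Time assumptions from above, in minutes per test case.
    MANUAL_DESIGN, MANUAL_EXEC = 6, 10   # narrative design and manual execution
    AUTO_DESIGN,   AUTO_EXEC   = 3, 2    # 2x faster design, 5x faster execution

    US_RATE, OFFSHORE_RATE = 60.0, 15.0  # hypothetical $/hour placeholders

    def presumed_savings(tests, runs_per_test):
        """Manual cost at US rates minus automated cost at offshore rates."""
        manual    = tests * (MANUAL_DESIGN + MANUAL_EXEC * runs_per_test) / 60.0 * US_RATE
        automated = tests * (AUTO_DESIGN   + AUTO_EXEC   * runs_per_test) / 60.0 * OFFSHORE_RATE
        return manual - automated

    print(f"${presumed_savings(tests=100, runs_per_test=5):,.2f} presumed saved")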


An Example of our Approach:

Let’s use a test example of checking a website registration system. This system has a registration dialogue and we want to test it for different inputs. Your organization might write test cases in a classic test case narrative format, like the example below:
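(An illustrative rendering; the names, fields, and messages are hypothetical:)

    Test case: Register a new user
    Step 1. Open the registration page.
            Expected: the registration dialogue is displayed.
    Step 2. Enter "Pat" in First Name and "Jones" in Last Name.
            Expected: the fields accept the input.
    Step 3. Enter "pat.jones@example.com" in Email and click Register.
            Expected: the message "Registration successful" appears.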

The above method is not very friendly to automation, nor is it very efficient test case design.

However, for keyword automation, not only is the test case designed for automation, but it’s also a very efficient way of documenting tests.

Below, we took the same set of tests that we created above using a traditional narrative approach with steps and expected outcomes, and reflected them as keywords.

At LogiGear, our tests look like this (plus!! we then automate these keywords):
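(An illustrative fragment of keyword-style tests for the same registration scenario; actual actions and columns vary by project:)

    open page       Registration
    enter           first name    Pat
    enter           last name     Jones
    enter           email         pat.jones@example.com
    click           Register
    check message   Registration successful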

This is a very efficient way of designing tests, and they are in a format that allows for easy test automation.


We’re here to help you get the best return on your testing investment possible. You can find case studies on our client success with Action Based Testing and TestArchitect.

Contact us today to find out how LogiGear can save your company money.


You can also find case studies of companies that have adopted LogiGear’s low-cost test services and automation methods.


Result Overview from the STATE-OF-THE-PRACTICE Survey

By Michael Hackett

I am Senior Vice President at LogiGear. My main work is consulting, training, and organizational optimization. I’ve always been interested in collecting data on software testing – it keeps us rooted in reality and not in some wonkish fantasy about someone’s purported best practice! Just as importantly, many software development teams can easily become myopic, seeing what they do as normal or “what everyone does.” I also wanted to do a survey to keep my view of testing practices current. I am always looking to incorporate the data I glean about new ideas and methods into my training and consulting programs when I work with clients at LogiGear.

In 2009 and 2010, I conducted a large survey called “What is the current state-of-the-practice of testing?” I opened the survey in the first part of 2009 and collected data for an entire year. Invitations were sent out to testers from around the world – since software testing is a global practice, with experts and engineers having all sorts of ideas on how to do certain methods and practices differently, I wanted to capture and understand that diverse cross-section of ideas.

Some of the data was pretty much what you’d expect, but for some of the sections – especially those around outsourcing, offshoring, automation, and Agile – the answers were quite surprising.

This first article is designed to give you an introduction to my approach and some preliminary findings. I hope to share more data, and my interpretations of those results, with you over the coming months.

The Goals

My goal in doing the survey was to move away from guesses about what is happening, and what is common, and move to using actual data to provide better solutions to a wider variety of testing situations. First, we need to better understand the wide diversity of common testing practices already in use and how others are using these processes and techniques for success or failure. In order to make positive changes and provide useful problem solving methods in software development and specifically testing, we need to know what is actually happening at the ground level, not what a CTO might think is happening or wants to happen!
Also, when I write a white paper or article I want it to reference and contrast real-world testing and software development. I hope this will help many teams in checking their practice against some broader test/dev situations as well as give realistic ideas for improvement based on what other teams and companies may really be doing!

The Questions

I wrote the survey on a wide variety of current topics, from testing on agile projects, to opinions of offshore teams, to metrics. The survey also featured question sets on the size and nature of teams, training and skills, the understanding of quality, test artifacts, and the politics of testing.

The Sample Set

This was a very large survey, with over 100 multiple-choice questions combined with several fill-in, essay-type responses. The survey was meant to be cafeteria style; that is, testers could choose sections that applied to their work or area of expertise and ignore or skip those that did not apply to them, professionally or by interest. For example, there were sections for “teams that automate,” “teams that do not automate,” teams that self-describe as agile, offshore teams, onshore teams, etc. So no one was expected to complete the entire survey.

Some Sample Responses

Here are some preliminary findings from my survey. Analyzing the entire survey will take more time, but I did want to put out a selection of findings to give you an idea of what type of information I will be sending out. I picked some responses because they were interesting, confirming common ideas, and others because they were surprising, exposing rarely discussed issues, poor planning, or old ideas. I’ve broken them down into four sections: “answers that I expected,” “conventional wisdom that seems validated,” “answers that were far from uniform,” and some “surprising data” that was in some cases unexpected.

We received responses from 14 countries!

So here we go:

Answers that were along the lines I expected:

Question: Test cases are based primarily on…
A – Requirements Documents: 62%
B – Subject Matter Expertise: 12%

The overwhelming majority of teams still begin their work referencing requirements documents. However good, however bad, complete or too vague – most people start here. I did think the number of teams starting their test cases from workflows and user scenarios – using their subject matter expertise – would be higher. How a user completes some transaction or task is, I guess, still secondary to the requirement.

Conventional Wisdom that was validated:

Question: What is the name of your team/group?
A – QA: 48.8%
B – Testing: 20.5%

This confirms conventional wisdom, but it still surprised me. It is definitely a trend – at least in Silicon Valley – to move teams away from the outdated term “QA,” since the people who test rarely, if ever, really do QA. If you are a tester and you think you do QA, please return to 1985. It is interesting, though, that the number calling themselves QA has dropped below 50% — as time goes on this number will continue to drop.

60% of all respondents write test plans for each project

Here is some more conventional wisdom – this can be a great point of reference when you are debating whether to write a test plan for each project.

Far from Uniform Answers:

Question: Educational level (selected responses)
A – High School: 3.0%
B – Bachelor of Arts/Sciences: 40.0%
C – Some Graduate Work: 19.0%
D – Masters Degree: 24.6%
E – PhD: 3.0%

It seems conventional wisdom that the vast majority of people who test have university degrees, but I am surprised at how many have done post-graduate work, hold a master’s degree, or have a PhD. It runs against the conventional wisdom that people who test are the least trained on the development team; perhaps they are the most educated!

Surprising Data:

34% of all respondents indicated that their regression testing was entirely manual

A very big surprise to me! The lack of automated regression! Wow. That is one of the biggest and most surprising results of the entire survey! Why do 1/3 of teams still do all manual regression? Bad idea, bad business objective.

52% do not test their application/system for memory leaks

The number of teams not doing some variety of memory, stress, DR (disaster recovery), buffer overflow (where applicable), load, scalability, etc. testing was another big surprise. We need to look further into this. Is it bad planning? Lack of tools or skills? Lack of knowledge? Keeping your fingers crossed? In many cases I bet this is bad business planning.

87% of respondents rank offshoring or outsourcing as “successful”

Such a high number of people responding that offshoring and outsourcing was successful goes against the conventional wisdom that it’s the managers who like outsourcing/offshoring while the production staff (the people who actually do the work) are not happy with it!

37% of teams say they do not currently automate tests, with 10% indicating they’ve never tried to automate

That over 1/3 of respondents currently do not automate tests is in line with what I see in my work at many companies, but is contrary to popular belief and any sort of best practice. What I see out in the business world is that teams that automate think everyone automates, and that they automate enough; teams that do not automate see automation as not common, too difficult, or not something testers do. This number is way, way too high. Any team not automating has to seriously look at the service they are providing their organization, as well as the management support they are receiving from that organization!

Agile Series Survey Results from “The State of the Practice Survey”

As part of my ongoing series on Agile for Testers (see this month’s article on People and Practices), I wanted to include the data I collected on Agile development and testing and give you a chance to view it.

Question 1

Have you been trained in Agile Development?
Yes 47.8%
No 52.2%

The fact that more than half of the respondents answered “no” here is troubling in many ways; let’s just stick to the Practices issue. It is clear some of these organizations are calling themselves “agile” with no reality attached. Whether you want to call them “ScrumButts” or refer to them as Lincoln’s 5-legged dog, calling yourself “agile” without implementing practices and training on what this is all about is just not agile! Attempting to be agile without training all the team in the why and how of these practices will fail.

Question 2

Since your move to Agile Development, is your team doing:
More Unit Testing? 50%
Less Unit Testing? 6%
The Same Amount of Unit Testing? 28%
I have no idea? 16%

Ideas to take from this are many: That more “unit” testing is happening in 50% of the responding organizations is a good thing! That more “unit” testing is happening at only 50% of the organizations is a problem. More troubling to me is that 16% have no idea! This is un-agile on so many levels — a lack of communication, no transparency, misguided test efforts — a lack of information on test strategy, test effort, test results — and a lack of teamwork!

Question 3

Does your team have an enforced definition of done that supports an adequate test effort?
Yes 69.6%
No 30.4%

This is encouraging. Hopefully the 30% without a good Done definition are not “ScrumButts” and will be implementing a useful definition of done very soon!

Question 4

What percentage of code is being unit tested by developers before it gets released to the test group (approximately)?
100% 13.6%
80% 27.3%
50% 31.5%
20% 9.1%
0% 4.5%
No Idea 13.6%

I won’t respond again about the No Idea answer, as that was covered above, but it’s important to know that most agile purists recommend 100% unit testing for good reason. If there are problems with releases, integration, missed bugs, and scheduling, look first to increase the percentage of code unit tested!

The Results

The overriding result is that current testing practice is quite diverse! There is no single test practice, no one way to test, and no single preferred developer/tester ratio. Everyone’s situation was different, and even teams in similar situations had very different ideas about their product quality, work success, and job satisfaction!

My Future Plans

I plan to continue to commission surveys as a regular part of my desire to take the pulse of what is really happening in the software development world with regard to testing, rather than relying on postulations from self-described experts. As noted above, because this was a very large survey, I will be publishing sections over the next few months. I look forward to bringing you the exciting, as well as troubling, trends that I’ve drawn from the data I’ve collected.

Professor Jeff Offutt provides some insight to LogiGear Magazine on how to become a software tester

Jeff Offutt – Professor of Software Engineering in the Volgenau School of Information Technology at George Mason University – homepage – and editor-in-chief of Wiley’s journal of Software Testing, Verification and Reliability.

LogiGear: How did you get into software testing? What do you find interesting about it?

Professor Offutt:
When I started college I didn’t know anything about computers. I was a math major and in my first semester, my adviser convinced me to take an introductory programming course that was being specialized for math majors. Programming was taught in the business department, so a math-focused class was quite different and the faculty wanted to make sure enough students took it.

I was immediately hooked. Programming was fun and it was easy! (Unlike calculus, which I didn’t like.) It took me a few semesters to find out that the pre-engineering majors who thought calculus was easy and fun often found programming hard. I was shocked to find that companies actually paid good money to programmers!

But I was frustrated by how much effort it took to write really good software. Design was poorly done, languages and programming tools were terrible (I started with BASIC and COBOL on punched cards!), and debugging was horrible. So I wanted to do whatever I could to help build quality software. When I got to graduate school I met a professor who was working on testing and I quickly found I could apply my love for discrete math (not continuous math) to making better software. So I’ve worked on testing, and especially automated testing, ever since.

LogiGear: What kind of work are you doing? How did you pick those specific topics, anyway?

Professor Offutt: I’ve found that the hardest problem in testing is getting test values. Most other aspects are either easy, straightforward, mechanical, or automated. And I’ve had a passion for inventing criteria and algorithms that can automatically generate good tests ever since my PhD thesis work.

A few years ago I agreed to teach a class on designing and building Web applications. I immediately realized that deploying software on the Web affected every aspect of software engineering, including testing. So I developed a technique for modeling Web software component interactions that can be used for testing (and other activities). I also invented a black-box technique called bypass testing, and I am currently working on a mutation model to test the novel control and state interactions that Web applications use.

I pick problems based on what I have trouble with as a programmer or as a user. And it helps if the problems are interesting to students.

LogiGear: How can a college student prepare to go into software testing and become really good at it? What should he or she look for in teachers, courses, and methods?

Professor Offutt: Testing is truly entering a golden age. The need for software quality has been increasing dramatically, and new ideas like Agile processes put a heavy emphasis on testing. A tester should have two qualities: (1) an innate need to have high quality software, and (2) a very logical mind.

Give me someone who programs a little slowly, but who turns in programs without faults.

LogiGear: What sort of graduate programs should college graduates consider?

Professor Offutt: A testing researcher should be very good at discrete math. Logic and set theory, graphs, grammars and finite state machines — and abstract algebra if she can get it. Look for programs that are based on 21st-century concerns. Look for universities that have lots of software engineering classes instead of the standard one. Do they have an undergrad and graduate course in testing? Do they have more than one? How many faculty list software engineering and software testing as their FIRST research area? Many CS programs are teaching the same material that I learned as a student in the early 1980s. The industry has changed completely — how can all that material still be relevant? The answer is easy: Most of it is not.

LogiGear: Also, in your opinion, what are some of the more interesting research questions people are asking now and what do you think they’ll be researching in, say, 5 years?

Professor Offutt: New technologies (like the Web) are great sources for new software testing research problems. Emergent properties like security and usability are also major growth areas for the near future. Research is like science fiction — it takes 15 or 20 years to go from research ideas to practical use; so what will we need 15 years from now? The Web is a great example. All the ideas were laid out in PhD theses and conference papers in the 1970s and 1980s, then the Web was created from those ideas in about 1990, and finally Web applications were being developed a few years later.

LogiGear: Thank you, Professor Offutt.

10 Essentials for Effective Test Automation

Test automation can provide great benefits to the software testing process and improve the quality of the results, but its use must be justified and its methods effective.

The reasons to automate software testing lie in the pitfalls of manual software testing…

As we all know too well, the average manual software testing program:

- Is slow and costly
- Is difficult to manage
- Does not scale well
- Is not consistent or repeatable

(Adapted from the LogiGear white paper "Offshore Software Test Automation: A Strategic Approach to Cost and Speed Effectiveness.")

Effective test automation resolves each of these issues, allowing management to:

  • Drive down costs
  • Bring software to market faster
  • Gain critical visibility into QA status

How, then, can test automation be made effective?

The most essential element of effective software test automation is a strong foundation in methodology. Methodology drives tool selection and the rest of the automation process. It also helps to drive the approach to offshoring the “appropriate” pieces of the testing process.

Here is the short list every manager needs to make methodology work for them:

10 Essentials for Effective Test Automation:

  1. Know the steps of the software development process and how they relate to each other.
  2. Have a solid understanding of the required planning.
  3. Understand that software testing is a strategic effort.
  4. Commit to giving software testing its own budget and funding.
  5. Use the Action Based Testing (ABT) methodology and choose the right enabling technologies that support it.
  6. Put in place the right people with the proper skills and training.
  7. Separate test design from test automation so that automation does not dominate test design.
  8. Lower costs by using less expensive labor than a local team.
  9. Integrate global resourcing strategies and best practices.
  10. Jumpstart the process with a pre-trained outsourcing partner.

Professor Jeff Offutt provides some insight to LogiGear Magazine on how to become a software tester

Jeff Offutt – Professor of Software Engineering in the Volgenau School of Information Technology at George Mason University – homepage – and editor-in-chief of Wiley's journal Software Testing, Verification and Reliability.

LogiGear: How did you get into software testing? What do you find interesting about it?

Professor Offutt:
When I started college I didn’t know anything about computers. I was a math major, and in my first semester my adviser convinced me to take an introductory programming course that was being tailored to math majors. Programming was taught in the business department, so a math-focused class was quite different, and the faculty wanted to make sure enough students took it.

I was immediately hooked. Programming was fun and it was easy! (Unlike calculus, which I didn’t like.) It took me a few semesters to find out that the pre-engineering majors who thought calculus was easy and fun often found programming hard. I was shocked to find that companies actually paid good money to programmers!

But I was frustrated by how much effort it took to write really good software. Design was poorly done, languages and programming tools were terrible (I started with BASIC and COBOL on punched cards!), and debugging was horrible. So I wanted to do whatever I could to help build quality software. When I got to graduate school I met a professor who was working on testing and I quickly found I could apply my love for discrete math (not continuous math) to making better software. So I’ve worked on testing, and especially automated testing, ever since.

LogiGear: What kind of work are you doing? How did you pick those specific topics, anyway?

Professor Offutt: I’ve found that the hardest problem in testing is getting test values. Most other aspects are either easy, straightforward, mechanical, or automated. And I’ve had a passion for inventing criteria and algorithms that can automatically generate good tests since my PhD thesis work.

A few years ago I agreed to teach a class on designing and building Web applications. I immediately realized that deploying software on the Web affected every aspect of software engineering, including testing. So I developed a technique for modeling Web software component interactions that can be used for testing (and other activities). I also invented a black-box technique called bypass testing, and I am currently working on a mutation model to test the novel control and state interactions that Web applications use.

I pick problems based on what I have trouble with as a programmer or as a user. And it helps if the problems are interesting to students.

LogiGear: How can a college student prepare to go into software testing and become really good at it? What should he or she look for in teachers, courses, and methods?

Professor Offutt: Testing is truly entering a golden age. The need for software quality has been increasing dramatically, and new ideas like Agile processes put a heavy emphasis on testing. A tester should have two qualities: (1) an innate need to have high quality software, and (2) a very logical mind.

Give me someone who programs a little slowly, but who turns in programs without faults.

LogiGear: What sort of graduate programs should college graduates consider?

Professor Offutt: A testing researcher should be very good at discrete math. Logic and set theory, graphs, grammars and finite state machines — and abstract algebra if she can get it. Look for programs that are based on 21st-century concerns. Look for universities that have lots of software engineering classes instead of the standard one. Do they have an undergrad and graduate course in testing? Do they have more than one? How many faculty list software engineering and software testing as their FIRST research area? Many CS programs are teaching the same material that I learned as a student in the early 1980s. The industry has changed completely — how can all that material still be relevant? The answer is easy: Most of it is not.

LogiGear: Also, in your opinion, what are some of the more interesting research questions people are asking now and what do you think they’ll be researching in, say, 5 years?

Professor Offutt: New technologies (like the Web) are great sources for new software testing research problems. Emergent properties like security and usability are also major growth areas for the near future. Research is like science fiction — it takes 15 or 20 years to go from research ideas to practical use; so what will we need 15 years from now? The Web is a great example. All the ideas were laid out in PhD theses and conference papers in the 1970s and 1980s, then the Web was created from those ideas in about 1990, and finally Web applications were being developed a few years later.

LogiGear: Thank you, Professor Offutt.

What about the V-Model?

The V-Model for Software Development specifies 4 kinds of testing:

  • Unit Testing
  • Integration Testing
  • System Testing
  • Acceptance Testing

You can find more information here (Wikipedia):

http://en.wikipedia.org/wiki/V-Model_%28software_development%29#Validation_Phases

What I’m finding is that, of those, only unit testing is clear to me. The other kinds may be good phases in a project, but for test design they don’t help much. It is hard to say which tests should go in a system test, an integration test, or an acceptance test.

In Action Based Testing(tm) (ABT) we have a concept called “test modules” that I feel can work much better. A test module is a collection of tests with a similar scope, designed to be executed together. In practice the test module is a spreadsheet-like document made up of “actions”: lines specifying test actions (and checks), each starting with an “action keyword” (or “action word”) followed by its arguments.
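
To make the idea concrete, here is a minimal sketch in Python of how such action lines could be interpreted. The action words (“enter customer”, “deposit”, “check balance”) and the sample module are hypothetical illustrations, not taken from any actual ABT tool:

```python
# A minimal sketch of keyword-driven ("action word") test execution.
# The action names and the sample test module are hypothetical,
# invented only to illustrate the keyword-plus-arguments idea.

def enter_customer(name, account):
    print(f"entering customer {name} with account {account}")

def deposit(account, amount):
    print(f"depositing {amount} into {account}")

def check_balance(account, expected):
    # A real framework would query the system under test here
    # and record a pass/fail result instead of printing.
    print(f"checking that {account} has balance {expected}")

# Map each action keyword to the code that implements it.
ACTIONS = {
    "enter customer": enter_customer,
    "deposit": deposit,
    "check balance": check_balance,
}

# A test module: each line is an action keyword followed by arguments.
test_module = [
    ["enter customer", "John Doe", "ACC-1"],
    ["deposit", "ACC-1", "100.00"],
    ["check balance", "ACC-1", "100.00"],
]

for keyword, *args in test_module:
    ACTIONS[keyword](*args)
```

One design point this sketch captures: the test module contains no automation code at all, so test design (the action lines) stays separate from test automation (the functions behind the keywords).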

In ABT we focus strongly on what we call the high-level test design, in which we identify the test modules and groups of test modules. I have written about this in my articles on test design. This process depends on the context of the project and the system under test, and can therefore lead to different results in different circumstances. What it typically never leads to is “integration testing”, “system testing” or “acceptance testing”. Rather, it will be a mix of modules like “user interface tests”, “financial transactions”, “database integrity”, “security”, etc.

Once a list is established, the next step is to make a schedule specifying, for each test module, (1) when to develop the module and (2) when to execute it. The development scheduling typically depends on the availability of specifications: for a mortgage-calculation test module this will be early in a project, when the business rules are expected to be available, while for a UI test of a dialog one has to wait until the dialog has been specified in detail. For the execution scheduling the order generally reverses: tests that verify details of the user interaction need to pass before it makes sense to run tests that enter and verify financial transactions.
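
As a rough sketch of that scheduling idea (the module names and phase numbers below are hypothetical, chosen only to mirror the mortgage and UI examples above), sorting the same modules by development phase and by execution phase shows how the order reverses:

```python
# Hypothetical test modules, each tagged with the project phase in which
# it can be developed (specs available) and the phase in which it should
# be executed.
modules = [
    {"name": "mortgage calculation",   "develop": 1, "execute": 3},
    {"name": "financial transactions", "develop": 2, "execute": 2},
    {"name": "UI dialog details",      "develop": 3, "execute": 1},
]

print("Development order:")
for m in sorted(modules, key=lambda m: m["develop"]):
    print(" ", m["name"])

print("Execution order:")
for m in sorted(modules, key=lambda m: m["execute"]):
    print(" ", m["name"])
```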

The result is that the V-Model is still visible as a pattern, in the sense that tests developed earlier are commonly executed later. However, the traditional interpretation of “integration testing”, “system testing” and “acceptance testing” is not commonly seen, and in particular does not appear to be a good starting point for test development planning.