How do I test this?
Strategy for test planning, test
case design, and test case organization
for a DVT QA Engineer
Abstract
This paper maps out a strategy for quality assurance and testing
of computer software and hardware. Testing is broken down into
categories that can be used to organize cases in a test case
management system, plan and manage testing work, and refine
quality assurance strategy.
Meaningful categories of tests and cases are
- Make it work
- Make it fail
- Performance measurement
- Conformance to spec
- Initialization
- Longevity
- Customer complaints
- Forensic analysis
- Problem history
There are critical assumptions about limited availability of
product design information, proper use of a bug tracker, and
arbitrating potentially conflicting designs. The conclusions are
that the strategy described in this document is workable but
depends on good quality assurance processes.
Audience
This paper is written by and from the perspective of a DVT (Device
Validation and Test) QA Engineer. Testing devices, firmware, and
low-level drivers is unique and fundamentally different from
testing user interfaces, application software, and systems that
interact with people rather than hardware and operating systems.
The test strategies outlined in this paper are intended for
testing hardware, devices, and firmware rather than user
interfaces or application software.
Introduction
The inevitable question in a Test Engineer’s job interview is some
form of “how do I test this?”. The question may take several
different forms. Some questioners point to something simple such
as a pencil or a phone on the table, or some other object quite
unrelated to the company’s line of business. The intention is to
elicit some original thinking about testing. Other interviewers
will talk directly about their product.
Regardless of how the question is posed, the underlying question
is really the same: How should a team go about testing a hardware
or software product? What are the important and unimportant test
cases? What is an intelligent way to organize test cases? What
information is needed to develop test strategy and to design
cases? Where is the information? What if the
information is not available? What is the role of a bug tracker in
this process?
This paper offers a method for organizing test cases and getting
information needed to develop test strategy. The first test steps
are to find out how to make the product work and fail, measure
performance, and determine conformance to specification. There are
test cases focused on initialization, longevity, forensics, and
customer complaints. This strategy depends on certain assumptions
about process, availability of information, and design
inconsistencies that are endemic to a normal organization.
Make it Work
The first step in the process of testing a product is to find out
how to make it work. Products are designed and built to serve a
useful purpose, and no matter how immature or unstable a product
is, it is important to know how its real behavior deviates from
that intended purpose. Therefore a sequence of cases must exist to show what it
takes to make the product work and behave in a stable fashion.
Young products may lack significant features of intended design so
tests to make it work should not be confused with functional tests
or attempts to determine conformance to specification. Cases to
“make it work” are especially important and relevant with products
that are young, unstable, not fully developed, or have feature
sets that are not fully implemented. In these cases the product
will not conform to spec because the specification is not fully
implemented; the test objective is simply to define clearly what
it takes to make the product work.
Make it Fail
Once a product acquires a certain level of stability, consistency,
and fidelity to design, the task is to make it fail. Pedantic
students of quality assurance who have more education than
experience may call this something like “boundary testing”. The
objective is to develop as many cases as humanly possible that
induce errors, failures, crashes, or otherwise cause the product
to deviate from intended purpose. Some cases inevitably cross the
line of being practical or realistic, such as testing a
battery-powered SBC in an oven at 250 degrees, but the objective
is to find the boundary between function and failure.
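As a concrete illustration, a minimal sketch of a failure-induction
case is shown below in Python with pytest. The device API (a "dut"
module with set_block_size and a DeviceError exception) is entirely
hypothetical; the point is the pattern of deliberately feeding
out-of-range values and expecting a clean, controlled failure.

import pytest
dut = pytest.importorskip("dut")  # hypothetical device-under-test library

# Values deliberately chosen at and beyond the plausible working range.
BAD_BLOCK_SIZES = [0, 1, 511, 4097, 2**31 - 1, -1]

@pytest.mark.parametrize("size", BAD_BLOCK_SIZES)
def test_reject_out_of_range_block_size(size):
    # The product should fail cleanly with an error, never hang,
    # crash, or silently corrupt state.
    with pytest.raises(dut.DeviceError):
        dut.set_block_size(size)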
Measure Performance
It is important to measure and quantify the performance and
behavior of a product, starting from its inception. In some cases
the appropriate measures are apparent such as wall-clock time to
perform a given functional test, or a ‘run rate’ in the form of
megabits per second. In some cases the desired measures may seem
obscure. Any set of measures is better than none. The measures
should be meaningful and reproducible.
The basic measures to quantify are time to start up and initialize
(cold, warm, and restart), shut down, apply an upgrade, apply a
patch, and perform certain tasks consistent with the product’s
intended purpose. Records of performance must be maintained in
order to have history and baselines of normal behavior so that
newer releases, builds, and features can be intelligently
assessed.
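A minimal sketch of such a measurement is shown below in Python.
The service name and build label are hypothetical placeholders; the
essential points are that each measure is reproducible and that
every result is appended to a running history so baselines can be
compared from build to build.

import csv, subprocess, time
from datetime import date

def timed(cmd):
    # Wall-clock time to run a shell command to completion.
    start = time.monotonic()
    subprocess.run(cmd, shell=True, check=True)
    return time.monotonic() - start

measures = {
    "cold_start_s": timed("service productd start"),  # hypothetical service
    "shutdown_s":   timed("service productd stop"),
}

# Append to a running history so each build has a baseline to compare against.
with open("perf_history.csv", "a", newline="") as f:
    writer = csv.writer(f)
    for name, seconds in measures.items():
        writer.writerow([date.today().isoformat(), "build-1234", name, round(seconds, 3)])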
This category of performance measurement is for quality assurance
purposes and must not be co-mingled with performance measurements
done on behalf of marketing and product management teams who use
and consume performance information for very different reasons
such as competitive analysis, sales and marketing collaterals, or
public relations.
Stress testing is conspicuously absent from this regime of
performance measurement. Stress testing is a greatly over-rated
form of quality assurance. Any fool without much thought or
careful design can slap together an overwhelming workload and
throw it at a hardware or software product, and very little useful
information will be gained. The first thing most customers do in a
product evaluation or acceptance is load it up with work, so it is
best to let them do the load tests because they are going to do it
anyway.
The intelligent alternatives to stress testing are (1) a
well-thought-out collection of performance measurements that
meaningfully characterize the product and identify performance
changes from one build to the next, and (2) carefully designed and
crafted test cases that make things fail. Test cases that pass all
the time have no value.
Conformance to Spec
Testing for conformance to specification is a super-set of “make it
work”. The difference is coverage. Conformance testing
encompasses cases in four categories of conformance to a
specification set. Conformance to spec tests exercise a product to
make sure that it does what it is intended to do. The pedantic
students of quality assurance who have more education than
experience call this something like “functional testing”.
A closer look at exactly what conformance to spec means, and why
testing for it matters, is warranted. Testing for conformance to
spec falls into four discrete categories, each of which must have
cases:
- Conformance to the specification that QA
writes
- Conformance to written product definition
- Consistency with customer documentation
- Consistency with visions articulated by key
product architects
The specification that QA writes represents full coverage of
conformance to specification. Typically a company’s pool of
product design documentation does not encompass and reflect all
the intended product functions. If a company has a full and
complete set of product design documents this means there are
people in the company who are not busy enough. Smart companies put
more effort into the design and engineering of a product than
documenting all aspects of how it is intended to work.
While most college courses in quality assurance will stress the
importance and value of product design documentation, and most of
us would agree with this, the reality in the world of product
development is that the luxury of time to fully document design
specifications gives competitors more time to beat you to market.
So a fact of life in quality assurance is that design
documentation needed to develop comprehensive test cases is always
incomplete, and this should be expected.
Because incomplete product design documentation is a fact of life
in QA, an important task of quality assurance is to scan all
available evidence by reading available product documentation and
interviewing large numbers of people. Based on information from
documents and interviews, test cases are written to reflect
everything the product is supposed to do, covering documented
features as well as intended behavior that is not documented.
Product designs, and the design documentation, are almost
continuously changing as products evolve and improve. Changing
designs are often the subject of office political intrigue, and
subject to almost continuous disagreement. QA must ignore some of
this chaff, write a large number of comprehensive cases to reflect
all designs, including conflicting and contradictory
specifications; run the tests, write the bugs, and let the system
take care of managing the issues.
Note that in this case QA is actually writing a product
specification. Many practitioners and professors of quality
assurance say that quality assurance should never write design
specifications. But the fact is that cases must be written to
achieve full coverage, including coverage of areas that are not
documented, areas that people in product design never thought of
but should have.
Testing for conformance to product definition is a subset of
conformance to spec. The difference is that conformance to
definition uses existing internal company product
design documentation as the standard. This documentation would
include the myriad of technical design documents that development
engineers refer to in order to build the product. Some of this
documentation holds company trade secrets and is typically company
confidential.
Testing for conformance to customer documentation is also a
relevant subset of testing conformance to spec. Test cases are
built from examination of user guides, installation guides,
read-me files, promotional materials, marketing collaterals, and
all other customer consumable documents. The test objectives are
to make sure the product does what the customer-consumable
documents say it does. This work can be further sub-divided into
cases reflecting post-sale customer user documentation; and
pre-sales, promotional documents that customers typically study
when evaluating a product for purchase.
Testing against “visions”, or what some engineers call
‘marketing’s hallucinations’, is to make sure the product does, at
a high level, what critical company executives, owners, and
designers originally intended the product to do. These cases
reflect the reasons why the company was incorporated in the first
place. This information is typically obtained from in-person
interviews.
Information sources may be design documents, private placement
memos, e-mail correspondence, notes in a lab-book, or other
written and verbal statements of vision, goals, objectives, and
customer problems to solve. It is important to supplement
documentation with personal interviews with these key players:
Founders, sales and marketing executives, program managers,
designers, engineers, and developers.
An expected result of studying documentation and conducting
interviews is a certain amount of inconsistency. This is common
because different people have different ideas about how products
must be designed. The inconsistencies are reconciled by writing
test cases against all elements of vision, writing the inevitable
bug reports, and allowing the normal bug scrub process to sort out
fact from fiction.
Initialization
In my experience, observation and analysis of how a product
starts, stops, and restarts offers real insight into the behavior
and weaknesses of a product. A well-developed set of cases that
covers all aspects of startup and restart is therefore an
important part of test design and strategy. Key elements of
startup tests are
- normal startup and shutdown
- cold starts and restarts
- shutdown and restart with special user modes
such as single user or debug mode
- with any tracing options that may be available
such as using gcc
- start from power on vs start from an operating
system that is up and in a ready state
- shutdown and power off vs shut down to a state
where the host computer and operating system remains up
- startup following cold start of the operating
system
- restart the software or device under test
repeatedly, for as many cycles as time allows, and watch for
failure (a sketch of such a restart loop appears below)
- shutdown and restart following installation of
patches, fixes, upgrades, and new releases.
Some of this depends on what is being tested, such as whether it
is an application software product or a hardware system component
or a device driver or an operating system.
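The restart loop mentioned above can be as simple as the following
Python sketch. The start, stop, and health-probe commands are
hypothetical placeholders; the real control hooks depend on whether
the test object is an application, driver, device, or operating
system.

import subprocess, sys, time

START = "service productd start"   # hypothetical start command
STOP  = "service productd stop"    # hypothetical stop command
PROBE = "productctl status"        # hypothetical health check

cycle = 0
while True:                        # run until a failure is observed
    cycle += 1
    for cmd in (START, PROBE, STOP):
        result = subprocess.run(cmd, shell=True)
        if result.returncode != 0:
            print(f"cycle {cycle}: '{cmd}' failed with code {result.returncode}")
            sys.exit(1)            # stop here so the failing state can be examined
    time.sleep(2)                  # brief settle time between cycles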
Longevity
Like the startup tests, longevity tests have, in my experience,
flushed out errors, issues, and problems in computer hardware and
software, and therefore make excellent test
cases. The test cases require dedicated resources to allow the
software or device under test to run continuously, uninterrupted,
and without shutdown or restart, for as long as possible. Problems
to expect with longevity testing are buffer overflows, memory
leaks, queue overflows, memory allocation failures, storage access
and allocation failures, and security violations.
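A longevity run needs a monitor alongside it. The following Python
sketch, which assumes the third-party psutil library and a known
process ID (both illustrative), simply samples resident memory at a
fixed interval and logs it, so that slow leaks become visible over
days or weeks.

import time
import psutil                       # third-party process inspection library

PID = 4242                          # hypothetical PID of the process under test
INTERVAL_S = 600                    # sample every ten minutes

proc = psutil.Process(PID)
with open("longevity_rss.log", "a") as log:
    while proc.is_running():
        rss_mb = proc.memory_info().rss / (1024 * 1024)
        log.write(f"{time.time():.0f}\t{rss_mb:.1f}\n")
        log.flush()
        time.sleep(INTERVAL_S)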
Customer Complaints
Customer complaints and expectations are important sources of
information used to develop cases and strategy. There are customer
statements of expectation, and problems and complaints that
customers report. The help desk or technical support organization
is an excellent source for information about patterns of
complaints. It is important to talk directly to customers about
their complaints and expectations.
Quality assurance gets this information by looking at technical
support requests, talking to the help desk teams, and interviewing
customers. When interviewing customers the key questions are: what
expectations were set by the vendor when selling the product, what
are the common complaints, what do you like and dislike about the
product, is the documentation clear, do you read it, what did the
sales and marketing teams say the product does, why did you buy
it, did you get what you wanted, and did you get what you paid
for?
Forensic Analysis
Forensics is a fancy term for all the build and revision level
information associated with a product or device under test. It is
especially important when the test object is a device, and it may
have controllers, chipsets, PCBs, ICs, SOCs, embedded firmware,
operating systems, drivers, CPUs, or other silicon components and
devices that change and are subject to revision level changes.
Examples would be an Android phone, camera, RAID controller, SSD,
motherboard, file server, computer, laptop, Bluetooth radio,
wireless controller, etc.
Seasoned test engineers probably all know from experience that
software anomalies frequently surface after some seemingly
innocuous or unrelated change in an IC, capacitor, firmware build,
lens, power supply, etc. The critical tasks in this area are
simply to enumerate and record all revisions of all components and
to make sure that changes and upgrades are recorded. As with
performance information, when new anomalies surface the objective
is to be able to tell whether a bug is associated with a forensic
change.
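A minimal sketch of such a forensic inventory is shown below in
Python, for a Linux host. The commands are ordinary Linux tools and
the build label is a hypothetical placeholder; the point is that
every revision string is captured alongside the build under test so
later anomalies can be matched against changes.

import json, subprocess
from datetime import datetime

def capture(cmd):
    # Run a shell command and return its trimmed output.
    out = subprocess.run(cmd, shell=True, capture_output=True, text=True)
    return out.stdout.strip()

inventory = {
    "recorded_at": datetime.now().isoformat(timespec="seconds"),
    "build_under_test": "build-1234",                      # hypothetical label
    "kernel": capture("uname -r"),
    "bios": capture("cat /sys/class/dmi/id/bios_version"),
    "pci_devices": capture("lspci -nn"),
}

with open("forensics.json", "a") as f:
    f.write(json.dumps(inventory) + "\n")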
Bug History
“Regression testing” is another example of the educated but
inexperienced using fancy pedantic academic textbook classroom
terminology for common sense quality assurance practices.
There is almost no such thing as a product that has not been
tested before. Development engineers run unit tests and sanity
checks immediately after the first time they get something to
compile. Using historical evidence as the basis for design of test
cases and strategy falls in this category and is often called
regression testing.
A strict focus on development of cases based on prior bugs puts an
excessively narrow perspective on opportunities to develop cases,
induce failure, and write good bugs. As a first step, the
repository of cases must include cases that reflect prior failures
and known issues. Beyond a strict analysis of bug and defect
history is the opportunity to talk with customers and development
engineers about their experiences with the product as it is
developed and determine key issues and patterns of errors.
Key Considerations
Dearth of critical documentation is a fact of
life
In the real world requisite documents in the form of product
designs, specifications, requirements, and the like, either do not
exist or are conspicuously incomplete. If there is enough time at
an organization to fully document product designs, specifications,
and requirements it is likely you either work for the government
or a very rich company; or you are oblivious to some other
competitor who is rapidly eating your lunch. This means that
quality assurance gets to write specifications and product
guidelines, which product management really should be doing, even
though many consider it sinful for quality assurance to do so.
Design Inconsistencies
It is normal and to be expected that different stakeholders in a
company will have different ideas on how a product should be
designed or implemented. QA should not be in the business of
arbitrating or evaluating conflicting designs. The responsibility
of QA is to ascertain a design, specification, or feature; and
write plans and cases to test it. Conflicting and contradictory
designs mean conflicting cases and possibly contradictory bug
reports. The job for QA is to write the test cases and run the
tests. If there are failures or issues the job is to write the bug
and run it through the process of bug scrubs. This creates a forum
and process for orderly resolution of conflicting and
contradictory designs.
Write a lot of bugs
While this may seem self-evident, it is important for test
engineers and managers to develop a mind-set of writing as many
bugs as possible. Any issue, seemingly trivial or severe, should
be written up. If there is a question of whether an issue should
be written up, then write the bug and let a good quality assurance
process manage the issue.
Some managers complain about trivial bugs or place too much
emphasis on the value of finding critical bugs. If a team is not
writing lots of bugs then something is wrong, even if the product
under test is mature and stable. Trivial bugs sometimes mask more
severe underlying issues. Bug trackers allow assigning severity
and priority to issues, which lets trivial issues take care of
themselves without annoying managers, who should be focused on
more substantive issues. The golden ratio of bugs in QA is 4:2:1:
for every four trivial issues there are two moderate issues and
one that is critical.
If you work the system, the system works for you
The processes and strategies described in this paper can and will
work effectively if a reasonable QA process is followed. This
simply means writing, documenting, and substantiating good bug
reports; and an orderly process for periodic bug scrubs. Bug
tracker software must allow for assignment of priority and
severity and for communicating bug details and summary reports to
key players, and there must be regular participation by key
players in scrubbing bugs, assigning work, and arbitrating design
differences.
Summary and Conclusions
This paper explains a strategy for developing test cases and
testing a technology product. Key elements of this test strategy
are to make it work, make it fail, start and restart it, and run
it perpetually; to measure performance; and to compare actual and
intended behaviors against four different standards.
Customer complaints, problem history, and forensics are useful
information resources for developing test strategy and isolating
errors. Stress testing is conspicuously absent from this strategy
because it has limited value. Design documentation in a normal
company is typically incomplete and quality assurance is
responsible for filling the information gaps that are needed to
develop strategy and cases.
The strategies described in this paper are effective if normal and
customary QA processes are followed: using a bug tracker to manage
issues, logging bugs, reporting and communication, participation
by all stake holders, thoughtful assignment of priority and
severity to issues, and attending bug scrubs to make decisions and
arbitrate potentially divergent designs.