How do I test this?
Strategy for test planning, test
case design, and test case organization
for a DVT QA Engineer
Abstract
This paper maps out a strategy for quality assurance and testing
of computer software and hardware. Testing is broken down into
categories that can be used to organize cases in a test case
management system, plan and manage testing work, and refine
quality assurance strategy.
Meaningful categories of tests and cases are
- Make it work
- Make it fail
- Performance measurement
- Conformance to spec
- Initialization
- Longevity
- Customer complaints
- Forensic analysis
- Problem history
There are critical assumptions about limited availability of
product design information, proper use of a bug tracker, and
arbitrating potentially conflicting designs. The conclusions are
that the strategy described in this document is workable but
depends on good quality assurance processes.
Audience
This paper is written by and from the perspective of a DVT (Device
Validation and Test) QA Engineer. Testing devices, firmware, and
low-level drivers is unique and fundamentally different from
testing user interfaces, application software, and systems that
interact with people rather than hardware and operating systems.
The test strategies outlined in this paper are intended for
testing hardware, devices, and firmware rather than user
interfaces or application software.
Introduction
The inevitable question in a Test Engineer’s job interview is some
form of “how do I test this?”. The question may take several
different forms. Some questioners point to something simple such
as a pencil or a phone on the table, or some other object quite
unrelated to the company’s line of business. The intention is to
elicit some original thinking about testing. Other interviewers
will talk directly about their product.
Regardless of how the question is posed, the underlying question
is really the same: How should a team go about testing a hardware
or software product? What are the important and unimportant test
cases? What is an intelligent way to organize test cases? What
information is needed to develop test strategy and to design
cases? Where is the information? What if the
information is not available? What is the role of a bug tracker in
this process?
This paper offers a method for organizing test cases and getting
information needed to develop test strategy. The first test steps
are to find out how to make the product work and fail, measure
performance, and determine conformance to specification. There are
test cases focused on initialization, longevity, forensics, and
customer complaints. This strategy depends on certain assumptions
about process, availability of information, and design
inconsistencies that are endemic to a normal organization.
Make it Work
The first step in the process of testing a product is to find out
how to make it work. Products are designed and built to serve a
useful purpose, and no matter how immature or unstable a product
is, it is important to know how its real behavior deviates from
that intended purpose. Therefore a sequence of cases must exist to show what it
takes to make the product work and behave in a stable fashion.
Young products may lack significant features of intended design so
tests to make it work should not be confused with functional tests
or attempts to determine conformance to specification. Cases to
“make it work” are especially important and relevant with products
that are young, unstable, not fully developed, or have feature
sets that are not fully implemented. In these cases the product
will not conform to spec because the specification is not fully
implemented; the test objective is simply to define clearly what
it takes to make the product work.
Make it Fail
Once a product acquires a certain level of stability, consistency,
and fidelity to design, the task is to make it fail. Pedantic
students of quality assurance who have more education than
experience may call this something like “boundary testing”. The
objective is to develop as many cases as humanly possible that
induce errors, failures, crashes, or otherwise cause the product
to deviate from intended purpose. Some cases inevitably cross the
line of being practical or realistic, such as testing a
battery-powered SBC in an oven at 250 degrees, but the objective
is to find the boundary between function and failure.
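As a concrete illustration, a minimal sketch of a failure-induction
case is shown below in Python with pytest. The device API (a "dut"
module with set_block_size and a DeviceError exception) is entirely
hypothetical; the point is the pattern of deliberately feeding
out-of-range values and expecting a clean, controlled failure.

import pytest
dut = pytest.importorskip("dut")  # hypothetical device-under-test library

# Values deliberately chosen at and beyond the plausible working range.
BAD_BLOCK_SIZES = [0, 1, 511, 4097, 2**31 - 1, -1]

@pytest.mark.parametrize("size", BAD_BLOCK_SIZES)
def test_reject_out_of_range_block_size(size):
    # The product should fail cleanly with an error, never hang,
    # crash, or silently corrupt state.
    with pytest.raises(dut.DeviceError):
        dut.set_block_size(size)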
Measure Performance
It is important to measure and quantify the performance and
behavior of a product, starting from its inception. In some cases
the appropriate measures are apparent such as wall-clock time to
perform a given functional test, or a ‘run rate’ in the form of
megabits per second. In some cases the desired measures may seem
obscure. Any set of measures is better than none. The measures
should be meaningful and reproducible.
The basic measures to quantify are time to start up and initialize
(cold, warm, and restart), shut down, apply an upgrade, apply a
patch, and perform certain tasks consistent with the product’s
intended purpose. Records of performance must be maintained in
order to have history and baselines of normal behavior so that
newer releases, builds, and features can be intelligently
assessed.
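A minimal sketch of such a measurement is shown below in Python.
The service name and build label are hypothetical placeholders; the
essential points are that each measure is reproducible and that
every result is appended to a running history so baselines can be
compared from build to build.

import csv, subprocess, time
from datetime import date

def timed(cmd):
    # Wall-clock time to run a shell command to completion.
    start = time.monotonic()
    subprocess.run(cmd, shell=True, check=True)
    return time.monotonic() - start

measures = {
    "cold_start_s": timed("service productd start"),  # hypothetical service
    "shutdown_s":   timed("service productd stop"),
}

# Append to a running history so each build has a baseline to compare against.
with open("perf_history.csv", "a", newline="") as f:
    writer = csv.writer(f)
    for name, seconds in measures.items():
        writer.writerow([date.today().isoformat(), "build-1234", name, round(seconds, 3)])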
This category of performance measurement is for quality assurance
purposes and must not be co-mingled with performance measurements
done on behalf of marketing and product management teams who use
and consume performance information for very different reasons
such as competitive analysis, sales and marketing collaterals, or
public relations.
Stress testing is conspicuously absent from this regime of
performance measurement. Stress testing is a greatly over-rated
form of quality assurance. Any fool without much thought or
careful design can slap together an overwhelming workload and
throw it at a hardware or software product, and very little useful
information will be gained. The first thing most customers do in a
product evaluation or acceptance is load it up with work, so it is
best to let them do the load tests because they are going to do it
anyway.
The intelligent alternatives to stress testing are (1) a
well-thought-out collection of performance measurements that
meaningfully characterize the product and identify performance
changes from one build to the next, and (2) carefully designed and
crafted test cases that make things fail. Test cases that pass all
the time have no value.
Conformance to Spec
Testing for conformance to specification is a super-set of “make it
work”. The difference is coverage. Conformance testing
encompasses cases in four categories of conformance to a
specification set. Conformance to spec tests exercise a product to
make sure that it does what it is intended to do. The pedantic
students of quality assurance who have more education than
experience call this something like “functional testing”.
A closer look at exactly what conformance to spec means, and why
testing for it matters, is warranted. Testing for conformance to
spec falls into four discrete categories, each of which must have
cases:
- Conformance to the specification that QA
writes
- Conformance to written product definition
- Consistency with customer documentation
- Consistency with visions articulated by key
product architects
The specification that QA writes represents full coverage of
conformance to specification. Typically a company’s pool of
product design documentation does not encompass and reflect all
the intended product functions. If a company has a full and
complete set of product design documents this means there are
people in the company who are not busy enough. Smart companies put
more effort into the design and engineering of a product than
documenting all aspects of how it is intended to work.
While most college courses in quality assurance will stress the
importance and value of product design documentation, and most of
us would agree with this, the reality in the world of product
development is that the luxury of time to fully document design
specifications gives competitors more time to beat you to market.
So a fact of life in quality assurance is that design
documentation needed to develop comprehensive test cases is always
incomplete, and this should be expected.
Because incomplete product design documentation is a fact of life
in QA, an important task of quality assurance is to scan all
available evidence by reading available product documentation and
interviewing large numbers of people. Based on information from
documents and interviews, test cases are written to reflect
everything the product is supposed to do, covering documented
features as well as intended behavior that is not documented.
Product designs, and the design documentation, are almost
continuously changing as products evolve and improve. Changing
designs are often the subject of office political intrigue, and
subject to almost continuous disagreement. QA must ignore some of
this chaff, write a large number of comprehensive cases to reflect
all designs, including conflicting and contradictory
specifications; run the tests, write the bugs, and let the system
take care of managing the issues.
Note that in this case QA is actually writing a product
specification. Many practitioners and professors of quality
assurance say that quality assurance should never write design
specifications. But the fact is that cases must be written to
achieve full coverage, including coverage of areas that are not
documented, areas that people in product design never thought of
but should have.
Testing for conformance to product definition is a subset of
conformance to spec. The difference is that conformance to
definition uses existing internal company product
design documentation as the standard. This documentation would
include the myriad of technical design documents that development
engineers refer to in order to build the product. Some of this
documentation holds company trade secrets and is typically company
confidential.
Testing for conformance to customer documentation is also a
relevant subset of testing conformance to spec. Test cases are
built from examination of user guides, installation guides,
read-me files, promotional materials, marketing collaterals, and
all other customer consumable documents. The test objectives are
to make sure the product does what the customer-consumable
documents say it does. This work can be further sub-divided into
cases reflecting post-sale customer user documentation; and
pre-sales, promotional documents that customers typically study
when evaluating a product for purchase.
Testing against “visions”, or what some engineers call
‘marketing’s hallucinations’, is to make sure the product does, at
a high level, what critical company executives, owners, and
designers originally intended the product to do. These cases
reflect the reasons why the company was incorporated in the first
place. This information is typically obtained from in-person
interviews.
Information sources may be design documents, private placement
memos, e-mail correspondence, notes in a lab-book, or other
written and verbal statements of vision, goals, objectives, and
customer problems to solve. It is important to supplement
documentation with personal interviews with these key players:
Founders, sales and marketing executives, program managers,
designers, engineers, and developers.
An expected result of studying documentation and conducting
interviews is a certain amount of inconsistency. This is common
because different people have different ideas about how products
must be designed. The inconsistencies are reconciled by writing
test cases against all elements of vision, writing the inevitable
bug reports, and allowing the normal bug scrub process to sort out
fact from fiction.
Initialization
In my experience, observation and analysis of how a product
starts, stops, and restarts offers real insight into the behavior
and weaknesses of a product. A well-developed set of cases that
covers all aspects of startup and restart is therefore an
important part of test design and strategy. Key elements of
startup tests are
- normal startup and shutdown
- cold starts and restarts
- shutdown and restart with special user modes
such as single user or debug mode
- with any tracing options that may be available
such as using gcc
- start from power on vs start from an operating
system that is up and in a ready state
- shutdown and power off vs shut down to a state
where the host computer and operating system remains up
- startup following cold start of the operating
system
- restart the software or device under test
repeatedly, for as many cycles as time allows, and watch for
failure (a sketch of such a restart loop appears below)
- shutdown and restart following installation of
patches, fixes, upgrades, and new releases.
Some of this depends on what is being tested, such as whether it
is an application software product or a hardware system component
or a device driver or an operating system.
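The restart loop mentioned above can be as simple as the following
Python sketch. The start, stop, and health-probe commands are
hypothetical placeholders; the real control hooks depend on whether
the test object is an application, driver, device, or operating
system.

import subprocess, sys, time

START = "service productd start"   # hypothetical start command
STOP  = "service productd stop"    # hypothetical stop command
PROBE = "productctl status"        # hypothetical health check

cycle = 0
while True:                        # run until a failure is observed
    cycle += 1
    for cmd in (START, PROBE, STOP):
        result = subprocess.run(cmd, shell=True)
        if result.returncode != 0:
            print(f"cycle {cycle}: '{cmd}' failed with code {result.returncode}")
            sys.exit(1)            # stop here so the failing state can be examined
    time.sleep(2)                  # brief settle time between cycles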
Longevity
Like the startup tests, longevity tests have, in my experience,
flushed out errors, issues, and problems in computer hardware and
software, and therefore make excellent test
cases. The test cases require dedicated resources to allow the
software or device under test to run continuously, uninterrupted,
and without shutdown or restart, for as long as possible. Problems
to expect with longevity testing are buffer overflows, memory
leaks, queue overflows, memory allocation failures, storage access
and allocation failures, and security violations.
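A longevity run needs a monitor alongside it. The following Python
sketch, which assumes the third-party psutil library and a known
process ID (both illustrative), simply samples resident memory at a
fixed interval and logs it, so that slow leaks become visible over
days or weeks.

import time
import psutil                       # third-party process inspection library

PID = 4242                          # hypothetical PID of the process under test
INTERVAL_S = 600                    # sample every ten minutes

proc = psutil.Process(PID)
with open("longevity_rss.log", "a") as log:
    while proc.is_running():
        rss_mb = proc.memory_info().rss / (1024 * 1024)
        log.write(f"{time.time():.0f}\t{rss_mb:.1f}\n")
        log.flush()
        time.sleep(INTERVAL_S)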
Customer Complaints
Customer complaints and expectations are important sources of
information used to develop cases and strategy. There are customer
statements of expectation, and problems and complaints that
customers report. The help desk or technical support organization
is an excellent source for information about patterns of
complaints. It is important to talk directly to customers about
their complaints and expectations.
Quality assurance gets this information by looking at technical
support requests, talking to the help desk teams, and interviewing
customers. When interviewing customers the key questions are: what
expectations were set by the vendor when selling the product, what
are the common complaints, what do you like and dislike about the
product, is the documentation clear, do you read it, what did the
sales and marketing teams say the product does, why did you buy
it, did you get what you wanted, and did you get what you paid
for?
Forensic Analysis
Forensics is a fancy term for all the build and revision level
information associated with a product or device under test. It is
especially important when the test object is a device, and it may
have controllers, chipsets, PCBs, ICs, SOCs, embedded firmware,
operating systems, drivers, CPUs, or other silicon components and
devices that change and are subject to revision level changes.
Examples would be an Android phone, camera, RAID controller, SSD,
motherboard, file server, computer, laptop, Bluetooth radio,
wireless controller, etc.
Seasoned test engineers probably all know from experience that
software anomalies frequently surface after some seemingly
innocuous or unrelated change in an IC, capacitor, firmware build,
lens, power supply, etc. The critical tasks in this area are
simply to enumerate and record all revisions of all components and
to make sure that changes and upgrades are recorded. As with
performance information, when new anomalies surface the objective
is to be able to tell whether a bug is associated with a forensic
change.
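A minimal sketch of such a forensic inventory is shown below in
Python, for a Linux host. The commands are ordinary Linux tools and
the build label is a hypothetical placeholder; the point is that
every revision string is captured alongside the build under test so
later anomalies can be matched against changes.

import json, subprocess
from datetime import datetime

def capture(cmd):
    # Run a shell command and return its trimmed output.
    out = subprocess.run(cmd, shell=True, capture_output=True, text=True)
    return out.stdout.strip()

inventory = {
    "recorded_at": datetime.now().isoformat(timespec="seconds"),
    "build_under_test": "build-1234",                      # hypothetical label
    "kernel": capture("uname -r"),
    "bios": capture("cat /sys/class/dmi/id/bios_version"),
    "pci_devices": capture("lspci -nn"),
}

with open("forensics.json", "a") as f:
    f.write(json.dumps(inventory) + "\n")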
Bug History
“Regression testing” is another example of the educated but
inexperienced using fancy pedantic academic textbook classroom
terminology for common sense quality assurance practices.
There is almost no such thing as a product that has not been
tested before. Development engineers run unit tests and sanity
checks immediately after the first time they get something to
compile. Using historical evidence as the basis for design of test
cases and strategy falls in this category and is often called
regression testing.
A strict focus on development of cases based on prior bugs puts an
excessively narrow perspective on opportunities to develop cases,
induce failure, and write good bugs. As a first step, the
repository of cases must include cases that reflect prior failures
and known issues. Beyond a strict analysis of bug and defect
history is the opportunity to talk with customers and development
engineers about their experiences with the product as it is
developed and determine key issues and patterns of errors.
Key Considerations
Dearth of critical documentation is a fact of
life
In the real world requisite documents in the form of product
designs, specifications, requirements, and the like, either do not
exist or are conspicuously incomplete. If there is enough time at
an organization to fully document product designs, specifications,
and requirements it is likely you either work for the government
or a very rich company; or you are oblivious to some other
competitor who is rapidly eating your lunch. This means that
quality assurance gets to write specifications and product
guidelines, which product management really should be doing, even
though many consider it sinful for quality assurance to do so.
Design Inconsistencies
It is normal and to be expected that different stakeholders in a
company will have different ideas on how a product should be
designed or implemented. QA should not be in the business of
arbitrating or evaluating conflicting designs. The responsibility
of QA is to ascertain a design, specification, or feature; and
write plans and cases to test it. Conflicting and contradictory
designs mean conflicting cases and possibly contradictory bug
reports. The job for QA is to write the test cases and run the
tests. If there are failures or issues the job is to write the bug
and run it through the process of bug scrubs. This creates a forum
and process for orderly resolution of conflicting and
contradictory designs.
Write a lot of bugs
While this may seem self-evident, it is important for test
engineers and managers to develop a mind-set of writing as many
bugs as possible. Any issue, seemingly trivial or severe, should
be written up. If there is a question of whether an issue should
be written up, then write the bug and let a good quality assurance
process manage the issue.
Some managers complain about trivial bugs or place too much
emphasis on the value of finding critical bugs. If a team is not
writing lots of bugs then something is wrong, even if the product
under test is mature and stable. Trivial bugs sometimes mask more
severe underlying issues. Bug trackers allow assigning severity
and priority to issues, which lets trivial issues take care of
themselves without annoying managers, who should be focused on
more substantive issues. The golden ratio of bugs in QA is 4:2:1:
for every four trivial issues there are two moderate issues and
one that is critical.
If you work the system, the system works for you
The processes and strategies described in this paper can and will
work effectively if a reasonable QA process is followed. This
simply means writing, documenting, and substantiating good bug
reports; and an orderly process for periodic bug scrubs. Bug
tracker software must allow for assignment of priority and
severity and for communicating bug details and summary reports to
key players, and there must be regular participation by key
players in scrubbing bugs, assigning work, and arbitrating design
differences.
Summary and Conclusions
This paper explains a strategy for developing test cases and
testing a technology product. Key elements of this test strategy
are to make it work, make it fail, start and restart it, and run
it perpetually; to measure performance; and to compare actual and
intended behaviors against four different standards.
Customer complaints, problem history, and forensics are useful
information resources for developing test strategy and isolating
errors. Stress testing is conspicuously absent from this strategy
because it has limited value. Design documentation in a normal
company is typically incomplete and quality assurance is
responsible for filling the information gaps that are needed to
develop strategy and cases.
The strategies described in this paper are effective if normal and
customary QA processes are followed: using a bug tracker to manage
issues, logging bugs, reporting and communication, participation
by all stake holders, thoughtful assignment of priority and
severity to issues, and attending bug scrubs to make decisions and
arbitrate potentially divergent designs.