CS 530 - Advanced Software Engineering class notes

Reference: Sommerville, Engineering Software Products, Chapter 9

Software testing

Software testing is a process in which you execute your program using data that simulates user inputs. You observe its behavior to see whether or not your program is doing what it is supposed to do. Tests pass if the behavior is what you expect. Tests fail if the behavior differs from that expected. If your program does what you expect, this shows that for the inputs used, the program behaves correctly. If these inputs are representative of a larger set of inputs, you can infer that the program will behave correctly for all members of this larger input set.

If the behavior of the program does not match the behavior that you expect, then this means that there are bugs in your program that need to be fixed. There are two causes of program bugs:

Programming errors: You have accidentally included faults in your program code. For example, a common programming error is an 'off-by-1' error where you make a mistake with the upper bound of a sequence and fail to process the last element in that sequence.
Understanding errors: You have misunderstood or have been unaware of some of the details of what the program is supposed to do. For example, if your program processes data from a file, you may not be aware that some of this data is in the wrong format, so your program doesn't include code to handle this.
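
As a small illustration of the off-by-1 programming error described above, the following Python sketch sums a list with a loop whose upper bound is one too small. The function names are hypothetical and exist only for this example.

```python
def total_buggy(values):
    """Sum a list, but stop one element early (off-by-one fault)."""
    total = 0
    for i in range(len(values) - 1):  # BUG: the last index is never visited
        total += values[i]
    return total


def total_fixed(values):
    """Sum every element of the list."""
    total = 0
    for i in range(len(values)):
        total += values[i]
    return total


print(total_buggy([1, 2, 3]))  # 3: the final element is skipped
print(total_fixed([1, 2, 3]))  # 6
```

A test that compares the result with the expected total of 6 fails for the buggy version, which is how this kind of fault is normally exposed.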

There are different types of testing:

Functional testing: Tests the functionality of the overall system. The goals of functional testing are to discover as many bugs as possible in the implementation of the system and to provide convincing evidence that the system is fit for its intended purpose.
User testing: Tests that the software product is useful to and usable by end-users. You need to show that the features of the system help users do what they want to do with the software. You should also show that users understand how to access the software's features and can use these features effectively.
Performance and load testing: Tests that the software works quickly and can handle the expected load placed on the system by its users. You need to show that the response and processing time of your system is acceptable to end-users. You also need to demonstrate that your system can handle different loads and scales gracefully as the load on the software increases.
Security testing: Tests that the software maintains its integrity and can protect user information from theft and damage.

The remainder of these notes focuses on functional testing.

Functional testing

Functional testing involves developing a large set of program tests so that, ideally, all of a program's code is executed at least once. The number of tests needed depends on the size and the functionality of the application. For a business-focused web application, you may have to develop thousands of tests to convince yourself that your product is ready for release to customers. Functional testing is a staged activity in which you initially test individual units of code. You then integrate code units with other units to create larger units and do more testing. The process continues until you have created a complete system ready for release.

Functional testing processes

Unit testing: The aim of unit testing is to test program units in isolation. Tests should be designed to execute all of the code in a unit at least once. Individual code units are tested by the programmer as they are developed.
Feature testing: Code units are integrated to create features. Feature tests should test all aspects of a feature. All of the programmers who contribute code units to a feature should be involved in its testing.
System testing: Code units are integrated to create a working (perhaps incomplete) version of a system. The aim of system testing is to check that there are no unexpected interactions between the features in the system. System testing may also involve checking the responsiveness, reliability, and security of the system. In large companies, a dedicated testing team may be responsible for system testing. In small companies, this is impractical, so product developers are also involved in system testing.
Release testing: The system is packaged for release to customers and the release is tested to check that it operates as expected. The software may be released as a cloud service or as a download to be installed on a customer's computer or mobile device. If DevOps is used, then the development team is responsible for release testing; otherwise, a separate team has that responsibility.

As you develop a code unit, you should also develop unit tests for that code. A code unit is anything that has a clearly defined responsibility. It is usually a function or class method but could be a module that includes a small number of other functions. Unit testing is based on a simple general principle: If a program unit behaves as expected for a set of inputs that have some shared characteristics, it will behave in the same way for a larger set whose members share these characteristics. To test a program efficiently, you should identify sets of inputs (equivalence partitions) that will be treated in the same way in your code. The equivalence partitions that you identify should not just include those containing inputs that produce the correct values. You should also identify 'incorrectness partitions' where the inputs are deliberately incorrect.
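
One way to picture equivalence partitions is the sketch below. It assumes a hypothetical unit, parse_percentage, that accepts whole numbers from 0 to 100 given as text. The partition names and the chosen inputs are illustrative assumptions, not taken from the reference text.

```python
def parse_percentage(text):
    """Return an int in 0..100 parsed from text, or raise ValueError."""
    value = int(text.strip())  # int() raises ValueError for non-numeric text
    if not 0 <= value <= 100:
        raise ValueError(f"out of range: {value}")
    return value


# Representative inputs for each partition, including the boundaries.
CORRECT_PARTITIONS = {
    "lower boundary": ["0"],
    "mid range": ["1", "50", "99"],
    "upper boundary": ["100"],
}
# 'Incorrectness partitions': inputs that are deliberately wrong.
INCORRECT_PARTITIONS = {
    "below range": ["-1", "-500"],
    "above range": ["101", "9999"],
    "not a number": ["abc", "", "12.5"],
}


def run_partition_tests():
    """Return the number of checks that passed; raise AssertionError on failure."""
    passed = 0
    for inputs in CORRECT_PARTITIONS.values():
        for text in inputs:
            assert parse_percentage(text) == int(text)
            passed += 1
    for inputs in INCORRECT_PARTITIONS.values():
        for text in inputs:
            try:
                parse_percentage(text)
            except ValueError:
                passed += 1  # rejection is the expected behavior here
            else:
                raise AssertionError(f"{text!r} should have been rejected")
    return passed


print(run_partition_tests())  # 12
```

Each partition is tested with a few members only, on the assumption that the unit treats all other members of that partition in the same way.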

Feature testing focuses on showing that the feature functionality is implemented as expected and that the functionality meets the real needs of users. For example, if your product has a feature that allows users to log in using their Google account, then you have to check that this registers the user correctly and informs them of what information will be shared with Google. You may want to check that it gives users the option to sign up for email information about your product. Normally, a feature that does several things is implemented by multiple, interacting program units. These units may be implemented by different developers and all of these developers should be involved in the feature testing process.

Types of feature tests

Interaction tests: These test the interactions between the units that implement the feature. The developers of the units that are combined to make up the feature may have different understandings of what is required of that feature. These misunderstandings will not show up in unit tests but may only come to light when the units are integrated. The integration may also reveal bugs in program units, which were not exposed by unit testing.
Usefulness tests: These test that the feature implements what users are likely to want. For example, the developers of a 'login with Google' feature may have implemented an opt-out default on registration, so that users receive all emails from a company and must expressly choose which types of email they don't want. An opt-in default, where users choose which types of email they do want to receive, might be preferred.
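
An interaction test might look like the following sketch. The two units and the Celsius convention are hypothetical and were chosen only to show how a shared assumption between units can be checked.

```python
def read_temperature_c(raw):
    """Unit A (hypothetical): parse a sensor reading, returns degrees Celsius."""
    return float(raw)


def frost_warning(temp_c):
    """Unit B (hypothetical): warn at or below freezing, expects Celsius."""
    return temp_c <= 0.0


def test_reading_flows_into_warning():
    # Interaction test: the output of unit A is fed directly into unit B.
    assert frost_warning(read_temperature_c("-2.5")) is True
    assert frost_warning(read_temperature_c("0")) is True
    assert frost_warning(read_temperature_c("3.1")) is False


test_reading_flows_into_warning()
```

If the developer of unit A had understood the requirement as Fahrenheit, the unit tests for each unit could still pass on their own. Only a test that runs the two units together would show that a reading of 32 no longer triggers the warning.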

System testing involves testing the system as a whole, rather than the individual system features. System testing should focus on four things:

Testing to discover if there are unexpected and unwanted interactions between the features in a system.
Testing to discover if the system features work together effectively to support what users really want to do with the system.
Testing the system to make sure it operates in the expected way in the different environments where it will be used.
Testing the responsiveness, throughput, security and other quality attributes of the system.

The best way to systematically test a system is to start with a set of scenarios that describe possible uses of the system and then work through these scenarios each time a new version of the system is created. Using the scenario, you identify a set of end-to-end pathways that users might follow when using the system. An end-to-end pathway is a sequence of actions from starting to use the system for the task, through to completion of the task.

Release testing is a type of system testing where a system that's intended for release to customers is tested. The fundamental differences between release testing and system testing are:

Release testing tests the system in its real operational environment rather than in a test environment. Problems commonly arise with real user data, which is sometimes more complex and less reliable than test data.
The aim of release testing is to decide if the system is good enough to release, not to detect bugs in the system. Therefore, some tests that 'fail' may be ignored if these have minimal consequences for most users.

Preparing a system for release involves packaging that system for deployment (e.g. in a container if it is a cloud service) and installing software and libraries that are used by your product. You must define configuration parameters such as the name of a root directory, the database size limit per user and so on.

Test automation

Automated testing is based on the idea that tests should be executable. An executable test includes the input data to the unit that is being tested, the expected result and a check that the unit returns the expected result. You run the test and the test passes if the unit returns the expected result. Normally, you should develop hundreds or thousands of executable tests for a software product. It is good practice to structure automated tests into three parts:

Arrange: Set up the system to run the test. This involves defining the test parameters and, if necessary, mock objects that emulate the functionality of code that has not yet been developed.
Action: Call the unit that is being tested with the test parameters.
Assert: Make an assertion about what should hold if the unit being tested has executed successfully. Program 9.2 in the reference chapter uses AssertEquals, which checks if its parameters are equal.
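
The three parts can be sketched with Python's unittest module, where the equality check is spelled assertEqual. The unit under test, register_user, and the mailer object are hypothetical; the mailer is replaced by a mock, as suggested under Arrange.

```python
import unittest
from unittest.mock import Mock


def register_user(email, mailer):
    """Hypothetical unit under test: normalise the address and send a welcome mail."""
    address = email.strip().lower()
    mailer.send(address, "Welcome")
    return address


class RegisterUserTest(unittest.TestCase):
    def test_address_is_normalised_and_mail_sent(self):
        # Arrange: test parameters plus a mock standing in for the mail service
        mailer = Mock()
        raw_email = "  Alice@Example.COM "

        # Action: call the unit being tested
        result = register_user(raw_email, mailer)

        # Assert: check the result and the interaction with the mock
        self.assertEqual(result, "alice@example.com")
        mailer.send.assert_called_once_with("alice@example.com", "Welcome")
```

Saved in a file, a test like this can be run with python -m unittest, which discovers the test class and reports whether the assertions held.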

If you use equivalence partitions to identify test inputs, you should have several automated tests based on correct and incorrect inputs from each partition.

Generally, users access features through the product's graphical user interface (GUI). However, GUI-based testing is expensive to automate so it is best to design your product so that its features can be directly accessed through an API and not just from the user interface. Automated feature tests can then access features directly through the API without the need for direct user interaction through the system's GUI. Accessing features through an API has the additional benefit that it is possible to re-implement the GUI without changing the functional components of the software.
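
The sketch below shows what testing a feature through its API, rather than through the GUI, could look like. TodoService is a hypothetical feature API invented for this example; a GUI would be a thin layer that calls the same methods.

```python
class TodoService:
    """Hypothetical feature API; a GUI would call these methods."""

    def __init__(self):
        self._items = []

    def add(self, title):
        if not title.strip():
            raise ValueError("title must not be empty")
        self._items.append({"title": title.strip(), "done": False})
        return len(self._items) - 1  # the list index doubles as the item id

    def complete(self, item_id):
        self._items[item_id]["done"] = True

    def pending(self):
        return [item["title"] for item in self._items if not item["done"]]


def test_completed_items_leave_the_pending_list():
    # The feature is exercised directly, with no GUI interaction involved.
    service = TodoService()
    first = service.add("write tests")
    service.add("review code")
    service.complete(first)
    assert service.pending() == ["review code"]


test_completed_items_leave_the_pending_list()
```

Because the test never touches the user interface, it would keep working if the GUI were re-implemented.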

System testing, which should follow feature testing, involves testing the system as a surrogate user. As a system tester, you go through a process of selecting items from menus, making screen selections, inputting information from the keyboard and so on. You are looking for interactions between features that cause problems, sequences of actions that lead to system crashes and so on. Manual system testing, when testers have to repeat sequences of actions, is boring and error-prone. In some cases, the timing of actions is important and is practically impossible to repeat consistently. To avoid these problems, testing tools have been developed that can record a series of actions and automatically replay these when a system is retested.

Test-driven development

Test-driven development (TDD) is an approach to program development that is based around the general idea that you should write an executable test or tests for code that you are writing before you write the code. It was introduced by early users of the Extreme Programming agile method, but it can be used with any incremental development approach. Test-driven development works best for the development of individual program units and it is more difficult to apply to system testing. Even the strongest advocates of TDD accept that it is challenging to use this approach when you are developing and testing systems with graphical user interfaces.
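
A minimal sketch of the TDD cycle for a single program unit follows. The leap_year function is a hypothetical example; the point is the order of the steps, with the test written before the code it checks.

```python
# Step 1: write the test first. If the test were run at this point,
# it would fail because leap_year does not exist yet.
def test_leap_year():
    assert leap_year(2024) is True
    assert leap_year(2023) is False
    assert leap_year(1900) is False  # century years are not leap years...
    assert leap_year(2000) is True   # ...unless they are divisible by 400


# Step 2: write just enough code to make the test pass.
def leap_year(year):
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


# Step 3: rerun the test, and refactor only while it keeps passing.
test_leap_year()
```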

TDD is a systematic approach to testing in which tests are clearly linked to sections of the program code. This means you can be confident that your tests cover all of the code that has been developed and that there are no untested code sections in the delivered code. In my view, this is the most significant benefit of TDD. The tests act as a written specification for the program code. In principle at least, it should be possible to understand what the program does by reading the tests. Debugging is simplified because, when a program failure is observed, you can immediately link this to the last increment of code that you added to the system. It is argued that TDD leads to simpler code as programmers only write code that's necessary to pass tests. They don't over-engineer their code with complex features that aren't needed.

Security testing

Security testing aims to find vulnerabilities that may be exploited by an attacker and to provide convincing evidence that the system is sufficiently secure. The tests should demonstrate that the system can resist attacks on its availability, attacks that try to inject malware and attacks that try to corrupt or steal users' data and identity. Comprehensive security testing requires specialist knowledge of software vulnerabilities and approaches to testing that can find these vulnerabilities.

A risk-based approach to security testing involves identifying common risks and developing tests to demonstrate that the system protects itself from these risks. You may also use automated tools that scan your system to check for known vulnerabilities, such as unused HTTP ports being left open. Based on the risks that have been identified, you then design tests and checks to see if the system is vulnerable. It may be possible to construct automated tests for some of these checks, but others inevitably involve manual checking of the system's behavior and its files.
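
As one example of an automated check derived from an identified risk, the sketch below tests a lookup function against an SQL injection attempt. SQL injection is used here as an assumed risk for illustration; the find_user function and the table layout are hypothetical. The test uses an in-memory SQLite database from Python's standard library.

```python
import sqlite3


def find_user(conn, username):
    """Hypothetical lookup that uses a parameterised query."""
    cursor = conn.execute("SELECT name FROM users WHERE name = ?", (username,))
    return [row[0] for row in cursor.fetchall()]


def make_test_db():
    """Create an in-memory database with two known users."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT)")
    conn.executemany("INSERT INTO users VALUES (?)", [("alice",), ("bob",)])
    return conn


def test_injection_attempt_returns_no_rows():
    conn = make_test_db()
    # A classic injection string: if it were pasted into the SQL text,
    # the WHERE clause would match every row.
    assert find_user(conn, "' OR '1'='1") == []
    # A legitimate lookup still works.
    assert find_user(conn, "alice") == ["alice"]


test_injection_attempt_returns_no_rows()
```

A check like this covers only one known risk. It does not show that the system is secure against other kinds of attack.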

Code reviews

Code reviews involve one or more people examining the code to check for errors and anomalies and discussing issues with the developer. If problems are identified, it is the developer's responsibility to change the code to fix the problems. Code reviews complement testing. They are effective in finding bugs that arise through misunderstandings and bugs that may only arise when unusual sequences of code are executed. Many software companies insist that all code has to go through a process of code review before it is integrated into the product codebase.
