Software Testing


Even with the best validation, it’s very hard to achieve perfect quality in software. Here are some typical residual defect rates (bugs left over after the software has shipped) per kloc (one thousand lines of source code):

  • 1 – 10 defects/kloc: Typical industry software.
  • 0.1 – 1 defects/kloc: High-quality validation. The Java libraries might achieve this level of correctness.
  • 0.01 – 0.1 defects/kloc: The very best, safety-critical validation. NASA and companies like Praxis can achieve this level.

This can be discouraging for large systems. For example, if you have shipped a million lines of typical industry source code (1 defect/kloc), it means you missed 1000 bugs!

Here are some approaches that unfortunately don’t work well in the world of software.

Exhaustive testing is infeasible. The space of possible test cases is generally too big to cover exhaustively. Imagine exhaustively testing a 32-bit floating-point multiply operation, a*b. There are 2^64 test cases!

Haphazard testing (“just try it and see if it works”) is less likely to find bugs, unless the program is so buggy that an arbitrarily-chosen input is more likely to fail than to succeed. It also doesn’t increase our confidence in program correctness.

Random or statistical testing doesn’t work well for software. Other engineering disciplines can test small random samples (e.g. 1% of hard drives manufactured) and infer the defect rate for the whole production lot. Physical systems can use many tricks to speed up time, like opening a refrigerator 1000 times in 24 hours instead of 10 years. These tricks give known failure rates (e.g. mean lifetime of a hard drive), but they assume continuity or uniformity across the space of defects. This is true for physical artifacts.

But it’s not true for software. Software behavior varies discontinuously and discretely across the space of possible inputs. The system may seem to work fine across a broad range of inputs, and then abruptly fail at a single boundary point. The famous Pentium division bug affected approximately 1 in 9 billion divisions. Stack overflows, out of memory errors, and numeric overflow bugs tend to happen abruptly, and always in the same way, not with probabilistic variation. That’s different from physical systems, where there is often visible evidence that the system is approaching a failure point (cracks in a bridge) or failures are distributed probabilistically near the failure point (so that statistical testing will observe some failures even before the point is reached).

Test-first Programming

Test early and often, test it as if you want to make it fail (Test Driven), don’t treat the code as something precious. Don’t leave testing until the end, when you have a big pile of unvalidated code. Leaving testing until the end only makes debugging longer and more painful, because bugs may be anywhere in your code. It’s far more pleasant to test your code as you develop it.

In test-first-programming, you write tests before you even write any code. The development of a single function proceeds in this order:

  1. Write a specification for the function.
  2. Write tests that exercise the specification.
  3. Write the actual code. Once your code passes the tests you wrote, you’re done.

The specification describes the input and output behavior of the function. It gives the types of the parameters and any additional constraints on them (e.g. sqrt’s parameter must be nonnegative). It also gives the type of the return value and how the return value relates to the inputs. You’ve already seen and used specifications on your problem sets in this class. In code, the specification consists of the method signature and the comment above it that describes what it does. We’ll have much more to say about specifications a few classes from now.

Writing tests first is a good way to understand the specification. The specification can be buggy, too — incorrect, incomplete, ambiguous, missing corner cases. Trying to write tests can uncover these problems early, before you’ve wasted time writing an implementation of a buggy spec.

Choosing Test Cases by Partitioning

Creating a good test suite is a challenging and interesting design problem. We want to pick a set of test cases that is small enough to run quickly, yet large enough to validate the program.

partitioning a function's input space

To do this, we divide the input space into subdomains, each consisting of a set of inputs. Taken together the subdomains completely cover the input space, so that every input lies in at least one subdomain. Then we choose one test case from each subdomain, and that’s our test suite.

The idea behind subdomains is to partition the input space into sets of similar inputs on which the program has similar behaviour. Then we use one representative of each set. This approach makes the best use of limited testing resources by choosing dissimilar test cases, and forcing the testing to explore parts of the input space that random testing might not reach.

We can also partition the output space into subdomains (similar outputs on which the program has similar behaviour) if we need to ensure our tests will explore different parts of the output space. Most of the time, partitioning the input space is sufficient.

Example: BigInteger.multiply()

Let’s look at an example. BigInteger is a class built into the Java library that can represent integers of any size, unlike the primitive types int and long that have only limited ranges. BigInteger has a method multiply that multiplies two BigInteger values together (this is an instance method, hence the use of this):

 * @param val another BigInteger
 * @return a BigInteger whose value is (this * val).
public BigInteger multiply(BigInteger val)

For example, here’s how it might be used:

BigInteger a = ...;
BigInteger b = ...;
BigInteger ab = a.multiply(b);

This example shows that even though only one parameter is explicitly shown in the method’s declaration, multiply is actually a function of two arguments: the object you’re calling the method on (a in the example above), and the parameter that you’re passing in the parentheses (b in this example). In Python, the object receiving the method call would be explicitly named as a parameter called self in the method declaration. In Java, you don’t mention the receiving object in the parameters, and it’s called this instead of self.

So we should think of multiply as a function taking two inputs, each of type BigInteger, and producing one output of type BigInteger:

multiply : BigInteger × BigInteger → BigInteger

So we have a two-dimensional input space, consisting of all the pairs of integers (a,b). Now let’s partition it. Thinking about how multiplication works, we might start with these partitions:

  • a and b are both positive
  • a and b are both negative
  • a is positive, b is negative
  • a is negative, b is positive

There are also some special cases for multiplication that we should check: 0, 1, and -1.

  • a or b is 0, 1, or -1

Finally, as a suspicious tester trying to find bugs, we might suspect that the implementor of BigInteger might try to make it faster by using int or long internally when possible, and only fall back to an expensive general representation (like a list of digits) when the value is too big. So we should definitely also try integers that are very big, bigger than the biggest long.

  • a or b is small
  • the absolute value of a or b is bigger than Long.MAX_VALUE, the biggest possible primitive integer in Java, which is roughly 2^63.

Let’s bring all these observations together into a straightforward partition of the whole (a,b) space. We’ll choose a and b independently from:

partitioning multiply()
  • 0
  • 1
  • -1
  • small positive integer
  • small negative integer
  • huge positive integer
  • huge negative integer

So this will produce 7 × 7 = 49 partitions that completely cover the space of pairs of integers.

To produce the test suite, we would pick an arbitrary pair (a,b) from each square of the grid, for example:

  • (a,b) = (-3, 25) to cover (small negative, small positive)
  • (a,b) = (0, 30) to cover (0, small positive)
  • (a,b) = (2^100, 1) to cover (large positive, 1)
  • etc.

The figure at the right shows how the two-dimensional (a,b) space is divided by this partition, and the points are test cases that we might choose to completely cover the partition.

Example: max()

Let’s look at another example from the Java library: the integer max() function, found in the Math class.

 * @param a  an argument
 * @param b  another argument
 * @return the larger of a and b.
public static int max(int a, int b)

Mathematically, this method is a function of the following type:

max : int × int → int


From the specification, it makes sense to partition this function as:

  • a < b
  • a = b
  • a > b

Our test suite might then be:

  • (a, b) = (1, 2) to cover a < b
  • (a, b) = (9, 9) to cover a = b
  • (a, b) = (-5, -6) to cover a > b

Include Boundaries in the Partition

Bugs often occur at boundaries between subdomains. Some examples:

  • 0 is a boundary between positive numbers and negative numbers
  • the maximum and minimum values of numeric types, like int and double
  • emptiness (the empty string, empty list, empty array) for collection types
  • the first and last element of a collection

Why do bugs often happen at boundaries? One reason is that programmers often make off-by-one mistakes (like writing <= instead of <, or initializing a counter to 0 instead of 1). Another is that some boundaries may need to be handled as special cases in the code. Another is that boundaries may be places of discontinuity in the code’s behavior. When an int variable grows beyond its maximum positive value, for example, it abruptly becomes a negative number (discontinuous behaviour).

It’s important to include boundaries as subdomains in your partition, so that you’re choosing an input from the boundary.

Let’s redo max : int × int → int.

Partition into:

  • relationship between a and b
    • a < b
    • a = b
    • a > b
  • value of a
    • a = 0
    • a < 0
    • a > 0
    • a = minimum integer
    • a = maximum integer
  • value of b
    • b = 0
    • b < 0
    • b > 0
    • b = minimum integer
    • b = maximum integer

Now let’s pick test values that cover all these classes:

  • (1, 2) covers a < b, a > 0, b > 0
  • (-1, -3) covers a > b, a < 0, b < 0
  • (0, 0) covers a = b, a = 0, b = 0
  • (Integer.MIN_VALUE, Integer.MAX_VALUE) covers a < b, a = minint, b = maxint
  • (Integer.MAX_VALUE, Integer.MIN_VALUE) covers a > b, a = maxint, b = minint

Two Extremes for Covering the Partition

After partitioning the input space, we can choose how exhaustive we want the test suite to be:

  • Full Cartesian product.
    Every legal combination of the partition dimensions is covered by one test case. This is what we did for the multiply example, and it gave us 7 × 7 = 49 test cases. For the max example that included boundaries, which has three dimensions with 3 parts, 5 parts, and 5 parts respectively, it would mean up to 3 × 5 × 5 = 75 test cases. In practice not all of these combinations are possible, however. For example, there’s no way to cover the combination a < b, a=0, b=0, because a can’t be simultaneously less than zero and equal to zero.
  • Cover each part.
    Every part of each dimension is covered by at least one test case, but not necessarily every combination. With this approach, the test suite for max might be as small as 5 test cases if carefully chosen. That’s the approach we took above, which allowed us to choose 5 test cases.

Black box and White box Testing

Recall from above that the specification is the description of the function’s behavior — the types of parameters, type of return value, and constraints and relationships between them.

Black box testing means choosing test cases only from the specification, not the implementation of the function. That’s what we’ve been doing in our examples so far. We partitioned and looked for boundaries in multiply and max without looking at the actual code for these functions.

White box testing (also called glass box testing) means choosing test cases with knowledge of how the function is actually implemented. For example, if the implementation selects different algorithms depending on the input, then you should partition according to those domains. If the implementation keeps an internal cache that remembers the answers to previous inputs, then you should test repeated inputs.  Trigger each path / algorithm in the function.

When doing white box testing, you must take care that your test cases don’t require specific implementation behaviour that isn’t specifically called for by the spec. For example, if the spec says “throws an exception if the input is poorly formatted,” then your test shouldn’t check specifically for a NullPointerException just because that’s what the current implementation does. The specification in this case allows any exception to be thrown, so your test case should likewise be general to preserve the implementor’s freedom. Tests should always respect the specification.


One way to judge a test suite is to ask how thoroughly it exercises the program. This notion is called coverage. Here are three common kinds of coverage:

  • Statement coverage: is every statement run by some test case?
  • Branch coverage: for every if or while statement in the program, are both the true and the false direction taken by some test case?
  • Path coverage: is every possible combination of branches — every path through the program — taken by some test case?

Branch coverage is stronger (requires more tests to achieve) than statement coverage, and path coverage is stronger than branch coverage. In industry, 100% statement coverage is a common goal, but even that is rarely achieved due to unreachable defensive code (like “should never get here” assertions). 100% branch coverage is highly desirable, and safety critical industry code has even more arduous criteria (e.g., “MCDC,” modified decision/condition coverage). Unfortunately 100% path coverage is infeasible, requiring exponential-size test suites to achieve.

A standard approach to testing is to add tests until the test suite achieves adequate statement coverage: i.e., so that every reachable statement in the program is executed by at least one test case. In practice, statement coverage is usually measured by a code coverage tool, which counts the number of times each statement is run by your test suite. With such a tool, white box testing is easy; you just measure the coverage of your black box tests, and add more test cases until all important statements are logged as executed.

EclEmma code coverage tool for Eclipse

A good code coverage tool for Eclipse is EclEmma, shown on the right.

Lines that have been executed by the test suite are coloured green, and lines not yet covered are red. If you saw this result from your coverage tool, your next step would be to come up with a test case that causes the body of the while loop to execute, and add it to your test suite so that the red lines become green.


Collections vs Fixed Arrays

Instead of a fixed-length array, let’s use the List type. Lists are variable-length sequences of another type T. Here’s how we can declare a List variable and make a list value:

List<Integer> list = new ArrayList<Integer>();

And here are some of its operations:

  • indexing: list.get(2)
  • assignment: list.set(2, 0)
  • length: list.size()

Note that List is an interface (similar to an abstract data type), a type that can’t be constructed directly with new, but that instead specifies the operations that a List must provide. We’ll talk about this notion in a future class on abstract data types. ArrayList is a class, a concrete type that provides implementations of those operations. ArrayList isn’t the only implementation of the List type, though it’s the most commonly used one. LinkedList is another. Check them out in the Java API documentation, which you can find by searching the web for “Java 8 API”. Get to know the Java API docs, they’re your friend. (“API” means “application programmer interface,” and is commonly used as a synonym for “library.”)

Note also that we wrote List<Integer> instead of List<int>. Unfortunately we can’t write List<int> in direct analog to int[]. Lists only know how to deal with object types, not primitive types. In Java, each of the primitive types (which are written in lowercase and often abbreviated, like int) has an equivalent object type (which is capitalized, and fully spelled out, like Integer). Java requires us to use these object type equivalents when we parameterize a type with angle brackets. But in other contexts, Java automatically converts between int and Integer, so we can write Integer i = 5 without any type error.

Here’s the hailstone code written with Lists:

List<Integer> list = new ArrayList<Integer>();
int n = 3;
while (n != 1) {
    if (n % 2 == 0) {
        n = n / 2;
    } else {
        n = 3 * n + 1;

Not only simpler but safer too, because the List automatically enlarges itself to fit as many numbers as you add to it (until you run out of memory, of course).


A for loop steps through the elements of an array or a list, just as in Python, though the syntax looks a little different. For example:

// find the maximum point of a hailstone sequence stored in list
int max = 0;
for (int x : list) {
    max = Math.max(x, max);

You can iterate through arrays as well as lists. The same code would work if the list were replaced by an array.

Math.max() is a handy function from the Java API. The Math class is full of useful functions like this – search for “java 8 Math” on the web to find its documentation.



sudo apt-get install subversion libapache2-svn
sudo vi /etc/apache2/mods-available/dav_svn_conf

Fibonacci Algorithm

Naive Fibonacci Algorithm

 public static long fib(int N) {
     if (N == 0) return 0;
     if (N == 1) return 1;
     return F(N-1) + F(N-2);

Dynamic Programming – Recursion + Memoization

Dynamic Programming – careful brute force, break it into sub problems, solve the sub problems and reuse the solution.  Richard Belman invented Dynamic Programming as a means to explain mathematical research he was conducting.

Naive recursive algorithm as above, exponential time, running time as recurrance:

T(n) = T(n-1) + T(n-2) + O(1)
T(n) >= 2T(n-2) 
  which is of the order of 2 to the power of n/2

Memoized DP algorithm:  if we need to compute fib(n), if n in memo, return memo[n], else compute fib(n) and store it.  memo is initially an empty dictionary.  Then fib(k) only recurses the first time it’s called for all k.  Memoized calls cost constant time.  Number of non-memoized calls is n, non recursive work per call is constant.  Therefore running time is linear  The best algorithm for calculating Fibonacci numbers is time proportional to log(n).

# best time achieved by resolving subproblems in advance
fib = {}
for k in range(1, n+1):
   [ if k<= 2: f=1
   [ else: f=fib[k-1] + fib[k-2]
return fib[n]

import java.util.HashMap;

public class Fibonacci {
  static HashMap<Integer, Long> fibMap = 
                             new HashMap<Integer, Long>();
  // for n >= 1
  public static long fib(int n) {
    if (n <= 2) return 1; // quicker than lookup
    // look up in map, returns null if key doesn't exist
    if (fibMap.containsKey(n)) return fibMap.get(n);
    long f = fib(n-1) + fib(n-2);
    // store new fib in map
    fibMap.put(n, f);
    return f;

  public static void main (String[] args) {
    for (int i = 0; i < 92; i ++) {
      System.out.println(i + " : " + fib(i));

Algorithms, Recursion and Data-Structures

Princeton Algorithm Courses

One of the problems I had with setting up the jar file algs4.jar was that the CLASSPATH variable needs to include the directory where your Java programs .class file is. You can include ‘.’ in CLASSPATH to indicate that the current directory should be included.  Then I followed the instructions on the book site. and modified my .bash_profile as follows.

# Add algs.jar to classpath
export CLASSPATH=$CLASSPATH:~/algs4/algs4.jar

Interesting Tricks

 // int N to binary representation
 String s = "";
 for (int n = N; n > 0; n /= 2)
     s = (n % 2) + s;
 // log base 2
 public static int lg(int N) {
     if (N/2 == 1) return 1;
     return 1 + lg(N/2);

Recursive Algorithms

A method can call itself.  There are three important rules of thumb in developing recursive programs:

  1. The recursion has a base case – always includes a conditional statements as the first statement in the program that has a return.
  2. Recursive calls must address subprograms that are smaller in some sense, so that recursive calls converge to the base case.
  3. Recursive calls should not address subprograms that overlap.

Greatest Common Divisor

Euclid’s Algorithm – compute the greatest common divisor of two nonnegative integers p and q as follows: If q is 0, the answer is p.  If not, divide p by q and take the remainder r.  The answer is the greatest common divisor of q and r.

public static int gcd(int p, int q) {
    if (q-==0) return p;
    int r = p % q;
    return (gcd(q, r);

Binary Search

// recursive implementation of binary search
 public static int rank(int key, int[] a0) {
 return rank(key, a, 0, a.length -1);

public static int rank(int key, int[] a, int lo, int hi) {
 // index of key in a[], if present is not smaller than lo,
 // and not larger than hi
 if (lo > hi) return -1;
 int mid = lo + (hi - lo) / 2;
 if (key < a[mid]) return rank (key, a, lo, mid -1);
 else if (key > a[mid]) return rank (key, a, mid + 1, hi);
 else return mid;

Simple Mathematical Functions

 // log base 2
 public static int lg(int N) {
   if (N/2 == 1) return 1;
   return 1 + lg(N/2);

 // natural logarithm of n factorial
 // ln(xy) = ln (x) + ln (y)
 public static int lnFact(int n) {
   if (n == 1) return 0;
   return Math.log(n) + lnFact(n-1);

 //  n factorial
 public static long nFact(int n) {
   if (n == 1) return 1;
   return n * nFact(n-1);

Other Links


Steps in drawing out the recursive call.

How to write a method that calls itself. – Dave Feinberg’s video series.

  1. Write “if”.
    There must be at least 2 cases:, a recursive case (where the method calls itself) and a base case (where the method does not).
  2. Handle the simplest case(s). “Base Case”
    Simplest = no recursive call needed (no further loop)
  3. Write the recursive call(s).
    On the next simpler input/state, it may help to store the recursive call
  4. Assume the recursive call works .
    Ask yourself: What does it do?
    Ask yourself: How does it help? 



Base Case

If n! is defined as the product of all positive integers from 1 to n, then:
1! = 1*1 = 1
2! = 1*2 = 2
3! = 1*2*3 = 6
4! = 1*2*3*4 = 24

n! = 1*2*3*…*(n-2)*(n-1)*n
and so on.
Logically, n! can also be expressed n*(n-1)! .

Therefore, at n=1, using n! = n*(n-1)!
1! = 1*0!
which simplifies to 1 = 0!

Java Recursive Factorial Method

// precondition: n >= 0
public static in fact(int n) {
    if (n=0) return 1;  
    else return fact(n-1) * n; 

Good Coding Practice

3 key principles of good code include being reusable, maintainable and easy to change.  There are numerous ways to uphold and ensure these principles.

Steps to Writing Error-free Code

  • Understand the difference between syntax errors and logic errors.  Syntax involves the ‘spelling and grammar’ of programming, these are generally picked up by compilers.  Check your variable names, brackets and punctuation.  Autocomplete in modern IDEs helps reduce syntax errors.
  • Syntax errors that are more like semantic errors are harder to spot and will not be picked up by a compiler, for example when you type the wrong variable name by mistake, you could then end up with the wrong answer.   It is helpful to use meaningful names for your variables and functions.
  • Logic errors often lead to unexpected results, and are seldom caught by a compiler.  They can come about when statements are executed in the wrong order, incorrect boolean expression and decision points, wrong data types, etc.  To eliminate logic errors, it helps to write out pseudocode before you start coding to help get the sequence right.  Check that against the specification, are you doing what you’re supposed to be doing?
  • Document code clearly, it helps keep code easy to read and understand, especially when it comes to debugging.  It is easier to spot bugs in well organised code.  When you come back to your code, it will be easier to know how it works. Using a coding convention helps get the balance of white space right. This helps with code readibility.
  • Initialise variables, consider the boundary cases of data types chosen.  Avoid mutable variables and consider the use of null values carefully.  A method on call on null will result in a runtime exception.  Consider array indices and use flexible data structures if required.
  • Write modular code that can be tested in isolation so that the source of errors can be identified more easily.  This is improved with abstraction and encapsulation which together enables unit testing.
  • Error avoidance by checking preconditions/requirements are met on the client side, only execute the code if the precondition is true,  isResourceAvailable(), if returns true then proceed.  But this isn’t always reliable, the client can’t always be trusted.

Exceptions – defensive programming

Defensive programming assumes that the client can’t be trusted and checks argument values.  Exceptions, used correctly help software be robust and recover from errors at run-time.  Run-time errors are often related to the environment or resources, URL changed, network down, file deleted, printer switched off… These cannot be anticipated, but can be recovered from by handling the exceptions appropriately.  It should not fail silently, outputting  a message or code may not be seen or understood.  Thowing an exception means that classes can be more cohesive and leave other classes to fix problems that are in their domain, e.g. leave I/O to classes concerned with I/O.  Otherwise you bind functionality to technology, which may soon be out of date.

An Exception is a software signal which can be thrown or raised and also caught and handled, either in the same method or higher up.  There are loosely three categories, Checked Exceptions (have to be caught or specify method may throw them, Unchecked Exceptions (often a bug in the program, not obliged to catch them) and Runtime Exceptions.  Checked in this context means that the compiler checks to see that you catch or throw these exceptions, e.g. FileNotFoundException.



Kippo on Kali on Raspberry Pi

Kali for Raspberry Pi

Installing Kippo

Note, Kippo requires an older version of python twister.  It fails to start if twister 16.x is installed.  All worked fine with version 14.02.

sudo apt-get install python-mysqldb mysql-server apache2
# download kippo via github 
git clone

Create a database called kippo, and database user called kippo.  Then add some the necessary database tables.

mysql -u root -p  // enter root user password
create kippo;
GRANT ALL ON kippo.* TO 'kippo'@'localhost' IDENTIFIED BY '**';
mysql -u kippo -p // enter database user password
use kippo;
source mysql.sql;
show tables;

Copy the sample kippo.cfg file and uncomment and edit the lines relating to [database_mysql].  Then create a user to start kippo, and grant the user access to the folder.

# edit kippo.cfg as follows
host = localhost
database = kippo
username = kippo
password = kippo_db_user_passwd

# kippo cannot be run as root user
useradd -d /home/kippo -s /bin/bash -m kippo -g sudo
# grant kippo user privileges
chown -R kippo /usr/local/src/kippo/

Owing to some issues, install python twisted manually.

sudo apt-get install python-dev python-pip
cd /tmp
cd twisted-twisted-14.0.2/
./ install
su kippo cd /usr/local/kippo/
# to stop kippo, try to restart it, then kill the pid

The root password for the kippo honeypot is 123456, change this by editing the /kippo /data/userdb.txt file and restarting kippo.

Kippo defaults to listen for SSH connections on port 2222.  Using iptables, direct all connections to port 22 to kippo which is listening on port 2222, then direct set the default port for SSH to a different port.

apt-get install iptables
iptables -t nat -A PREROUTING -p tcp --dport 22 -j REDIRECT --to-port 2222
iptables-save > /etc/iptables.rules
# set the default port for ssh to 65534
sed -i 's:Port 22:Port 65534:g' /etc/ssh/sshd_config
# restart ssh
/etc/init.d/ssh restart


While I’ve followed the steps to install KippoGraph, it is currently throwing an exception originating in rb.php:761, so for now I’m just looking at the logs via script files.

Useful Links