Theories in JUnit

I’ve recently become aware of two new features that were introduced in JUnit 4.4 called Theories and Assumptions. I have yet to see either concept used in my day-to-day life as a programmer, so I’ll shed some light on them in an attempt to raise awareness of their existence. Keep in mind that Theories are still very much an experimental part of JUnit, even in JUnit 4.12.

What are Theories?

A Theory is similar to a regular JUnit Test, but with one important difference: the Theory method can accept parameters.

Below is a brief code example that I will explain line by line to get us started. It asserts that the standard Java method Math.abs(int) (JavaDoc) returns the correct value both when passed a positive value and a negative value.

package com.thomaskasene.theory;

import org.junit.experimental.theories.DataPoint;
import org.junit.experimental.theories.Theories;
import org.junit.experimental.theories.Theory;
import org.junit.runner.RunWith;

import static org.hamcrest.CoreMatchers.is;
import static org.junit.Assert.assertThat;

@RunWith(Theories.class)
public class MathAbsoluteTest {

    @DataPoint
    public static final int NEGATIVE_VALUE = -2;

    @DataPoint
    public static final int POSITIVE_VALUE = 2;

    @Theory
    public void returnsAbsoluteValue(int value) {
        int result = Math.abs(value);
        assertThat(result, is(2));
        System.out.printf("The absolute value of %d is %d%n", value, result);
    }
}

I mentioned in the introduction that Theories are still an experimental feature, and the list of imports in the code above shows this. All the relevant classes – DataPoint, Theories and Theory – are inside the package org.junit.experimental.theories.

The next point of interest is the @RunWith annotation. It’s provided with a special test runner class called Theories, which is required for the Theory to be run.

Next up are the data points. I’ll talk more about these a bit later, but for now, consider them the test data that is being passed to the Theory method.

Finally, we have the Theory itself. Theories are annotated with @Theory instead of @Test, and they can define parameters. Apart from that, they look very much the same as standard JUnit Tests. The interesting thing about Theories is that they will run once for each valid combination of data points. In this case, it means that our Theory will run twice: one time where value is -2 and one time where it’s 2.

The last statement simply prints some information to help us see what’s going on when we run the Theory. I will continue to do this in the other code examples in this post, even though it’s generally considered a bad practice and should not be checked into your source control management system.

Valid Combinations of Data Points

When run, the code above will print the following to the output stream:

The absolute value of -2 is 2
The absolute value of 2 is 2

As the output clearly shows, the method was run twice, one time for each valid combination of data points. As this particular Theory only accepts a single parameter, the combinations aren’t many. To illustrate a more complex case, I’ve written up another example which tests the functionality of String.format(String, Object...) (JavaDoc).

package com.thomaskasene.theory;

import org.junit.experimental.theories.DataPoint;
import org.junit.experimental.theories.Theories;
import org.junit.experimental.theories.Theory;
import org.junit.runner.RunWith;

import static org.hamcrest.CoreMatchers.equalTo;
import static org.hamcrest.CoreMatchers.is;
import static org.junit.Assert.assertThat;

@RunWith(Theories.class)
public class StringFormatTest {

    @DataPoint
    public static final String NAME_PETER = "Peter";

    @DataPoint
    public static final String NAME_BRUCE = "Bruce";

    @DataPoint
    public static final int AGE_18 = 18;

    @DataPoint
    public static final int AGE_25 = 25;

    @Theory
    public void returnsFormattedString(String name, int age) {
        String expected = name + " is " + age + " years old";
        String result = String.format("%s is %d years old", name, age);
        assertThat(result, is(equalTo(expected)));
        System.out.println(result);
    }
}

As you can see from the output below, the Theory has been run once for every valid combination of data points. As far as I’m aware there are no guarantees as to the order in which these data points are passed to the Theory, so you should never write your Theories in a way that expects a certain order.

Bruce is 18 years old
Bruce is 25 years old
Peter is 18 years old
Peter is 25 years old
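
To make "every valid combination" concrete, here is a minimal plain-Java sketch of what the runner appears to do for this Theory: pair every String data point with every int data point. The names CombinationSketch and combine are made up for this illustration and are not part of JUnit, and the ordering is simply that of the nested loops, not a promise about JUnit's behavior.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model, not JUnit code: enumerates the name/age pairs
// that a two-parameter Theory would be called with.
final class CombinationSketch {

    static List<String> combine(String[] names, int[] ages) {
        List<String> results = new ArrayList<>();
        for (String name : names) {
            for (int age : ages) {
                // Same formatting call as the Theory under discussion.
                results.add(String.format("%s is %d years old", name, age));
            }
        }
        return results;
    }

    public static void main(String[] args) {
        String[] names = {"Peter", "Bruce"};
        int[] ages = {18, 25};
        // 2 names x 2 ages = 4 invocations
        combine(names, ages).forEach(System.out::println);
    }
}
```

Adding a third name would raise the count to six invocations, which is why the number of runs grows quickly as data points are added.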

Difference Between @DataPoint and @DataPoints

There are two annotations that can be used to provide a Theory with data points: @DataPoint and @DataPoints. So far in this post I’ve only used the former in my code samples. @DataPoints is very similar, but it assumes that the datatype it annotates is an array or an iterable type like List. It allows us to list multiple data points in a single declaration.

    @DataPoint
    public static final int NEGATIVE_VALUE = -2;

    @DataPoint
    public static final int POSITIVE_VALUE = 2;

    @Theory
    public void returnsAbsoluteValue(int value) {
        int result = Math.abs(value);
        assertThat(result, is(2));
    }

    @DataPoints
    public static final int[] VALUES = {-2, 2};

    @Theory
    public void returnsAbsoluteValue(int value) {
        int result = Math.abs(value);
        assertThat(result, is(2));
    }

The two code snippets above are equivalent; we don’t need to make any changes to the Theory to make it work. We can use the same approach to simplify the StringFormatTest class as well:

    @DataPoints
    public static final String[] NAMES = {"Peter", "Bruce"};

    @DataPoints
    public static final int[] AGES = {18, 25};

    @Theory
    public void returnsFormattedString(String name, int age) {
        String expected = name + " is " + age + " years old";
        String result = String.format("%s is %d years old", name, age);
        assertThat(result, is(equalTo(expected)));
    }

We have to pay close attention to which annotation we’re using when we introduce arrays and iterable types. As the example below shows, we might accidentally register an array or iterable type as a @DataPoint when we actually meant @DataPoints. In this case the Theory fails at runtime because it expects an int, but no int data points are found (the only data point is of type int[]).

    @DataPoint
    public static final int[] VALUES = {-2, 2};

    @Theory
    public void returnsAbsoluteValue(int value) {
        int result = Math.abs(value);
        assertThat(result, is(2));
    }

Narrowing Down Data Points

As mentioned before, the default behavior of the Theories runner is to test a Theory using all valid combinations of data points. This can sometimes be more than we want, and a typical example is when we have multiple Theories that each test different methods that accept the same types of parameters. There are at least two ways of doing this cleanly, and I’ll provide examples for both of them in the subsections below.

Using @FromDataPoints

@FromDataPoints is an annotation that can be applied to the parameters of a Theory to tell it which data points it should care about. When we define a data point, we can give it a name by setting the annotation’s value. Then, we can annotate a Theory parameter with @FromDataPoints, and pass the name to its value as well. If a Theory can’t find the data points it’s been set to use, it will throw an exception during runtime. If it can find them, only they will be used for that particular parameter.

package com.thomaskasene.theory;

import org.junit.experimental.theories.DataPoints;
import org.junit.experimental.theories.FromDataPoints;
import org.junit.experimental.theories.Theories;
import org.junit.experimental.theories.Theory;
import org.junit.runner.RunWith;

import static org.hamcrest.CoreMatchers.is;
import static org.junit.Assert.assertThat;

@RunWith(Theories.class)
public class StringSplitAndCharAtTest {

    @DataPoints("TwoWords")
    public static final String[] TWO_WORDS = {"Hello World", "Good morning", "What's up"};

    @DataPoints("StartingWithT")
    public static final String[] STARTING_WITH_T = {"Thor", "Tests", "This"};

    @Theory
    public void returnsTwoItemsWhenPassedTwoWords(@FromDataPoints("TwoWords") String value) {
        int result = value.split(" ").length;
        assertThat(result, is(2));
        System.out.printf("The string \"%s\" is split into %d parts%n", value, result);
    }

    @Theory
    public void returnsTWhenValueStartsWithT(@FromDataPoints("StartingWithT") String value) {
        char result = value.charAt(0);
        assertThat(result, is('T'));
        System.out.printf("The string \"%s\" begins with the letter '%c'%n", value, result);
    }
}

In the code above we’re doing some basic testing of the functionality of String.split(String) (JavaDoc) and String.charAt(int) (JavaDoc).

In the first Theory we check that when a string with two space-separated words is split on space, the resulting array contains two elements. Notice how the data point name, TwoWords, is passed both to @DataPoints and to @FromDataPoints. Because of this, the values from STARTING_WITH_T will never be passed as data points to this Theory.

The second Theory confirms that when a string begins with a ‘T’, String.charAt(int) returns ‘T’ when looking up the first character. Again, notice how the data point names are matched so that they don’t leak into the wrong Theory. They would make poor test data for the first Theory, which assumes that the data points contain two words instead of one.

Running the code above would yield the following output:

The string "Thor" begins with the letter 'T'
The string "Tests" begins with the letter 'T'
The string "This" begins with the letter 'T'
The string "Hello World" is split into 2 parts
The string "Good morning" is split into 2 parts
The string "What's up" is split into 2 parts
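
The effect of naming data points can be modeled without the runner. The sketch below is plain Java, not JUnit: a map stands in for the named @DataPoints fields, and the hypothetical fromDataPoints method stands in for the lookup that @FromDataPoints performs. It also shows why the groups must not be mixed, since the "StartingWithT" values would not pass the two-word check.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model, not JUnit code: data points grouped by name.
final class NamedDataPointsSketch {

    static final Map<String, String[]> GROUPS = new LinkedHashMap<>();
    static {
        GROUPS.put("TwoWords", new String[] {"Hello World", "Good morning", "What's up"});
        GROUPS.put("StartingWithT", new String[] {"Thor", "Tests", "This"});
    }

    // Returns the named group, or fails loudly if the name is unknown.
    static String[] fromDataPoints(String name) {
        String[] values = GROUPS.get(name);
        if (values == null) {
            throw new IllegalArgumentException("No data points named " + name);
        }
        return values;
    }

    // True when every value splits into exactly two space-separated parts.
    static boolean allSplitIntoTwo(String[] values) {
        for (String value : values) {
            if (value.split(" ").length != 2) {
                return false;
            }
        }
        return true;
    }

    // True when every value begins with an upper-case 'T'.
    static boolean allStartWithT(String[] values) {
        for (String value : values) {
            if (value.charAt(0) != 'T') {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(allSplitIntoTwo(fromDataPoints("TwoWords")));
        System.out.println(allStartWithT(fromDataPoints("StartingWithT")));
        // Feeding a group to the wrong check is what the names prevent.
        System.out.println(allSplitIntoTwo(fromDataPoints("StartingWithT")));
    }
}
```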

Using Assumptions

By adding Assumptions to a Theory we can tell the test runner that it should be ignored if the Assumptions are not met. This could be used to restrict which data points a Theory is run with, but it could also be used to check something completely different, like which operating system we’re using, or whether a certain environment variable exists.

We add Assumptions by inserting calls to the static methods of the JUnit Assume class, such as Assume.assumeThat(T, Matcher<T>) (JavaDoc). Once such a statement is reached, the Assumption is evaluated, and if false, the rest of the Theory is skipped for that particular combination of data points. This is different from the JUnit Assert methods, which fail a Theory rather than marking it as ignored.

You should also note a major difference from using the @FromDataPoints annotation; any Assumption violations won’t actually stop the Theory from being run, they will just cause the Theory to complete prematurely where the violation occurred. In other words, the Theory will run, even if it doesn’t get very far before being interrupted by an Assumption violation. Because of this it’s generally a good idea to put Assumptions at the very beginning of a Theory to avoid a lot of unnecessary setup overhead.

package com.thomaskasene.theory;

import org.junit.experimental.theories.DataPoints;
import org.junit.experimental.theories.Theories;
import org.junit.experimental.theories.Theory;
import org.junit.runner.RunWith;

import static org.hamcrest.CoreMatchers.is;
import static org.junit.Assert.assertThat;
import static org.junit.Assume.assumeThat;

@RunWith(Theories.class)
public class MathMaximumTest {

    @DataPoints
    public static final int[] VALUES = {-10, -5, 0, 5, 10};

    @Theory
    public void returnsFirstValueWhenItIsGreatest(int firstValue, int secondValue) {
        assumeThat(firstValue > secondValue, is(true));
        int result = Math.max(firstValue, secondValue);
        assertThat(result, is(firstValue));
        System.out.printf("%s: The greatest of the two values %d and %d is %d%n",
                "First value Theory", firstValue, secondValue, result);
    }

    @Theory
    public void returnsSecondValueWhenItIsGreatest(int firstValue, int secondValue) {
        assumeThat(firstValue < secondValue, is(true));
        int result = Math.max(firstValue, secondValue);
        assertThat(result, is(secondValue));
        System.out.printf("%s: The greatest of the two values %d and %d is %d%n",
                "Second value Theory", firstValue, secondValue, result);
    }
}

Without the assumeThat calls at the top of each Theory in the code above, we would encounter an assertion error because not all valid combinations of data points would make sense to these Theories. For example, the first Theory would fail if firstValue is -5 and secondValue is 5.

When the class above is run in its entirety, the output would be something like this:

First value Theory: The greatest of the two values -5 and -10 is -5
First value Theory: The greatest of the two values 0 and -10 is 0
First value Theory: The greatest of the two values 0 and -5 is 0
First value Theory: The greatest of the two values 5 and -10 is 5
First value Theory: The greatest of the two values 5 and -5 is 5
First value Theory: The greatest of the two values 5 and 0 is 5
First value Theory: The greatest of the two values 10 and -10 is 10
First value Theory: The greatest of the two values 10 and -5 is 10
First value Theory: The greatest of the two values 10 and 0 is 10
First value Theory: The greatest of the two values 10 and 5 is 10
Second value Theory: The greatest of the two values -10 and -5 is -5
Second value Theory: The greatest of the two values -10 and 0 is 0
Second value Theory: The greatest of the two values -10 and 5 is 5
Second value Theory: The greatest of the two values -10 and 10 is 10
Second value Theory: The greatest of the two values -5 and 0 is 0
Second value Theory: The greatest of the two values -5 and 5 is 5
Second value Theory: The greatest of the two values -5 and 10 is 10
Second value Theory: The greatest of the two values 0 and 5 is 5
Second value Theory: The greatest of the two values 0 and 10 is 10
Second value Theory: The greatest of the two values 5 and 10 is 10

Notice how the cases where firstValue and secondValue are equal have been skipped. This is because of the Assumptions we added to our Theories. If we wanted to include those cases as well, one solution would be to add another Theory and make it assume that firstValue and secondValue are identical before carrying on.
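
The output above can be cross-checked by counting. Five data points give 25 pairs: 10 satisfy the first Theory's assumption, 10 satisfy the second, and the 5 pairs with equal values are skipped by both. The plain-Java sketch below reproduces that count; AssumptionSketch and countPairs are hypothetical names for this illustration, and the if/else branches only model what the assumeThat calls filter on.

```java
// Hypothetical model, not JUnit code: partitions all (first, second)
// pairs by the assumption they would satisfy.
final class AssumptionSketch {

    static final int[] VALUES = {-10, -5, 0, 5, 10};

    // counts[0]: first > second, counts[1]: first < second, counts[2]: equal
    static int[] countPairs(int[] values) {
        int[] counts = new int[3];
        for (int first : values) {
            for (int second : values) {
                int max = Math.max(first, second);
                if (first > second) {
                    // The first Theory's assertion: max is the first value.
                    if (max != first) throw new IllegalStateException("unexpected max");
                    counts[0]++;
                } else if (first < second) {
                    // The second Theory's assertion: max is the second value.
                    if (max != second) throw new IllegalStateException("unexpected max");
                    counts[1]++;
                } else {
                    // Neither assumption holds, so neither Theory gets past it.
                    counts[2]++;
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] counts = countPairs(VALUES);
        System.out.printf("first > second: %d, first < second: %d, equal: %d%n",
                counts[0], counts[1], counts[2]);
    }
}
```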

Summary

Theories are parameterized unit tests that can accept any number of parameters, and the Theories runner makes sure each Theory is run once for each combination of arguments. These arguments, or data points, can be declared with the @DataPoint and @DataPoints annotations, and each Theory can filter out data points that are irrelevant through the use of an annotation called @FromDataPoints, or by making one or more Assumptions. Theories and Tests are marked as ignored when they make an Assumption that is incorrect.

While still an experimental feature of JUnit, Theories seem to be working quite well, and can sometimes greatly reduce the size and complexity of large unit test classes if used correctly. I find that the most difficult part of using Theories is deciding how to organize the data points when there are a lot of them.

Links and Resources

If you want to get the code examples above working you’ll need JUnit 4.12 or later, as that’s when @FromDataPoints was introduced. If you want something functional to play around with, I’ve set up a GitHub repository that contains most of the code from this post. You can find it here.

Thomas Kåsene

Thomas is a certified Java professional and focuses on writing clean code that is easy to read, unit test and maintain. In addition to having worked on large enterprise applications, he's also developed several Android apps and websites through the years. In 2011, Thomas graduated with a Bachelor of Science in IT from the then Norwegian School of Information Technology.

Unit Tests – A Crash Course

Unit testing is about writing a separate piece of code in addition to your production code. The purpose of this non-production code (or unit tests) is to indicate whether or not the production code works as intended.

The phrase “unit testing” means that we identify a coherent unit of code and test that in isolation. A “unit” is often a single class, but not necessarily; it can be a group of classes working together, or as little as a single method.

A Quick Example

As an example, suppose you’re writing a mathematics library and you want to make sure that the division function actually takes one number and divides it by another. You might implement this function as follows:

public class Mathematics {

	public static float divide(float a, float b) {
		return a / b;
	}
}

Assuming your project has all the required dependencies, the unit test you write might look something like this:

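import org.junit.Test;

import static org.hamcrest.CoreMatchers.is;
import static org.junit.Assert.assertThat;

public class MathematicsTest {

	@Test
	public void dividesAByB() {
		float result = Mathematics.divide(9, 3);
		// Compare against a float literal: is(3) would build an Integer
		// matcher, which does not match the float result.
		assertThat(result, is(3f));
	}
}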

Whenever MathematicsTest.dividesAByB() is run as a unit test, it will try to divide 9 by 3 using the library function you wrote. If the function works, the test will pass. If somebody (including yourself) later changes the divide function in a way that breaks this behavior, the unit test will fail the next time it’s run.
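
Stripped of the framework, a unit test is just code that calls the production code and compares the result with an expectation. The sketch below shows that idea in plain Java. DivideCheckSketch and its dividesAByB method are hypothetical names for this illustration, and a real project would normally leave the running and reporting to a framework such as JUnit.

```java
// Hypothetical illustration: a hand-rolled check without any test framework.
final class DivideCheckSketch {

    // Copy of the production function from the example above.
    static float divide(float a, float b) {
        return a / b;
    }

    // Returns true when divide(9, 3) yields 3.
    // Exact comparison is safe here because 9 / 3 is exactly representable as a float.
    static boolean dividesAByB() {
        float result = divide(9, 3);
        return result == 3f;
    }

    public static void main(String[] args) {
        System.out.println(dividesAByB() ? "PASS" : "FAIL");
    }
}
```

If divide were changed to multiply by mistake, dividesAByB() would return false, which is the signal a failing unit test gives.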

Writing Unit Tests

There are different approaches to writing unit tests. Some programmers like to write the unit tests after the target functionality has been implemented, while others prefer to write the unit tests first and then let those tests drive the design of the production code. The latter approach is commonly referred to as test-driven development, or TDD, and many regard it as the best way to achieve and maintain clean code in their projects.

Summary

I’ve just given you a very small glimpse of what unit tests are and why we use them. There is much more to say on the subject than what I’ve mentioned here, and I’ll probably cover some of it in future blog posts. Until then, you now have enough to go on if you want to study it on your own in more detail.
