Clever Algorithms: Nature-Inspired Programming Recipes

A book by Jason Brownlee

Testing Algorithms

This section provides an introduction to software testing and the testing of Artificial Intelligence algorithms. introduces software testing and focuses on a type of testing relevant to algorithms called unit testing. provides a specific example of an algorithm and a prepared suite of unit tests, and provides some rules-of-thumb for testing algorithms in general.

Software Testing

Software testing in the field of Software Engineering is a process in the life-cycle of a software project that verifies that the product or service meets quality expectations and validates that software meets the requirements specification. Software testing is intended to locate defects in a program, although a given testing method cannot guarantee to locate all defects. As such, it is common for an application to be subjected to a range of testing methodologies throughout the software life-cycle, such as unit testing during development, integration testing once modules and systems are completed, and user acceptance testing to allow the stakeholders to determine if their needs have been met.

Unit testing is a type of software testing that involves the preparation of well-defined procedural tests of discrete functionality of a program that provide confidence that a module or function behaves as intended. Unit tests are referred to as 'white-box' tests (contrasted to 'black-box' tests) because they are written with full knowledge of the internal structure of the functions and modules under tests. Unit tests are typically prepared by the developer that wrote the code under test and are commonly automated, themselves written as small programmers that are executed by a unit testing framework (such as JUnit for Java or the Test framework in Ruby). The objective is not to test each path of execution within a unit (called complete-test or complete-code coverage), but instead to focus tests on areas of risk, uncertainty, or criticality. Each test focuses on one aspect of the code (test one thing) and are commonly organized into test suites of commonality.

Some of the benefits of unit testing include:

Documentation: The preparation of a suite of tests for a given system provide a type of programming documentation highlighting the expected behavior of functions and modules and providing examples of how to interact with key components.
Readability: Unit testing encourages a programming style of small modules, clear input and output and fewer inter-component dependencies. Code written for easy of testing (testability) may be easier to read and follow.
Regression: Together, the suite of tests can be executed as a regression-test of the system. The automation of the tests means that any defects caused by changes to the code can easily be identified. When a defect is found that slipped through, a new test can be written to ensure it will be identified in the future.

Unit tests were traditionally written after the program was completed. A popular alternative is to prepare the tests before the functionality of the application is prepared, called Test-First or Test-Driven Development (TDD). In this method, the tests are written and executed, failing until the application functionality is written to make the test pass. The early preparation of tests allow the programmer to consider the behavior required from the program and the interfaces and functions the program needs to expose before they are written.

The concerns of software testing are very relevant to the development, investigation, and application of Metaheuristic and Computational Intelligence algorithms. In particular, the strong culture of empirical investigation and prototype-based development demands a baseline level of trust in the systems that are presented in articles and papers. Trust can be instilled in an algorithm by assessing the quality of the algorithm implementation itself. Unit testing is lightweight (requiring only the writing of automated test code) and meets the needs of promoting quality and trust in the code while prototyping and developing algorithms. It is strongly suggested as a step in the process of empirical algorithm research in the fields of Metaheuristics, Computational Intelligence, and Biologically Inspired Computation.

Unit Testing Example

This section provides an example of an algorithm and its associated unit tests as an illustration of the presented concepts. The implementation of the Genetic Algorithm is discussed from the perspective of algorithm testing and an example set of unit tests for the Genetic Algorithm implementation are presented as a case study.

Algorithm

Listing (below) in provides the source code for the Genetic Algorithm in the Ruby Programming Language. Important considerations when in using the Ruby test framework, is ensuring that the functions of the algorithm are exposed for testing and that the algorithm demonstration itself does not execute. This is achieved through the use of the (if __FILE__ == $0) condition, which ensures the example only executes when the file is called directly, allowing the functions to be imported and executed independently by a unit test script. The algorithm is very modular with its behavior partitioned into small functions, most of which are independently testable.

The reproduce function has some dependencies although its orchestration of sub-functions is still testable. The search function is the only monolithic function, which both depends on all other functions in the implementation (directly or indirectly) and hence is difficult to unit test. At best, the search function may be a case for system testing addressing functional requirements, such as "does the algorithm deliver optimized solutions".

Unit Tests

Listing (below) provides the TC_GeneticAlgorithm class that makes use of the built-in Ruby unit testing framework by extending the TestCase class. The listing provides an example of ten unit tests for six of the functions in the Genetic Algorithm implementation. Two types of unit tests are provided:

Deterministic: Directly test the function in question, addressing questions such as: does onemax add correctly? and does point_mutation behave correctly?
Probabilistic: Test the probabilistic properties of the function in question, addressing questions such as: does random_bitstring provide an expected 50/50 mixture of 1s and 0s over a large number of cases? and does point_mutation make an expected number of changes over a large number of cases?

The tests for probabilistic expectations is a weaker form of unit testing that can be used to either provide additional confidence to deterministically tested functions, or to be used as a last resort when direct methods cannot be used.

Given that a unit test should 'test one thing' it is common for a given function to have more than one unit tests. The reproduce function is a good example of this with three tests in the suite. This is because it is a larger function with behavior called in dependent functions which is varied based on parameters.

require "test/unit"
require File.expand_path(File.dirname(__FILE__)) + "/../genetic_algorithm"

class TC_GeneticAlgorithm < Test::Unit::TestCase

  # test that the objective function behaves as expected
  def test_onemax
    assert_equal(0, onemax("0000"))
    assert_equal(4, onemax("1111"))
    assert_equal(2, onemax("1010"))
  end

  # test the creation of random strings
  def test_random_bitstring
    assert_equal(10, random_bitstring(10).size)
    assert_equal(0, random_bitstring(10).delete('0').delete('1').size)
  end

  # test the approximate proportion of 1's and 0's
  def test_random_bitstring_ratio
    s = random_bitstring(1000)
    assert_in_delta(0.5, (s.delete('1').size/1000.0), 0.05)
    assert_in_delta(0.5, (s.delete('0').size/1000.0), 0.05)
  end

  # test that members of the population are selected
  def test_binary_tournament
    pop = Array.new(10) {|i| {:fitness=>i} }
    10.times {assert(pop.include?(binary_tournament(pop)))}
  end

  # test point mutations at the limits
  def test_point_mutation
    assert_equal("0000000000", point_mutation("0000000000", 0))
    assert_equal("1111111111", point_mutation("1111111111", 0))
    assert_equal("1111111111", point_mutation("0000000000", 1))
    assert_equal("0000000000", point_mutation("1111111111", 1))
  end

  # test that the observed changes approximate the intended probability
  def test_point_mutation_ratio
    changes = 0
    100.times do
      s = point_mutation("0000000000", 0.5)
      changes += (10 - s.delete('1').size)
    end
    assert_in_delta(0.5, changes.to_f/(100*10), 0.05)
  end

  # test cloning with crossover
  def test_crossover_clone
    p1, p2 = "0000000000", "1111111111"
    100.times do
      s = crossover(p1, p2, 0)
      assert_equal(p1, s)
      assert_not_same(p1, s)
    end
  end

  # test recombination with crossover
  def test_crossover_recombine
    p1, p2 = "0000000000", "1111111111"
    100.times do
      s = crossover(p1, p2, 1)
      assert_equal(p1.size, s.size)
      assert_not_equal(p1, s)
      assert_not_equal(p2, s)
      s.size.times {|i| assert( (p1[i]==s[i]) || (p2[i]==s[i]) ) }
    end
  end

  # test odd sized population
  def test_reproduce_odd
    pop = Array.new(9) {|i| {:fitness=>i,:bitstring=>"0000000000"} }
    children = reproduce(pop, pop.size, 0, 1)
    assert_equal(9, children.size)
  end

  # test reproduce size mismatch
  def test_reproduce_mismatch
    pop = Array.new(10) {|i| {:fitness=>i,:bitstring=>"0000000000"} }
    children = reproduce(pop, 9, 0, 0)
    assert_equal(9, children.size)
  end
end

Unit Tests for the Genetic Algorithm in Ruby

Download: tc_genetic_algorithm.rb.

Rules-of-Thumb

Unit testing is easy, although writing good unit tests is difficult given the complex relationship the tests have with the code under test. Testing Metaheuristics and Computational Intelligence algorithms is harder again given their probabilistic nature and their ability to 'work in spite of you', that is, provide some kind of result even when implemented with defects.

The following guidelines may help when unit testing an algorithm:

Start Small: Some unit tests are better than no unit test and each additional test can improve the trust and the quality of the code. For an existing algorithm implementation, start by writing a test for a small and simple behavior and slowly build up a test suite.
Test one thing: Each test should focus on verifying the behavior of one aspect of one unit of code. Writing concise and behavior-focused unit tests are the objective of the methodology.
Test once: A behavior or expectation only needs to be tested once, do not repeat a test each time a given unit is tested.
Don't forget the I/O: Remember to test the inputs and outputs of a unit of code, specifically the pre-conditions and post-conditions. It can be easy to focus on the decision points within a unit and forget its primary purpose.
Write code for testability: The tests should help to shape the code they test. Write small functions or modules, think about testing while writing code (or write tests first), and refactor code (update code after the fact) to make it easier to test.
Function independence: Attempt to limit the direct dependence between functions, modules, objects and other constructs. This is related to testability and writing small functions although suggests limits on how much interaction there is between units of code in the algorithm. Less dependence means less side-effects of a given unit of code and ultimately less complicated tests.
Test Independence: Test should be independent from each other. Frameworks provide hooks to set-up and tear-down state prior to the execution of each test, there should be no needed to have one test prepare data or state for other tests. Tests should be able to execute independently and in any order.
Test your own code: Avoid writing tests that verify the behavior of framework or library code, such as the randomness of a random number generator or whether a math or string function behaves as expected. Focus on writing test for the manipulation of data performed by the code you have written.
Probabilistic testing: Metaheuristics and Computational Intelligence algorithms generally make use of stochastic or probabilistic decisions. This means that some behaviors are not deterministic and are more difficult to test. As with the example, write probabilistic tests to verify that such processes behave as intended. Given that probabilistic tests are weaker than deterministic tests, consider writing deterministic tests first. A probabilistic behavior can be made deterministic by replacing the random number generator with a proxy that returns deterministic values, called a mock. This level of testing may require further impact to the original code to allow for dependent modules and objects to be mocked.
Consider test-first: Writing the tests first can help to crystallize expectations when implementing an algorithm from the literature, and help to solidify thoughts when developing or prototyping a new idea.

References

For more information on software testing, consult a good book on software engineering. Two good books dedicated to testing are "Beautiful Testing: Leading Professionals Reveal How They Improve Software" that provides a compendium of best practices from professional programers and testers [Goucher2009], and "Software testing" by Patton that provides a more traditional treatment [Patton2005].

Unit testing is covered in good books on software engineering or software testing. Two good books that focus on unit testing include "Test Driven Development: By Example" on the TDD methodology by Beck, a pioneer of Extreme Programming and Test Drive Development [Beck2002] and "Pragmatic unit testing in Java with JUnit" by Hunt and Thomas [Hunt2003].

Bibliography

[Beck2002]	K. Beck, "Test Driven Development: By Example", Addison-Wesley Professional, 2002.
[Goucher2009]	A. Goucher and T. Riley (editors), "Beautiful Testing: Leading Professionals Reveal How They Improve\n\tSoftware", O'Reilly Media, 2009.
[Hunt2003]	A. Hunt and D. Thomas, "Pragmatic unit testing in Java with JUnit", Pragmatic Bookshelf, 2003.
[Patton2005]	R. Patton, "Software testing", Sams, 2005.

Please Note: This content was automatically generated from the book content and may contain minor differences.