Home | Read Online | Amazon | GoodReads | Google Books | PDF (code) | GitHub
This section provides an introduction to software testing and the testing of Artificial Intelligence algorithms. introduces software testing and focuses on a type of testing relevant to algorithms called unit testing. provides a specific example of an algorithm and a prepared suite of unit tests, and provides some rules-of-thumb for testing algorithms in general.
Software testing in the field of Software Engineering is a process in the life-cycle of a software project that verifies that the product or service meets quality expectations and validates that software meets the requirements specification. Software testing is intended to locate defects in a program, although a given testing method cannot guarantee to locate all defects. As such, it is common for an application to be subjected to a range of testing methodologies throughout the software life-cycle, such as unit testing during development, integration testing once modules and systems are completed, and user acceptance testing to allow the stakeholders to determine if their needs have been met.
Unit testing is a type of software testing that involves the preparation of well-defined procedural tests of discrete functionality of a program that provide confidence that a module or function behaves as intended. Unit tests are referred to as 'white-box' tests (contrasted to 'black-box' tests) because they are written with full knowledge of the internal structure of the functions and modules under tests. Unit tests are typically prepared by the developer that wrote the code under test and are commonly automated, themselves written as small programmers that are executed by a unit testing framework (such as JUnit for Java or the Test framework in Ruby). The objective is not to test each path of execution within a unit (called complete-test or complete-code coverage), but instead to focus tests on areas of risk, uncertainty, or criticality. Each test focuses on one aspect of the code (test one thing) and are commonly organized into test suites of commonality.
Some of the benefits of unit testing include:
Unit tests were traditionally written after the program was completed. A popular alternative is to prepare the tests before the functionality of the application is prepared, called Test-First or Test-Driven Development (TDD). In this method, the tests are written and executed, failing until the application functionality is written to make the test pass. The early preparation of tests allow the programmer to consider the behavior required from the program and the interfaces and functions the program needs to expose before they are written.
The concerns of software testing are very relevant to the development, investigation, and application of Metaheuristic and Computational Intelligence algorithms. In particular, the strong culture of empirical investigation and prototype-based development demands a baseline level of trust in the systems that are presented in articles and papers. Trust can be instilled in an algorithm by assessing the quality of the algorithm implementation itself. Unit testing is lightweight (requiring only the writing of automated test code) and meets the needs of promoting quality and trust in the code while prototyping and developing algorithms. It is strongly suggested as a step in the process of empirical algorithm research in the fields of Metaheuristics, Computational Intelligence, and Biologically Inspired Computation.
This section provides an example of an algorithm and its associated unit tests as an illustration of the presented concepts. The implementation of the Genetic Algorithm is discussed from the perspective of algorithm testing and an example set of unit tests for the Genetic Algorithm implementation are presented as a case study.
Listing (below) in provides the source code for the Genetic Algorithm in the Ruby Programming Language. Important considerations when in using the Ruby test framework, is ensuring that the functions of the algorithm are exposed for testing and that the algorithm demonstration itself does not execute. This is achieved through the use of the (if __FILE__ == $0
) condition, which ensures the example only executes when the file is called directly, allowing the functions to be imported and executed independently by a unit test script.
The algorithm is very modular with its behavior partitioned into small functions, most of which are independently testable.
The reproduce
function has some dependencies although its orchestration of sub-functions is still testable. The search
function is the only monolithic function, which both depends on all other functions in the implementation (directly or indirectly) and hence is difficult to unit test. At best, the search
function may be a case for system testing addressing functional requirements, such as "does the algorithm deliver optimized solutions".
Listing (below) provides the TC_GeneticAlgorithm
class that makes use of the built-in Ruby unit testing framework by extending the TestCase
class.
The listing provides an example of ten unit tests for six of the functions in the Genetic Algorithm implementation. Two types of unit tests are provided:
onemax
add correctly? and does point_mutation
behave correctly?random_bitstring
provide an expected 50/50 mixture of 1s and 0s over a large number of cases? and does point_mutation
make an expected number of changes over a large number of cases?The tests for probabilistic expectations is a weaker form of unit testing that can be used to either provide additional confidence to deterministically tested functions, or to be used as a last resort when direct methods cannot be used.
Given that a unit test should 'test one thing' it is common for a given function to have more than one unit tests. The reproduce
function is a good example of this with three tests in the suite. This is because it is a larger function with behavior called in dependent functions which is varied based on parameters.
require "test/unit" require File.expand_path(File.dirname(__FILE__)) + "/../genetic_algorithm" class TC_GeneticAlgorithm < Test::Unit::TestCase # test that the objective function behaves as expected def test_onemax assert_equal(0, onemax("0000")) assert_equal(4, onemax("1111")) assert_equal(2, onemax("1010")) end # test the creation of random strings def test_random_bitstring assert_equal(10, random_bitstring(10).size) assert_equal(0, random_bitstring(10).delete('0').delete('1').size) end # test the approximate proportion of 1's and 0's def test_random_bitstring_ratio s = random_bitstring(1000) assert_in_delta(0.5, (s.delete('1').size/1000.0), 0.05) assert_in_delta(0.5, (s.delete('0').size/1000.0), 0.05) end # test that members of the population are selected def test_binary_tournament pop = Array.new(10) {|i| {:fitness=>i} } 10.times {assert(pop.include?(binary_tournament(pop)))} end # test point mutations at the limits def test_point_mutation assert_equal("0000000000", point_mutation("0000000000", 0)) assert_equal("1111111111", point_mutation("1111111111", 0)) assert_equal("1111111111", point_mutation("0000000000", 1)) assert_equal("0000000000", point_mutation("1111111111", 1)) end # test that the observed changes approximate the intended probability def test_point_mutation_ratio changes = 0 100.times do s = point_mutation("0000000000", 0.5) changes += (10 - s.delete('1').size) end assert_in_delta(0.5, changes.to_f/(100*10), 0.05) end # test cloning with crossover def test_crossover_clone p1, p2 = "0000000000", "1111111111" 100.times do s = crossover(p1, p2, 0) assert_equal(p1, s) assert_not_same(p1, s) end end # test recombination with crossover def test_crossover_recombine p1, p2 = "0000000000", "1111111111" 100.times do s = crossover(p1, p2, 1) assert_equal(p1.size, s.size) assert_not_equal(p1, s) assert_not_equal(p2, s) s.size.times {|i| assert( (p1[i]==s[i]) || (p2[i]==s[i]) ) } end end # test odd sized population def test_reproduce_odd pop = Array.new(9) {|i| {:fitness=>i,:bitstring=>"0000000000"} } children = reproduce(pop, pop.size, 0, 1) assert_equal(9, children.size) end # test reproduce size mismatch def test_reproduce_mismatch pop = Array.new(10) {|i| {:fitness=>i,:bitstring=>"0000000000"} } children = reproduce(pop, 9, 0, 0) assert_equal(9, children.size) end end
Unit testing is easy, although writing good unit tests is difficult given the complex relationship the tests have with the code under test. Testing Metaheuristics and Computational Intelligence algorithms is harder again given their probabilistic nature and their ability to 'work in spite of you', that is, provide some kind of result even when implemented with defects.
The following guidelines may help when unit testing an algorithm:
For more information on software testing, consult a good book on software engineering. Two good books dedicated to testing are "Beautiful Testing: Leading Professionals Reveal How They Improve Software" that provides a compendium of best practices from professional programers and testers [Goucher2009], and "Software testing" by Patton that provides a more traditional treatment [Patton2005].
Unit testing is covered in good books on software engineering or software testing. Two good books that focus on unit testing include "Test Driven Development: By Example" on the TDD methodology by Beck, a pioneer of Extreme Programming and Test Drive Development [Beck2002] and "Pragmatic unit testing in Java with JUnit" by Hunt and Thomas [Hunt2003].
[Beck2002] | K. Beck, "Test Driven Development: By Example", Addison-Wesley Professional, 2002. |
[Goucher2009] | A. Goucher and T. Riley (editors), "Beautiful Testing: Leading Professionals Reveal How They Improve\n\tSoftware", O'Reilly Media, 2009. |
[Hunt2003] | A. Hunt and D. Thomas, "Pragmatic unit testing in Java with JUnit", Pragmatic Bookshelf, 2003. |
[Patton2005] | R. Patton, "Software testing", Sams, 2005. |
Please Note: This content was automatically generated from the book content and may contain minor differences.