Home | Read Online | Amazon | GoodReads | Google Books | PDF (code) | GitHub
Genetic Programming, GP.
The Genetic Programming algorithm is an example of an Evolutionary Algorithm and belongs to the field of Evolutionary Computation and more broadly Computational Intelligence and Biologically Inspired Computation. The Genetic Programming algorithm is a sibling to other Evolutionary Algorithms such as the Genetic Algorithm, Evolution Strategies, Evolutionary Programming, and Learning Classifier Systems. Technically, the Genetic Programming algorithm is an extension of the Genetic Algorithm. The Genetic Algorithm is a parent to a host of variations and extensions.
The Genetic Programming algorithm is inspired by population genetics (including heredity and gene frequencies), and evolution at the population level, as well as the Mendelian understanding of the structure (such as chromosomes, genes, alleles) and mechanisms (such as recombination and mutation). This is the so-called new or modern synthesis of evolutionary biology.
Individuals of a population contribute their genetic material (called the genotype) proportional to their suitability of their expressed genome (called their phenotype) to their environment. The next generation is created through a process of mating that involves genetic operators such as recombination of two individuals genomes in the population and the introduction of random copying errors (called mutation). This iterative process may result in an improved adaptive-fit between the phenotypes of individuals in a population and the environment.
Programs may be evolved and used in a secondary adaptive process, where an assessment of candidates at the end of that secondary adaptive process is used for differential reproductive success in the first evolutionary process. This system may be understood as the inter-dependencies experienced in evolutionary development where evolution operates upon an embryo that in turn develops into an individual in an environment that eventually may reproduce.
The objective of the Genetic Programming algorithm is to use induction to devise a computer program. This is achieved by using evolutionary operators on candidate programs with a tree structure to improve the adaptive fit between the population of candidate programs and an objective function. An assessment of a candidate solution involves its execution.
Algorithm (below) provides a pseudocode listing of the Genetic Programming algorithm for minimizing a cost function, based on Koza and Poli's tutorial [Koza2005].
The Genetic Program uses LISP-like symbolic expressions called S-expressions that represent the graph of a program with function nodes and terminal nodes. While the algorithm is running, the programs are treated like data, and when they are evaluated they are executed. The traversal of a program graph is always depth first, and functions must always return a value.
Input
:
$Population_{size}$, $nodes_{func}$, $nodes_{term}$, $P_{crossover}$, $P_{mutation}$, $P_{reproduction}$, $P_{alteration}$
Output
:
$S_{best}$
Population
$\leftarrow$ InitializePopulation
{$Population_{size}$, $nodes_{func}$, $nodes_{term}$}EvaluatePopulation
{Population
}GetBestSolution
{Population
}While
($\neg$StopCondition
())Children
$\leftarrow \emptyset$While
(Size
{Children
} $<$ $Population_{size}$)Operator
$\leftarrow$ SelectGeneticOperator
{$P_{crossover}$, $P_{mutation}$, $P_{reproduction}$, $P_{alteration}$}If
(Operator
$\equiv$ CrossoverOperator
)SelectParents
{Population
, $Population_{size}$}Crossover
{$Parent_{1}$, $Parent_{2}$}Children
$\leftarrow$ $Child_{1}$Children
$\leftarrow$ $Child_{2}$ElseIf
(Operator
$\equiv$ MutationOperator
)SelectParents
{Population
, $Population_{size}$}Mutate
{$Parent_{1}$}Children
$\leftarrow$ $Child_{1}$ElseIf
(Operator
$\equiv$ ReproductionOperator
)SelectParents
{Population
, $Population_{size}$}Reproduce
{$Parent_{1}$}Children
$\leftarrow$ $Child_{1}$ElseIf
(Operator
$\equiv$ AlterationOperator
)SelectParents
{Population
, $Population_{size}$}AlterArchitecture
{$Parent_{1}$}Children
$\leftarrow$ $Child_{1}$End
End
EvaluatePopulation
{Children
}GetBestSolution
{Children
, $S_{best}$}Population
$\leftarrow$ Children
End
Return
($S_{best}$)Listing (below) provides an example of the Genetic Programming algorithm implemented in the Ruby Programming Language based on Koza and Poli's tutorial [Koza2005].
The demonstration problem is an instance of a symbolic regression, where a function must be devised to match a set of observations. In this case the target function is a quadratic polynomial $x^2+x+1$ where $x \in [-1,1]$. The observations are generated directly from the target function without noise for the purposes of this example. In practical problems, if one knew and had access to the target function then the genetic program would not be required.
The algorithm is configured to search for a program with the function set ${ +, -, \times, \div }$ and the terminal set ${ X, R }$, where $X$ is the input value, and $R$ is a static random variable generated for a program $X \in [-5,5]$. A division by zero returns a value of one. The fitness of a candidate solution is calculated by evaluating the program on range of random input values and calculating the Root Mean Squared Error (RMSE). The algorithm is configured with a 90% probability of crossover, 8% probability of reproduction (copying), and a 2% probability of mutation. For brevity, the algorithm does not implement the architecture altering genetic operation and does not bias crossover points towards functions over terminals.
def rand_in_bounds(min, max) return min + (max-min)*rand() end def print_program(node) return node if !node.kind_of?(Array) return "(#{node[0]} #{print_program(node[1])} #{print_program(node[2])})" end def eval_program(node, map) if !node.kind_of?(Array) return map[node].to_f if !map[node].nil? return node.to_f end arg1, arg2 = eval_program(node[1], map), eval_program(node[2], map) return 0 if node[0] === :/ and arg2 == 0.0 return arg1.__send__(node[0], arg2) end def generate_random_program(max, funcs, terms, depth=0) if depth==max-1 or (depth>1 and rand()<0.1) t = terms[rand(terms.size)] return ((t=='R') ? rand_in_bounds(-5.0, +5.0) : t) end depth += 1 arg1 = generate_random_program(max, funcs, terms, depth) arg2 = generate_random_program(max, funcs, terms, depth) return [funcs[rand(funcs.size)], arg1, arg2] end def count_nodes(node) return 1 if !node.kind_of?(Array) a1 = count_nodes(node[1]) a2 = count_nodes(node[2]) return a1+a2+1 end def target_function(input) return input**2 + input + 1 end def fitness(program, num_trials=20) sum_error = 0.0 num_trials.times do |i| input = rand_in_bounds(-1.0, 1.0) error = eval_program(program, {'X'=>input}) - target_function(input) sum_error += error.abs end return sum_error / num_trials.to_f end def tournament_selection(pop, bouts) selected = Array.new(bouts){pop[rand(pop.size)]} selected.sort!{|x,y| x[:fitness]<=>y[:fitness]} return selected.first end def replace_node(node, replacement, node_num, cur_node=0) return [replacement,(cur_node+1)] if cur_node == node_num cur_node += 1 return [node,cur_node] if !node.kind_of?(Array) a1, cur_node = replace_node(node[1], replacement, node_num, cur_node) a2, cur_node = replace_node(node[2], replacement, node_num, cur_node) return [[node[0], a1, a2], cur_node] end def copy_program(node) return node if !node.kind_of?(Array) return [node[0], copy_program(node[1]), copy_program(node[2])] end def get_node(node, node_num, current_node=0) return node,(current_node+1) if current_node == node_num current_node += 1 return nil,current_node if !node.kind_of?(Array) a1, current_node = get_node(node[1], node_num, current_node) return a1,current_node if !a1.nil? a2, current_node = get_node(node[2], node_num, current_node) return a2,current_node if !a2.nil? return nil,current_node end def prune(node, max_depth, terms, depth=0) if depth == max_depth-1 t = terms[rand(terms.size)] return ((t=='R') ? rand_in_bounds(-5.0, +5.0) : t) end depth += 1 return node if !node.kind_of?(Array) a1 = prune(node[1], max_depth, terms, depth) a2 = prune(node[2], max_depth, terms, depth) return [node[0], a1, a2] end def crossover(parent1, parent2, max_depth, terms) pt1, pt2 = rand(count_nodes(parent1)-2)+1, rand(count_nodes(parent2)-2)+1 tree1, c1 = get_node(parent1, pt1) tree2, c2 = get_node(parent2, pt2) child1, c1 = replace_node(parent1, copy_program(tree2), pt1) child1 = prune(child1, max_depth, terms) child2, c2 = replace_node(parent2, copy_program(tree1), pt2) child2 = prune(child2, max_depth, terms) return [child1, child2] end def mutation(parent, max_depth, functs, terms) random_tree = generate_random_program(max_depth/2, functs, terms) point = rand(count_nodes(parent)) child, count = replace_node(parent, random_tree, point) child = prune(child, max_depth, terms) return child end def search(max_gens, pop_size, max_depth, bouts, p_repro, p_cross, p_mut, functs, terms) population = Array.new(pop_size) do |i| {:prog=>generate_random_program(max_depth, functs, terms)} end population.each{|c| c[:fitness] = fitness(c[:prog])} best = population.sort{|x,y| x[:fitness] <=> y[:fitness]}.first max_gens.times do |gen| children = [] while children.size < pop_size operation = rand() p1 = tournament_selection(population, bouts) c1 = {} if operation < p_repro c1[:prog] = copy_program(p1[:prog]) elsif operation < p_repro+p_cross p2 = tournament_selection(population, bouts) c2 = {} c1[:prog],c2[:prog] = crossover(p1[:prog], p2[:prog], max_depth, terms) children << c2 elsif operation < p_repro+p_cross+p_mut c1[:prog] = mutation(p1[:prog], max_depth, functs, terms) end children << c1 if children.size < pop_size end children.each{|c| c[:fitness] = fitness(c[:prog])} population = children population.sort!{|x,y| x[:fitness] <=> y[:fitness]} best = population.first if population.first[:fitness] <= best[:fitness] puts " > gen #{gen}, fitness=#{best[:fitness]}" break if best[:fitness] == 0 end return best end if __FILE__ == $0 # problem configuration terms = ['X', 'R'] functs = [:+, :-, :*, :/] # algorithm configuration max_gens = 100 max_depth = 7 pop_size = 100 bouts = 5 p_repro = 0.08 p_cross = 0.90 p_mut = 0.02 # execute the algorithm best = search(max_gens, pop_size, max_depth, bouts, p_repro, p_cross, p_mut, functs, terms) puts "done! Solution: f=#{best[:fitness]}, #{print_program(best[:prog])}" end
An early work by Cramer involved the study of a Genetic Algorithm using an expression tree structure for representing computer programs for primitive mathematical operations [Cramer1985]. Koza is credited with the development of the field of Genetic Programming. An early paper by Koza referred to his hierarchical genetic algorithms as an extension to the simple genetic algorithm that use symbolic expressions (S-expressions) as a representation and were applied to a range of induction-style problems [Koza1989]. The seminal reference for the field is Koza's 1992 book on Genetic Programming [Koza1992].
The field of Genetic Programming is vast, including many books, dedicated conferences and thousands of publications. Koza is generally credited with the development and popularizing of the field, publishing a large number of books and papers himself. Koza provides a practical introduction to the field as a tutorial and provides recent overview of the broader field and usage of the technique [Koza2005].
In addition his the seminal 1992 book, Koza has released three more volumes in the series including volume II on Automatically Defined Functions (ADFs) [Koza1994], volume III that considered the Genetic Programming Problem Solver (GPPS) for automatically defining the function set and program structure for a given problem [Koza1999], and volume IV that focuses on the human competitive results the technique is able to achieve in a routine manner [Koza2003]. All books are rich with targeted and practical demonstration problem instances.
Some additional excellent books include a text by Banzhaf et al. that provides an introduction to the field [Banzhaf1998], Langdon and Poli's detailed look at the technique [Langdon2002], and Poli, Langdon, and McPhee's contemporary and practical field guide to Genetic Programming [Poli2008].
[Angeline1996] | P. J. Angeline, "Two Self-Adaptive Crossover Operators for Genetic Programming", in Advances in Genetic Programming 2, 1996. |
[Banzhaf1998] | W. Banzhaf and P. Nordin and R. E. Keller and F. D. Francone, "Genetic Programming – An Introduction; On the Automatic Evolution\n\tof Computer Programs and its Applications", Morgan Kaufmann, 1998. |
[Cramer1985] | N. L. Cramer, "A Representation for the Adaptive Generation of Simple Sequential\n\tPrograms", in Proceedings of the 1st International Conference on Genetic Algorithms, 1985. |
[Koza1989] | J. R. Koza, "Hierarchical genetic algorithms operating on populations of computer\n\tprograms", in Proceedings of the Eleventh International Joint Conference on Artificial\n\tIntelligence IJCAI-89, 1989. |
[Koza1992] | J. R. Koza, "Genetic programming: On the programming of computers by means of\n\tnatural selection", MIT Press, 1992. |
[Koza1994] | J. R. Koza, "Genetic programming II: Automatic discovery of reusable programs", MIT Press, 1994. |
[Koza1999] | J. R. Koza and F. H. Bennett III and D. Andre and M. A. Keane, "Genetic programming III: Darwinian invention and problem solving", Morgan Kaufmann, 1999. |
[Koza2003] | J. R. Koza and M. A. Keane and M. J. Streeter and W. Mydlowec and\n\tJ. Yu and G. Lanza, "Genetic Programming IV: Routine Human-Competitive Machine Intelligence", Springer, 2003. |
[Koza2005] | J. R. Koza and R. Poli, "5: Genetic Programming", in Search methodologies: Introductory tutorials in optimization and\n\tdecision support techniques, pages 127–164, Springer, 2005. |
[Langdon2002] | W. B. Langdon and R. Poli, "Foundations of Genetic Programming", Springer-Verlag, 2002. |
[Perkis1994] | T. Perkis, "Stack-Based Genetic Programming", in Proc IEEE Congress on Computational Intelligence, 1994. |
[Poli2008] | R. Poli and W. B. Langdon and N. F. McPhee, "A Field Programmers Guide to Genetic Programming", Lulu Enterprises, 2008. |
Please Note: This content was automatically generated from the book content and may contain minor differences.