Bayesian Optimization Algorithm, BOA.
The Bayesian Optimization Algorithm belongs to the field of Estimation of Distribution Algorithms, also referred to as Population Model-Building Genetic Algorithms (PMBGA), an extension to the field of Evolutionary Computation. More broadly, BOA belongs to the field of Computational Intelligence. The Bayesian Optimization Algorithm is related to other Estimation of Distribution Algorithms such as the Population-Based Incremental Learning algorithm and the Univariate Marginal Distribution Algorithm. It is also the basis for extensions such as the Hierarchical Bayesian Optimization Algorithm (hBOA) and the Incremental Bayesian Optimization Algorithm (iBOA).
The Bayesian Optimization Algorithm is a technique without an inspiration. It is related to the Genetic Algorithm and other Evolutionary Algorithms that are inspired by the biological theory of evolution by means of natural selection.
The information processing objective of the technique is to construct a probabilistic model that describes the relationships between the components of fit solutions in the problem space. This is achieved by repeating the process of creating and sampling from a Bayesian network that captures the conditional dependencies, independencies, and conditional probabilities between the components of a solution. The network is constructed from the relative frequencies of the components within a population of high fitness candidate solutions. Once the network is constructed, the candidate solutions are discarded and a new population of candidate solutions is generated from the model. The process is repeated until the model converges on a fit prototype solution.
Algorithm (below) provides a pseudocode listing of the Bayesian Optimization Algorithm for minimizing a cost function. The Bayesian network is constructed each iteration using a greedy algorithm. The network is assessed based on how well it fits the information in the population of candidate solutions, using either a Bayesian Dirichlet Metric (BD) [Pelikan1999a] or a Bayesian Information Criterion (BIC). Refer to Chapter 3 of Pelikan's book for a more detailed presentation of the pseudocode for BOA [Pelikan2005].
Input: $Bits_{num}$, $Population_{size}$, $Selection_{size}$
Output: $S_{best}$
Population $\leftarrow$ InitializePopulation($Bits_{num}$, $Population_{size}$)
EvaluatePopulation(Population)
$S_{best}$ $\leftarrow$ GetBestSolution(Population)
While ($\neg$ StopCondition())
    Selected $\leftarrow$ SelectFitSolutions(Population, $Selection_{size}$)
    Model $\leftarrow$ ConstructBayesianNetwork(Selected)
    Offspring $\leftarrow$ $\emptyset$
    For ($i$ To $Population_{size}$)
        Offspring $\leftarrow$ ProbabilisticallyConstructSolution(Model)
    End
    EvaluatePopulation(Offspring)
    $S_{best}$ $\leftarrow$ GetBestSolution(Offspring)
    Population $\leftarrow$ Combine(Population, Offspring)
End
Return ($S_{best}$)

Listing (below) provides an example of the Bayesian Optimization Algorithm implemented in the Ruby Programming Language. The demonstration problem is a maximizing binary optimization problem called OneMax that seeks a binary string of unity (all '1' bits). The objective function provides only an indication of the number of correct bits in a candidate string, not the positions of the correct bits.
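For example, under this objective a hypothetical candidate such as 1,0,1,1,0 scores 3, and the string 0,1,1,0,1 scores the same 3 even though different positions are set; the score alone cannot distinguish the two.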
The Bayesian Optimization Algorithm can be tricky to implement given the use of a Bayesian Network at the core of the technique. The implementation of BOA provided is based on the C++ implementation provided by Pelikan, version 1.0 [Pelikan1999b]. Specifically, the implementation uses the K2 metric to construct a Bayesian network from a population of candidate solutions [Cooper1992]. The network is built by a greedy algorithm that starts with an empty graph and, each iteration, adds the arc that gives the largest gain in the metric, until a maximum number of edges have been added or no further edges improve the score. The result is a directed acyclic graph. The construction process imposes limits, such as the maximum number of edges and the maximum number of in-bound connections per node.
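The score maximized by this greedy construction is the K2 metric. For a binary variable $X_i$ with parent set $\Pi_i$, the per-node score computed by the k2equation function in the listing below reduces to a product over the $2^{|\Pi_i|}$ parent configurations $j$:

$K2(X_i, \Pi_i) = \prod_{j} \frac{a_{j1}! \, a_{j0}!}{(a_{j1} + a_{j0} + 1)!}$

where $a_{j1}$ and $a_{j0}$ are the numbers of selected solutions with parent configuration $j$ in which bit $i$ is 1 and 0 respectively.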
New solutions are sampled from the graph by first topologically ordering the nodes (so that bits can be generated after the bits they depend on), then probabilistically sampling each bit based on the conditional probabilities encoded in the graph. The algorithm used to sample from the network is Probabilistic Logic Sampling [Henrion1988]. The search stops either when the best possible solution for the problem is found or when the population converges to a single bit pattern.
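As a worked example (with hypothetical counts, not figures from the original sources): if a node has a single parent and, among the selected solutions in which the parent bit is 1, nine have the node's bit set and three do not, then whenever the parent is sampled as 1 the node is sampled as 1 with probability $9/(9+3) = 0.75$. A node with no parents is simply sampled using its marginal frequency in the selected population. This is the calculation performed by the calculate_probability function in the listing below.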
Given that the implementation was written for clarity, it is slow to execute and provides a great opportunity for improvements and efficiencies; one such improvement is sketched after the listing below.
# objective function: the number of '1' bits in a candidate bitstring
def onemax(vector)
  return vector.inject(0){|sum, value| sum + value}
end

# generate a random bitstring of the given length
def random_bitstring(size)
  return Array.new(size){ ((rand()<0.5) ? 1 : 0) }
end

# test whether a directed path already exists from node i to node j
def path_exists?(i, j, graph)
  visited, stack = [], [i]
  while !stack.empty?
    return true if stack.include?(j)
    k = stack.shift
    next if visited.include?(k)
    visited << k
    graph[k][:out].each {|m| stack.unshift(m) if !visited.include?(m)}
  end
  return false
end

# an edge i->j may be added if it is not already present and would not create a cycle
def can_add_edge?(i, j, graph)
  return !graph[i][:out].include?(j) && !path_exists?(j, i, graph)
end

# all nodes to which an edge from 'node' could still be added
def get_viable_parents(node, graph)
  viable = []
  graph.size.times do |i|
    if node!=i and can_add_edge?(node, i, graph)
      viable << i
    end
  end
  return viable
end

# count how often each joint configuration of the indexed bits occurs in the population
def compute_count_for_edges(pop, indexes)
  counts = Array.new(2**(indexes.size)){0}
  pop.each do |p|
    index = 0
    indexes.reverse.each_with_index do |v,i|
      index += ((p[:bitstring][v] == 1) ? 1 : 0) * (2**i)
    end
    counts[index] += 1
  end
  return counts
end

# factorial
def fact(v)
  return v <= 1 ? 1 : v*fact(v-1)
end

# K2 metric for a node given a set of candidate parents [Cooper1992]
def k2equation(node, candidates, pop)
  counts = compute_count_for_edges(pop, [node]+candidates)
  total = nil
  (counts.size/2).times do |i|
    a1, a2 = counts[i*2], counts[(i*2)+1]
    rs = (1.0/fact((a1+a2)+1).to_f) * fact(a1).to_f * fact(a2).to_f
    total = (total.nil? ? rs : total*rs)
  end
  return total
end

# score the addition of each viable candidate edge for a node (bounded in-bound edges)
def compute_gains(node, graph, pop, max=2)
  viable = get_viable_parents(node[:num], graph)
  gains = Array.new(graph.size) {-1.0}
  gains.each_index do |i|
    if graph[i][:in].size < max and viable.include?(i)
      gains[i] = k2equation(node[:num], node[:in]+[i], pop)
    end
  end
  return gains
end

# greedily build a directed acyclic graph by adding the best-scoring edge each round
def construct_network(pop, prob_size, max_edges=3*pop.size)
  graph = Array.new(prob_size) {|i| {:out=>[], :in=>[], :num=>i} }
  gains = Array.new(prob_size)
  max_edges.times do
    max, from, to = -1, nil, nil
    graph.each_with_index do |node, i|
      gains[i] = compute_gains(node, graph, pop)
      gains[i].each_with_index {|v,j| from,to,max = i,j,v if v>max}
    end
    break if max <= 0.0
    graph[from][:out] << to
    graph[to][:in] << from
  end
  return graph
end

# order the nodes so that each node appears after all of its in-bound neighbours
def topological_ordering(graph)
  graph.each {|n| n[:count] = n[:in].size}
  ordered, stack = [], graph.select {|n| n[:count]==0}
  while ordered.size < graph.size
    current = stack.shift
    current[:out].each do |edge|
      node = graph.find {|n| n[:num]==edge}
      node[:count] -= 1
      stack << node if node[:count] <= 0
    end
    ordered << current
  end
  return ordered
end

# frequency of a '1' at position i across the population
def marginal_probability(i, pop)
  return pop.inject(0.0){|s,x| s + x[:bitstring][i]} / pop.size.to_f
end

# conditional probability of a '1' for a node given the already-sampled bits it depends on
def calculate_probability(node, bitstring, graph, pop)
  return marginal_probability(node[:num], pop) if node[:in].empty?
  counts = compute_count_for_edges(pop, [node[:num]]+node[:in])
  index = 0
  node[:in].reverse.each_with_index do |v,i|
    index += ((bitstring[v] == 1) ? 1 : 0) * (2**i)
  end
  i1 = index + (1*2**(node[:in].size))
  i2 = index + (0*2**(node[:in].size))
  a1, a2 = counts[i1].to_f, counts[i2].to_f
  return a1/(a1+a2)
end

# Probabilistic Logic Sampling: sample each bit in topological order [Henrion1988]
def probabilistic_logic_sample(graph, pop)
  bitstring = Array.new(graph.size)
  graph.each do |node|
    prob = calculate_probability(node, bitstring, graph, pop)
    bitstring[node[:num]] = ((rand() < prob) ? 1 : 0)
  end
  return {:bitstring=>bitstring}
end

# draw num_samples new candidate solutions from the network
def sample_from_network(pop, graph, num_samples)
  ordered = topological_ordering(graph)
  samples = Array.new(num_samples) do
    probabilistic_logic_sample(ordered, pop)
  end
  return samples
end

# main BOA loop: select fit solutions, build a network, sample offspring, replace
def search(num_bits, max_iter, pop_size, select_size, num_children)
  pop = Array.new(pop_size) { {:bitstring=>random_bitstring(num_bits)} }
  pop.each{|c| c[:cost] = onemax(c[:bitstring])}
  best = pop.sort!{|x,y| y[:cost] <=> x[:cost]}.first
  max_iter.times do |it|
    selected = pop.first(select_size)
    network = construct_network(selected, num_bits)
    arcs = network.inject(0){|s,x| s+x[:out].size}
    children = sample_from_network(selected, network, num_children)
    children.each{|c| c[:cost] = onemax(c[:bitstring])}
    children.each {|c| puts " >>sample, f=#{c[:cost]} #{c[:bitstring]}"}
    pop = pop[0...(pop_size-select_size)] + children
    pop.sort! {|x,y| y[:cost] <=> x[:cost]}
    best = pop.first if pop.first[:cost] >= best[:cost]
    puts " >it=#{it}, arcs=#{arcs}, f=#{best[:cost]}, [#{best[:bitstring]}]"
    converged = pop.select {|x| x[:bitstring]!=pop.first[:bitstring]}.empty?
    break if converged or best[:cost]==num_bits
  end
  return best
end

if __FILE__ == $0
  # problem configuration
  num_bits = 20
  # algorithm configuration
  max_iter = 100
  pop_size = 50
  select_size = 15
  num_children = 25
  # execute the algorithm
  best = search(num_bits, max_iter, pop_size, select_size, num_children)
  puts "done! Solution: f=#{best[:cost]}/#{num_bits}, s=#{best[:bitstring]}"
end
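As one example of the efficiencies mentioned above (an assumption on my part, not a change taken from Pelikan's reference code), the recursive fact function can be memoized so that each factorial is computed at most once per run; fact is called repeatedly inside k2equation, which dominates the cost of construct_network. A minimal drop-in sketch:

# Hypothetical speed-up: cache factorial values rather than recomputing them
# recursively on every call from k2equation. The returned values are unchanged.
FACT_CACHE = Hash.new {|cache, v| cache[v] = (v <= 1) ? 1 : v * cache[v - 1]}

def fact(v)
  return FACT_CACHE[v]
end

For larger populations the factorials quickly become very large, so another option would be to evaluate the metric in log space (for example with Ruby's Math.lgamma), although that would also require adjusting the -1.0 sentinel and the max <= 0.0 break used in construct_network.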
The Bayesian Optimization Algorithm was proposed by Pelikan, Goldberg, and Cantú-Paz in a technical report [Pelikan1998a] that was later published [Pelikan2002]. The technique was proposed as an extension to the state of the art in Estimation of Distribution Algorithms (such as the Univariate Marginal Distribution Algorithm and the Bivariate Marginal Distribution Algorithm), using a Bayesian Network to model the relationships and conditional probabilities between the components expressed in a population of fit candidate solutions. Pelikan, Goldberg, and Cantú-Paz also described the approach applied to deceptive binary optimization problems (trap functions) in a paper that was published before the seminal journal article [Pelikan1999a].
Pelikan and Goldberg described an extension to the approach called the Hierarchical Bayesian Optimization Algorithm (hBOA) [Pelikan2000] [Pelikan2001b]. The differences in hBOA are that it replaces the decision tables (used to store the probabilities) with decision graphs, and that it uses a niching method called Restricted Tournament Replacement to maintain diversity in the selected set of candidate solutions used to construct the network models. Pelikan's work on BOA culminated in his PhD thesis, which provides a detailed treatment of the approach, its configuration, and its application [Pelikan2002a]. Pelikan, Sastry, and Goldberg proposed the Incremental Bayesian Optimization Algorithm (iBOA) extension, which removes the population and instead applies incremental updates to the Bayesian network [Pelikan2008].
Pelikan published a book focused on the technique, walking through the development of probabilistic algorithms inspired by evolutionary computation, a detailed look at the Bayesian Optimization Algorithm (Chapter 3), the hierarchical extension (hBOA), and demonstration studies of the approach on test problems [Pelikan2005].
[Cooper1992] G. F. Cooper and E. Herskovits, "A Bayesian Method for the Induction of Probabilistic Networks from Data", Machine Learning, 1992.
[Henrion1988] M. Henrion, "Propagation of Uncertainty by Probabilistic Logic Sampling in Bayes Networks", in Uncertainty in Artificial Intelligence 2, pages 149–163, Elsevier Science Publishing Company, Inc., 1988.
[Pelikan1998a] M. Pelikan, D. E. Goldberg, and E. Cantú-Paz, "Linkage Problem, Distribution Estimation, and Bayesian Networks", Technical Report IlliGAL Report No. 98013, Illinois Genetic Algorithms Laboratory, University of Illinois at Urbana-Champaign, 1998.
[Pelikan1999a] M. Pelikan, D. E. Goldberg, and E. Cantú-Paz, "BOA: The Bayesian Optimization Algorithm", in Proceedings of the Genetic and Evolutionary Computation Conference (GECCO-99), 1999.
[Pelikan1999b] M. Pelikan, "A Simple Implementation of the Bayesian Optimization Algorithm (BOA) in C++ (version 1.0)", Technical Report IlliGAL Report No. 99011, Illinois Genetic Algorithms Laboratory, University of Illinois at Urbana-Champaign, Urbana, IL, 1999.
[Pelikan2000] M. Pelikan and D. E. Goldberg, "Hierarchical Problem Solving and the Bayesian Optimization Algorithm", in Genetic and Evolutionary Computation Conference 2000 (GECCO-2000), 2000.
[Pelikan2001b] M. Pelikan and D. E. Goldberg, "Escaping Hierarchical Traps with Competent Genetic Algorithms", in Proceedings of the Genetic and Evolutionary Computation Conference (GECCO-2001), 2001.
[Pelikan2002] M. Pelikan, D. E. Goldberg, and E. Cantú-Paz, "Linkage Problem, Distribution Estimation, and Bayesian Networks", Evolutionary Computation, 2002.
[Pelikan2002a] M. Pelikan, "Bayesian Optimization Algorithm: From Single Level to Hierarchy", PhD Thesis, University of Illinois at Urbana-Champaign, 2002.
[Pelikan2005] M. Pelikan, "Hierarchical Bayesian Optimization Algorithm: Toward a New Generation of Evolutionary Algorithms", Springer, 2005.
[Pelikan2008] M. Pelikan, K. Sastry, and D. E. Goldberg, "iBOA: The Incremental Bayesian Optimization Algorithm", in Proceedings of the Genetic and Evolutionary Computation Conference (GECCO-2008), 2008.