Perceptron.
The Perceptron algorithm belongs to the field of Artificial Neural Networks and more broadly Computational Intelligence. It is a single layer feedforward neural network (single cell network) that inspired many extensions and variants, including the ADALINE network and the Widrow-Hoff learning rule.
The Perceptron is inspired by the information processing of a single neural cell (called a neuron). A neuron accepts input signals via its dendrites, which pass the electrical signal down to the cell body. The axon carries the signal out to synapses, which are the connections of a cell's axon to other cells' dendrites. In a synapse, the electrical activity is converted into molecular activity (neurotransmitter molecules crossing the synaptic cleft and binding with receptors). The molecular binding develops an electrical signal which is passed on to the connected cell's dendrites.
The information processing objective of the technique is to model a given function by modifying internal weightings of input signals to produce an expected output signal. The system is trained using a supervised learning method, where the error between the system's output and a known expected output is presented to the system and used to modify its internal state. State is maintained in a set of weightings on the input signals. The weights are used to represent an abstraction of the mapping of input vectors to the output signal for the examples that the system was exposed to during training.
The Perceptron consists of a data structure (weights) and separate procedures for training and applying the structure. The structure is really just a vector of weights (one for each expected input) and a bias term.
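As a concrete sketch (not part of the original text), this structure can be held in a plain Ruby array, with one weight per input and the bias weight stored as the last element, matching the convention of the listing later in this section; the values shown are hypothetical:

# Hypothetical weight vector for a Perceptron with two inputs.
# weights[0] and weights[1] weight the inputs; weights[2] is the bias weight.
weights = [0.4, -0.2, 0.1]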
The algorithm (below) provides pseudocode for training the Perceptron. A weight is initialized for each input, plus an additional weight for a fixed bias constant input that is almost always set to 1.0. The activation of the network for a given input pattern is calculated as follows:
$activation \leftarrow \sum_{k=1}^{n}\big( w_{k} \times x_{ki}\big) + w_{bias} \times 1.0$

where $n$ is the number of weights and inputs, $x_{ki}$ is the $k^{th}$ attribute on the $i^{th}$ input pattern, and $w_{bias}$ is the bias weight. The weights are updated as follows:
$w_{i}(t+1) = w_{i}(t) + \alpha \times (e(t)-a(t)) \times x_{i}(t)$

where $w_i$ is the $i^{th}$ weight at time $t$ and $t+1$, $\alpha$ is the learning rate, $e(t)$ and $a(t)$ are the expected and actual output at time $t$, and $x_i$ is the $i^{th}$ input. This update process is applied to each weight in turn (as well as the bias weight with its constant input).
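As an illustrative worked step (not from the original text, with hypothetical values), assume two inputs, a learning rate of $\alpha = 0.1$, weights $w = (0.2, -0.4)$, bias weight $0.05$, and input pattern $x = (1, 0)$. The activation is $0.2 \times 1 + (-0.4) \times 0 + 0.05 \times 1.0 = 0.25$, which thresholded at zero (as in the listing below) gives an actual output of 1. If the expected output is 0, each weight is adjusted by $0.1 \times (0 - 1) \times x_i$: $w_1$ moves to $0.1$, the bias weight moves to $-0.05$, and $w_2$ is unchanged because its input was 0.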
Input: ProblemSize, InputPatterns, $iterations_{max}$, $learn_{rate}$
Output: Weights
Weights $\leftarrow$ InitializeWeights(ProblemSize)
For ($i=1$ To $iterations_{max}$)
    $Pattern_i$ $\leftarrow$ SelectInputPattern(InputPatterns)
    $Activation_i$ $\leftarrow$ ActivateNetwork($Pattern_i$, Weights)
    $Output_i$ $\leftarrow$ TransferActivation($Activation_i$)
    UpdateWeights($Pattern_i$, $Output_i$, $learn_{rate}$)
End
Return (Weights)

Listing (below) provides an example of the Perceptron algorithm implemented in the Ruby Programming Language. The problem is the classical boolean OR problem, where the inputs of the boolean truth table are provided as the two inputs and the result of the boolean OR operation is expected as output.
The algorithm was implemented using an online learning method, meaning the weights are updated after each input pattern is observed. A step transfer function is used to convert the activation into a binary output $\in \{0,1\}$. The implementation iterates over all patterns in the truth table each epoch to train the weights, and the same patterns are then used to demonstrate what the network has learned. A bias weight is used for stability with a constant input of 1.0.
def random_vector(minmax)
  return Array.new(minmax.size) do |i|
    minmax[i][0] + ((minmax[i][1] - minmax[i][0]) * rand())
  end
end

def initialize_weights(problem_size)
  minmax = Array.new(problem_size + 1) {[-1.0,1.0]}
  return random_vector(minmax)
end

def update_weights(num_inputs, weights, input, out_exp, out_act, l_rate)
  num_inputs.times do |i|
    weights[i] += l_rate * (out_exp - out_act) * input[i]
  end
  weights[num_inputs] += l_rate * (out_exp - out_act) * 1.0
end

def activate(weights, vector)
  sum = weights[weights.size-1] * 1.0
  vector.each_with_index do |input, i|
    sum += weights[i] * input
  end
  return sum
end

def transfer(activation)
  return (activation >= 0) ? 1.0 : 0.0
end

def get_output(weights, vector)
  activation = activate(weights, vector)
  return transfer(activation)
end

def train_weights(weights, domain, num_inputs, iterations, lrate)
  iterations.times do |epoch|
    error = 0.0
    domain.each do |pattern|
      input = Array.new(num_inputs) {|k| pattern[k].to_f}
      output = get_output(weights, input)
      expected = pattern.last.to_f
      error += (output - expected).abs
      update_weights(num_inputs, weights, input, expected, output, lrate)
    end
    puts "> epoch=#{epoch}, error=#{error}"
  end
end

def test_weights(weights, domain, num_inputs)
  correct = 0
  domain.each do |pattern|
    input_vector = Array.new(num_inputs) {|k| pattern[k].to_f}
    output = get_output(weights, input_vector)
    correct += 1 if output.round == pattern.last
  end
  puts "Finished test with a score of #{correct}/#{domain.size}"
  return correct
end

def execute(domain, num_inputs, iterations, learning_rate)
  weights = initialize_weights(num_inputs)
  train_weights(weights, domain, num_inputs, iterations, learning_rate)
  test_weights(weights, domain, num_inputs)
  return weights
end

if __FILE__ == $0
  # problem configuration
  or_problem = [[0,0,0], [0,1,1], [1,0,1], [1,1,1]]
  inputs = 2
  # algorithm configuration
  iterations = 20
  learning_rate = 0.1
  # execute the algorithm
  execute(or_problem, inputs, iterations, learning_rate)
end
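As a usage sketch (not part of the original listing), the same execute function can be pointed at any other linearly separable truth table, for example boolean AND; the pattern format and parameter values below are assumptions for illustration, reusing the OR example's conventions:

# Hypothetical usage: reuse execute() from the listing on the boolean AND problem.
# Each pattern is [input1, input2, expected_output], as in the OR example above.
and_problem = [[0,0,0], [0,1,0], [1,0,0], [1,1,1]]
execute(and_problem, 2, 20, 0.1)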
The Perceptron algorithm was proposed by Rosenblatt in 1958 [Rosenblatt1958]. Rosenblatt proposed a range of neural network structures and methods. The 'Perceptron' as it is known is in fact a simplification of Rosenblatt's models by Minsky and Papert for the purposes of analysis [Minsky1969]. An early proof of convergence was provided by Novikoff [Novikoff1962].
Minsky and Papert wrote the classic 1969 text "Perceptrons", which is widely credited with discrediting the approach by highlighting its limitation to linear discrimination, and which reduced research in the area for decades afterward [Minsky1969].
[Minsky1969] | M. L. Minsky and S. A. Papert, "Perceptrons", MIT Press, 1969. |
[Novikoff1962] | A. B. Novikoff, "On convergence proofs on perceptrons", Symposium on the Mathematical Theory of Automata, 1962. |
[Rosenblatt1958] | F. Rosenblatt, "The Perceptron: A Probabilistic Model for Information Storage and Organization in the Brain", Psychological Review, 1958. |