Neural Networks: Processing, History, and Learning Methods

Definition of Neuron

The elementary processing unit of a neural network: it produces an output by applying an activation function to the weighted sum of its inputs. It is a simple device: minimal storage (only its weights) and small computing capacity (weighted sums and the output function).

How a Neuron Processes Information

Weighted Sum: The neuron integrates all its inputs to compute its net input, expressed as the sum of the products of each input by the strength of that input's connection (its weight).

Activation Function: A function applied to the result of the neuron's weighted sum to determine its output. Typical activation functions are the step function, the sign function, the sigmoid, and the hyperbolic tangent.
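The two steps above can be sketched in a few lines of Python (a minimal sketch; the function names are illustrative, not from the source):

```python
import math

def neuron_output(inputs, weights, activation):
    """Weighted sum of inputs followed by an activation function."""
    net = sum(x * w for x, w in zip(inputs, weights))  # net input
    return activation(net)

# The activation functions mentioned in the text.
step = lambda net: 1.0 if net >= 0 else 0.0          # step (Heaviside)
sign = lambda net: 1.0 if net >= 0 else -1.0         # sign (bipolar)
sigmoid = lambda net: 1.0 / (1.0 + math.exp(-net))   # logistic sigmoid
tanh = math.tanh                                     # hyperbolic tangent

out = neuron_output([0.5, -1.0, 2.0], [0.8, 0.2, 0.1], sigmoid)
```

The same `neuron_output` works with any of the four activations, which is the point of separating the weighted sum from the output function.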

History of Artificial Neural Networks (ANN)

Early Work: 1940s

McCulloch & Pitts: formal definition of the neuron. Hebb: synaptic learning rule (Hebbian learning).

Beginnings: 1960s

Rosenblatt: two-layer Perceptron (modeling the retina). McCulloch-Pitts neuron. Hebbian learning. It was already suspected that it is only suitable for linearly separable problems.

Minsky and Papert: Perceptron convergence theorem: if the training patterns are linearly separable, learning converges in a finite number of steps; if they are not, the perceptron cannot learn them. Research in neural networks paused.

Transition

1970s. Von der Malsburg: the concept of competitive learning, on which Kohonen later builds. Hopfield: end of the dark ages. Hopfield network: a symmetric autoassociative memory network.

Renaissance: 1980s

Kohonen developed his self-organizing map (SOM) model based on the work of von der Malsburg. Rumelhart, Hinton, and Williams published the generalized delta rule and applied it to an improved Rosenblatt perceptron: the Multilayer Perceptron.

Concept of Classification and Clustering

Definition of a classification method: divide the N-dimensional space to which the vectors belong (defined by the N characteristics of each individual in the population) into K exclusive regions corresponding to the K possible groups.

Clustering: Find classes when the classes themselves are unknown.

Classification: Categorize an element within known, existing classes.

Definition of SOM

Feedforward neural network. Nonlinear. Trained. Unsupervised learning. It consists of: a matrix of neurons that receives inputs from a population of elements and evaluates a simple discriminant function on them; a mechanism that compares the discriminant functions and selects the unit with the largest value; some local interaction that simultaneously activates neighboring units; and an adaptive process whereby the parameters of the active units increase their discriminant functions with respect to the current input.

Characteristics of a Kohonen Network

The system is organized in 2 interconnected layers (not a full hierarchy): an input layer for the input vector and an output (Kohonen) layer, each neuron of which is connected to all inputs. They work by competition: a mechanism by which only one neuron is activated at a time; neurons compete to see which one best recognizes the input pattern. Topological order is preserved in two dimensions: similar n-dimensional patterns are recognized by neurons that lie close together in the two-dimensional map.

Processing of a Kohonen Network

This is a feedforward, competitive process in three stages:

Propagation of the input: each neuron in the input layer is assigned the value of the corresponding component of the input vector.

Calculation of distance: the value of each input-layer neuron is multiplied by the weight linking it to each neuron in the Kohonen layer and the products are summed (a dot product). Alternatively, the Euclidean distance between the input vector and each weight vector is calculated.

Lateral inhibition or competition: only the neuron with the highest dot product (equivalently, the one closest to the input vector) remains active and inhibits the others.
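The competition stage can be sketched as follows, assuming the Euclidean-distance variant (the function name is an illustrative choice):

```python
import math

def best_matching_unit(x, weight_rows):
    """Return the index of the Kohonen neuron whose weight vector is
    closest (Euclidean distance) to input x: the winner of the competition."""
    dists = [math.dist(x, w) for w in weight_rows]
    return dists.index(min(dists))

winner = best_matching_unit([0.9, 1.1], [[0, 0], [1, 1], [0, 1]])
```

With normalized weights and patterns, maximizing the dot product and minimizing the Euclidean distance select the same winning unit, which is why the text offers the two computations as alternatives.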

Learning Process of Kohonen Network

Initially, the weight matrix has random values. During training, the weight vectors of the neurons in the Kohonen layer adapt to the input patterns: the weight vector of the neuron closest to the input pattern moves toward it according to the learning law.

Learning law of a Kohonen map: adjusting the weight matrix

From a mathematical point of view, the density function of the weight vectors tends to approximate the probability density function of the input vectors.
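One step of the learning law can be sketched like this, assuming a 1-D neighborhood of fixed radius and a fixed learning coefficient (the names and the neighborhood shape are illustrative assumptions, not from the source):

```python
def som_update(weights, x, winner, alpha, radius):
    """One Kohonen learning step: move the winner and its neighbors
    (a 1-D neighborhood of the given radius) toward the input x."""
    for j, w in enumerate(weights):
        if abs(j - winner) <= radius:  # inside the neighborhood
            # w(t+1) = w(t) + alpha * (x - w(t))
            weights[j] = [wi + alpha * (xi - wi) for wi, xi in zip(w, x)]
    return weights

weights = som_update([[0.0, 0.0], [1.0, 1.0]], [1.0, 0.0],
                     winner=0, alpha=0.5, radius=0)
```

In a full training loop both `alpha` and `radius` would decrease over time, which is exactly the "time-varying parameters" point made below.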

Preservation of spatial order in self-organizing maps

Achieved through the concept of neighborhood: learning extends to adjacent neurons by a neighborhood value that decreases over time. This produces topological maps in which each neuron specializes in recognizing one type of pattern.

Time-varying parameters in the SOM

The learning coefficient and the neighborhood radius.

Why is there no danger of overtraining a Kohonen network?

Because the Kohonen network approaches a final steady state: as the learning rate decreases, the topological mechanism settles on a compromise point between the patterns it recognizes, so additional training does not degrade the map.

Building an Unsupervised classifier based on a Kohonen map

In three steps:

Experimental conditions

Presentation of samples in random order, a random initial weight matrix, weights and patterns normalized to the Euclidean norm, and a sufficient amount of data.

Selection parameters

Size of the Kohonen layer, number of presentations, learning rate, the form of the decreasing function η(t), the initial neighborhood, etc.

Analysis of the classifier dynamics

Learning process. Classification accuracy.

Hopfield Network Definition (Associative Memory)

Feedback (recurrent) neural network. Nonlinear. Constructed (its weights are computed, not trained iteratively). Unsupervised. N neurons interconnected with each other that act as both input and output. Bipolar neurons with the McCulloch-Pitts sign activation function. Bidirectional connections with a symmetric weight matrix with zero diagonal.

Characteristics of a Hopfield network

The network stores a set of input vectors through learning and can recover one of them from a partial or distorted pattern. Associative memory access is governed by an energy function of the network: the network converges to a local minimum of that energy. There is a maximum memory capacity.

Processing in a Hopfield network

Iteration: the outputs are used as inputs in the next round, so the state of a neuron depends on its previous values. Hopfield showed that when the weight matrix is symmetric, the network evolves over time to a stable state in which the neuron states no longer change: it converges to the stored vector closest to the initial state.

Training a Hopfield network

Training consists of building the weight matrix so that certain states of the network are stable.
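A minimal pure-Python sketch of the two operations above: building the symmetric, zero-diagonal weight matrix by the Hebb rule, and recalling a stored bipolar pattern by iterating until a stable state. The synchronous update and the function names are illustrative choices, not prescribed by the source:

```python
def hopfield_weights(patterns):
    """Hebb-rule weight matrix: W[i][j] = sum over patterns of p[i]*p[j],
    symmetric with zero diagonal."""
    n = len(patterns[0])
    W = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    W[i][j] += p[i] * p[j]
    return W

def hopfield_recall(W, state, steps=10):
    """Iterate state <- sign(W @ state) until it stops changing."""
    s = list(state)
    for _ in range(steps):
        new = [1 if sum(wij * sj for wij, sj in zip(row, s)) >= 0 else -1
               for row in W]
        if new == s:  # stable state reached
            break
        s = new
    return s

W = hopfield_weights([[1, -1, 1, -1]])
recovered = hopfield_recall(W, [1, 1, 1, -1])  # one bit distorted
```

Starting from the distorted pattern, the iteration converges to the stored vector closest to the initial state, which is the associative-memory behavior described above.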

History of the PMC (Multilayer Perceptron)

  • First linear model proposed by Rosenblatt: 2 layers, binary neurons, a linear activation function, and delta-rule learning.
  • Perceptron convergence theorem: if the training patterns are linearly separable, learning converges in a finite number of steps.
  • Rumelhart, Hinton, and Williams published the backpropagation learning algorithm (Generalized Delta Rule) and applied it to an improvement of Rosenblatt's perceptron: the Multilayer Perceptron.

Definition of Feedforward Multilayer Perceptron

Neural network. Nonlinear. Trained. Supervised learning; an improvement of Rosenblatt's Perceptron. It consists of at least three layers (input, output, and one or more intermediate hidden layers). Each layer is fully connected to the next through a weight matrix. The weights are adjusted during training.

Definition of Supervised Learning

The training process in which the weight matrices are changed according to a learning law so that the actual output matches the expected one. Training: an iterative process that requires a set of training data for which the expected output is known, broad enough to cover the sample space. Learning law: the mathematical formulation of the weight-matrix adjustment, based on the input/output pairs being trained, that reduces the error. Objective: minimize the error on the training set.

How do I learn a PMC?

Learning consists of adjusting the weight matrices of the network so that the desired output is obtained from the input. The adjustment is performed by backpropagating the error obtained for each training example and using it to adjust each weight matrix with the Generalized Delta Rule: first the weight matrix connecting the hidden layer to the output layer is updated, then the weight matrix connecting the input layer to the hidden layer. The whole network learns as a global process.

Learning process of the PMC

The learning process of a multilayer perceptron has three phases:

Propagation of the inputs to the outputs:

* Present the p-th input vector xp to the input layer.
* Propagate the input vector to the hidden layer.
* Calculate the outputs of the hidden-layer neurons with the sigmoid function.
* Repeat with the output layer to calculate its net input Ik and its output Ok, which is the output of the network.

Calculation of the error:

The actual output for the input vector xp is compared with the expected one by computing the mean squared error over all neurons in the output layer.

Correction of the weight matrices by propagating the error from the output layer to the input:

* Update the weight matrix connecting the hidden layer to the output layer.
* Proceed as in the previous step for the matrix connecting the input layer to the hidden layer, except that, since the expected output of the hidden layer is not known in advance, a mathematical transformation of the error generated by the output layer is used, backpropagating it to the hidden layer.

Learning law: Generalized Delta Rule

The Generalized Delta Rule is the iterative algorithm that adjusts the weights of the network, trying to minimize the mean squared error between the expected outputs and those produced by the network. The new weight matrix equals the current weight matrix plus a variation of the matrix, which decomposes into:

* an error-correction term, and
* a momentum (inertia) term.

Memorization / Generalization / Overtraining

Insufficient or excessive training prevents the network from generalizing. Overtraining is the loss of the network's ability to generalize, replacing generalization with memorization of the training set.

Cross-Validation: jackknife method

A validation dataset is separated from the training dataset and used to monitor the error. To avoid overtraining, the learning process must stop before the network begins to memorize instead of generalize: training stops when the error on the validation set stops decreasing.

Characteristics of the training set

The training set determines what the network learns:

* 2 or 3 times the number of weights.
* It must cover the whole response space.
* Conveniently selected.
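The three phases can be sketched as a minimal one-hidden-layer network with sigmoid units and the generalized delta rule (without the momentum term; function names, sizes, and the learning rate are illustrative assumptions):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(x, W1, W2):
    """Phase 1: propagate the input through hidden and output layers."""
    h = [sigmoid(sum(wi * xi for wi, xi in zip(row, x))) for row in W1]
    o = [sigmoid(sum(wi * hi for wi, hi in zip(row, h))) for row in W2]
    return h, o

def train_step(x, target, W1, W2, eta=0.5):
    """Phases 2 and 3: compute the error, then backpropagate it to
    update the output-layer weights first and the hidden-layer weights next."""
    h, o = forward(x, W1, W2)
    # Output-layer error terms: delta_k = (t_k - o_k) * o_k * (1 - o_k)
    d_out = [(t - ok) * ok * (1 - ok) for t, ok in zip(target, o)]
    # Hidden-layer error terms: a transformation of the output error,
    # backpropagated through the (still unmodified) W2.
    d_hid = [hj * (1 - hj) * sum(dk * W2[k][j] for k, dk in enumerate(d_out))
             for j, hj in enumerate(h)]
    for k, dk in enumerate(d_out):  # hidden -> output weights
        W2[k] = [w + eta * dk * hj for w, hj in zip(W2[k], h)]
    for j, dj in enumerate(d_hid):  # input -> hidden weights
        W1[j] = [w + eta * dj * xi for w, xi in zip(W1[j], x)]
    return sum((t - ok) ** 2 for t, ok in zip(target, o))

W1 = [[0.1, 0.2], [0.3, -0.1]]
W2 = [[0.2, -0.3]]
err_first = train_step([1.0, 0.0], [1.0], W1, W2)
for _ in range(50):
    err_last = train_step([1.0, 0.0], [1.0], W1, W2)
```

Repeating `train_step` on the same example drives the squared error down, which is the minimization the Generalized Delta Rule performs; a momentum term would add a fraction of the previous weight change to each update.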