6.1 ABSTRACT
6.2 INTRODUCTION
6.3 STOCHASTIC TYPE MULTI LAYER ARTIFICIAL NEURAL NETWORK
6.4 Proposed algorithm
6.5 CHOICE OF HIGHER FUNCTIONAL BLOCKS
6.6 Pretrained MLP (MultiLayer Perceptrons) block
6.7 Look up table
6.8 Time delay neuron
6.9 Functional neuron
6.10 PRUNING ALGORITHM
6.11 RESULTS
6.12 CONCLUSION
6.13 REFERENCES
6.14 PUBLISHED
A large, randomly connected feed-forward neural network (FFNN) is designed to
solve a class of problems. This network is used to test, by simulation, the
possibility of a hybrid of an FFNN and other types of computing elements.
Higher-function building blocks are simulated as artificial neurons and
implanted in this network. The composite network is then trained and pruned. It
is observed that the original network adopts the implanted building blocks and
that the total size of the network reduces considerably. A suitable reverse
characteristic is designed for each functional block so that training can use
the gradient descent technique. Some of the building blocks are designed using
stochastic logic; the output of such a building block is similar to the output
of a biological neuron. This work is carried out in view of the future
possibility of bioelectronic (hybrid) systems.
The objective of our work is to demonstrate the possibility of implanting
non-neural computing devices in a simulated, randomly connected Artificial
Neural Network (ANN). To test this possibility, a large randomly connected
feed-forward host ANN is designed using a general-purpose neural network tool
box [6]. Small independent functional building blocks are designed to simulate
artificial neurons. Some of the hidden neurons of the host network are chosen at
random and replaced with these functional blocks. The host network is then
trained to solve a specified problem using a supervised learning algorithm. The
composite network is then pruned [2][3] to optimize its size. The resultant
network is a reduced-size hybrid of neural and other types of building blocks.
It is observed that the implanted functional blocks most appropriate to the
problem are adopted by the host network. Such a hybrid system allows computer
simulation of a large ANN architecture with faster response time. There are
three main design aspects of this type of network, as described below:
· the stochastic type ANN design,
· the choice of higher-function building blocks, and
· the pruning algorithm to optimize the network size.
The following section describes the above aspects in detail.
Section II of this paper describes the stochastic [1] learning algorithm
developed for VLSI implementation [5]. The algorithm is similar to the back
propagation algorithm but can use binary neurons at hidden nodes. The output of
such a neuron is probabilistic in nature and resembles the output of a
biological neuron. Section III describes the different types of higher
functional building blocks; different applications are developed using these
blocks, and some of the results are given in Section V. Section IV describes the
pruning algorithm, an iterative procedure that optimizes the network size. In
many examples, such as XOR and circle mapping, this algorithm reduces the
network to the theoretical minimum size. Section V describes the results of the
experiments conducted using the methods of Sections II, III and IV. Finally,
Section VI gives the conclusion, and the references are listed at the end.
We first briefly review the back propagation (BP) technique [7], which is
popularly used for training FFNNs. The input-output characteristic function
(activation function) of a neuron is chosen such that it is continuous and its
first derivative is bounded between definite limits over a large dynamic range
of input. Consider a three-layer network of units $Y_i$, $Y_j$ and $Y_k$,
interconnected through weights $w_{ji}$ and $w_{kj}$ such that
$$X_j = \sum_i w_{ji} \, Y_i; \qquad Y_j = f(X_j) \qquad \text{(second layer)}$$
$$X_k = \sum_j w_{kj} \, Y_j; \qquad Y_k = f(X_k) \qquad \text{(third layer)}$$
Here $Y = f(X)$ represents the neuron activation function of the form
$f(X) = 1/(1 + e^{-X})$. The weights $w_{kj}$ and $w_{ji}$ are modified using
the BP algorithm as follows:
$$w_{kj} = w_{kj} + \eta \, Y_j \, (Y_k - d_k) \, \{Y_k (1 - Y_k)\} \qquad (1)$$
$$w_{ji} = w_{ji} + \eta \, Y_i \, \{Y_j (1 - Y_j)\} \sum_k \left[ (Y_k - d_k) \, \{Y_k (1 - Y_k)\} \, w_{kj} \right] \qquad (2)$$
where $\eta$ is a real constant less than 1.0 and $d_k$ is the desired output.
Theoretically, a three-layer network is sufficient for mapping any given
input-output set. For reasons of efficiency, however, it may be necessary to
consider networks with more than three layers. It is straightforward to modify
the above equations for applications having more than three layers.
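The forward pass and the weight updates of equations (1) and (2) can be sketched in plain Python as below. This is only an illustrative sketch: the function names, the network sizes and the value of the learning-rate constant (called `eta` here) are assumptions, not part of the original work. The code writes the error as desired output minus actual output, so that a positive `eta` lowers the squared error; with the error written as actual minus desired, as in equations (1) and (2), the same step corresponds to a negative constant.

```python
import math
import random

def f(x):
    """Sigmoid activation f(X) = 1 / (1 + e^(-X))."""
    return 1.0 / (1.0 + math.exp(-x))

def forward(w_ji, w_kj, y_i):
    """Forward pass: returns the hidden outputs Y_j and the outputs Y_k."""
    y_j = [f(sum(w * y for w, y in zip(row, y_i))) for row in w_ji]
    y_k = [f(sum(w * y for w, y in zip(row, y_j))) for row in w_kj]
    return y_j, y_k

def bp_step(w_ji, w_kj, y_i, d_k, eta=0.5):
    """One in-place update of both weight layers; returns the squared
    error measured before the update."""
    y_j, y_k = forward(w_ji, w_kj, y_i)
    # Output-layer term: (d_k - Y_k) * Y_k * (1 - Y_k)
    delta_k = [(d - y) * y * (1.0 - y) for d, y in zip(d_k, y_k)]
    # Hidden-layer term: Y_j * (1 - Y_j) * sum_k(delta_k * w_kj), old w_kj
    delta_j = [y_j[j] * (1.0 - y_j[j]) *
               sum(delta_k[k] * w_kj[k][j] for k in range(len(w_kj)))
               for j in range(len(y_j))]
    for k, row in enumerate(w_kj):
        for j in range(len(row)):
            row[j] += eta * y_j[j] * delta_k[k]
    for j, row in enumerate(w_ji):
        for i in range(len(row)):
            row[i] += eta * y_i[i] * delta_j[j]
    return sum((d - y) ** 2 for d, y in zip(d_k, y_k))

def make_weights(n_out, n_in, rng):
    """Random weight matrix with one row per destination neuron."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_in)]
            for _ in range(n_out)]
```

Calling `bp_step` repeatedly on the training patterns is the usual way to use such a step; bias inputs can be modeled by appending a constant 1.0 to the input vector.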
All the variables in equations (1) and (2) except $Y_i$ and $d_k$ are
real-valued and hence difficult to implement in VLSI for parallel processing of
a large number of perceptrons. It has been shown that the choice of the
perceptron's characteristic [4] is not very critical for convergence of the
error during training. This paper suggests one such choice that drastically
reduces the computational requirement without affecting the error convergence
property during training.
We will analyze the significance of each factor in expressions (1) and (2) in
order to replace them by suitable probability functions:
· $[Y_k - d_k]$ is the departure of the output $Y_k$ from the desired value $d_k$.
· $[Y_k (1 - Y_k)]$ is the measure of the willingness of the $k$-th output neuron to learn. It is measured by the nearness of the output to its average value (0.5). This factor plays an important role in training.
· $[w_{kj}]$ is the conductance of the back propagation signal path.
· $[Y_j (1 - Y_j)]$ is similar to the term $\{Y_k (1 - Y_k)\}$ above, for a hidden-layer neuron.
· $[Y_i]$ is the excitation potential for $w_{ji}$.
In the proposed system, the perceptron's forward-pass transfer characteristic is
generated by a threshold detector with a random bias value, as shown in
Figure 1. This is equivalent to choosing the input-output characteristic
function of equation (3).
$$Y = 1 \ \text{for } (X + r) > 0; \qquad Y = 0 \ \text{for } (X + r) \le 0 \qquad (3)$$
Here $r$ is a random integer inside the dynamic range of $X_j$. $Y_j$ and $Y_k$
are calculated using equation (3) in the forward pass. This has two advantages:
the value of $Y_j$ is binary but statistically continuous over the dynamic
range of learning, and the computation of $X_k$ (for the next layer) does not
need any multiplication.
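A small Python sketch of the stochastic forward characteristic of equation (3) is given below. It assumes that the random bias r is drawn uniformly from the integers between -N and +N, which follows the expression given with Figure 1; the function names and the parameter `n` (standing for N) are illustrative only.

```python
import random

def stochastic_forward(x, n, rng=random):
    """Equation (3): Y = 1 if (X + r) > 0, else Y = 0.

    r is assumed to be a uniform random integer in [-n, n].
    """
    r = rng.randint(-n, n)
    return 1 if (x + r) > 0 else 0

def mean_output(x, n, samples, rng=random):
    """Average of `samples` binary outputs, an estimate of P(Y = 1)."""
    total = sum(stochastic_forward(x, n, rng) for _ in range(samples))
    return total / samples
```

Each individual output is binary, but the average over many samples rises gradually from 0 to 1 as X moves across the range from -N to +N, which is the statistically continuous behaviour described above.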
In the backward pass, a term of the form $Y_k (1 - Y_k)$ is desirable. This
term increases the sensitivity of $w_{kj}$ near the threshold value of $X_k$.
Equivalently, the network's willingness to learn or forget is inversely related
to the distance of $X_j$ from the threshold. Considering this, a simplified
reverse characteristic function $R(X)$, shown in Figure 1 and equation (4), is
used to replace the terms of the form $\{Y (1 - Y)\}$ in equations (1) and (2).
$$R(X) = 0 \ \text{for } |X| > r; \qquad R(X) = 1 \ \text{for } |X| \le r \qquad (4)$$
Here $r$ is a positive random integer less than $N$, and the dynamic range of
$X$ for learning is $-N$ to $+N$. The value of $N$ depends on the network
configuration. To improve the dynamic range of the $X$'s and the rate of
convergence, one of the following methods is used:
· Use $N$ as a random integer;
· Use $r$ as a weighted random number;
· Adjust the value of $N$ gradually as the error converges.
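The reverse characteristic of equation (4) and its use in place of the Y(1 - Y) factor can be sketched as follows. This is a hedged illustration, not the authors' implementation: r is assumed to be uniform over the integers 1 to N-1, the helper `stochastic_delta_w` and its parameter names are hypothetical, and the error is written as desired minus actual output (an assumed sign convention under which a positive `eta` reduces the error).

```python
import random

def reverse_characteristic(x, n, rng=random):
    """Equation (4): R(X) = 1 if |X| <= r, else 0.

    r is assumed to be a uniform random integer with 1 <= r < n (n >= 2).
    """
    r = rng.randint(1, n - 1)
    return 1 if abs(x) <= r else 0

def stochastic_delta_w(y_j, y_k, d_k, x_k, n, eta=1, rng=random):
    """Change applied to w_kj, with R(X_k) replacing Y_k * (1 - Y_k)."""
    return eta * y_j * (d_k - y_k) * reverse_characteristic(x_k, n, rng)
```

With binary values of `y_j`, `y_k` and `d_k` and an integer `eta`, every factor is an integer, so the weight change needs no real-valued multiplication.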
Figure 1. The forward and the reverse characteristics of the proposed stochastic function (dotted lines), along with the exponential sigmoid function (solid lines).
Curve 1 (dotted line): $p = N - \mathrm{random}(2N + 1) + X$; if $p > 0$ then $z = 1$ else $z = 0$; $Y = (\sum_s z_s)/S$ for $S$ samples.
Curve 1 (solid line): $Y = 1/(1 + e^{-X})$.
Curve 2 (dotted line): $p = \mathrm{random}(N) - |X|$; if $p > 0$ then $z = 1$ else $z = 0$; $Y = (\sum_s z_s)/S$ for $S$ samples.
Curve 2 (solid line): $Y = e^{-X}/(1 + e^{-X})^2$.
The error convergence curve of a simple three-layer stochastic type network,
trained on the XOR input-output function, is shown in Figure 2. Figures 3(a)
and 3(b) show, respectively, the output of some of the stochastic type hidden
neurons in the time domain and the output of a biological neuron. The figure
shows the similarity between the stochastic and the biological neuron.
Different types of special neuron functions are developed to increase the
efficiency of training in a complex network environment. Each such functional
block is equivalent to several artificial neurons and the associated connections
of a conventional neural circuit. The functional blocks are implemented using
dedicated hardware or a microprocessor-based system. A general-purpose Neural
Network Tool Box [6] is designed to interface different types of functional
building blocks with a multilayered FFNN. The Tool Box is a computer simulation
program that supports interactive design of the network topology, new
activation functions and training algorithms. Some of the functions developed
using the Tool Box are described below.
To reduce the training time, it is convenient to train functional blocks of a
multilayer FFNN separately. These functions are generic in nature, such as
image band compression, time series prediction, and geometric mapping functions
for a circle, ellipse, polygon, etc. The pretrained blocks transfer the BP error
like a normal FFNN, but the elements of these blocks are not reinforced during
training.
To simulate this type of pretrained network block, the Neural Network Tool Box
is used. A portion of a large network is first trained on the XOR problem. The
selected weights are then inhibited from being reinforced during further
training. The composite network is trained with a function that uses the XOR
operation, e.g. $(A \oplus B)$ OR $(C \oplus D)$. Figures 7(a) and 7(b) show the
network before and after pruning, respectively.
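One simple way to keep the weights of a pretrained block from being reinforced is to skip them when the weight changes are applied. The sketch below assumes that weights are stored as a list of rows and that the pretrained connections are listed in a set of index pairs; the function name `apply_update` and this data layout are assumptions made for illustration.

```python
def apply_update(weights, deltas, frozen):
    """Add deltas to weights in place, skipping frozen connections.

    weights, deltas: lists of rows, one row per destination neuron.
    frozen: set of (row, column) index pairs that belong to pretrained
            blocks and must keep their values during training.
    """
    for r, row in enumerate(weights):
        for c in range(len(row)):
            if (r, c) not in frozen:
                row[c] += deltas[r][c]
    return weights
```

The error terms can still be propagated backwards through the frozen connections as usual; only the update of their own strengths is suppressed.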
Figure 2. The error convergence curve of a three-layer stochastic type feed-forward neural network.
Figure 3. Similarity between the stochastic and the biological neurons. (a) Output of stochastic type neurons (see text). The curves h1, h2, h3 and h4 are the hidden neurons' outputs during training for a fixed input pattern. The X-axis represents iteration (at 10 samples per iteration). (b) Output of biological neurons.
For binary input-output operation, Boolean functions and look-up tables are
used, which are integrated with a host network using the stochastic learning
algorithm. This gives higher flexibility, density and speed. Appropriate
reverse characteristics are designed using stochastic type functions.
For efficient representation of time domain signals, a time delay neuron is
introduced. The time delay neuron retains the history of time series input
signals in the hidden neurons of the network. A time delay neuron receives its
input from the immediately neighboring neuron, retains the value for an integer
delay of $d$ periods, and then outputs it like a normal neuron. These neurons
allow a variable integer time delay, as shown in equation (5).
$$Y_j(t) = X_j(t - d_j) \qquad (5)$$
where $d_j$ is the integer time delay of the $j$-th delay neuron and $t$ is the time.
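A time delay neuron of this kind can be sketched with a small first-in, first-out buffer. The class name, the `step` method and the choice of 0.0 as the output before any delayed input is available are assumptions made for this illustration.

```python
from collections import deque

class TimeDelayNeuron:
    """Outputs its input delayed by d time steps: Y_j(t) = X_j(t - d_j)."""

    def __init__(self, delay, initial=0.0):
        if delay < 0:
            raise ValueError("delay must be a non-negative integer")
        # The buffer holds the last `delay` inputs, oldest first.
        self.buffer = deque([initial] * delay)

    def step(self, x):
        """Feed the input X_j(t) and return the output Y_j(t)."""
        self.buffer.append(x)
        return self.buffer.popleft()
```

For example, a neuron created with `TimeDelayNeuron(2)` returns the initial value for its first two steps and from then on returns the input it received two steps earlier.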
Figure 4. The organization of the time delay neuron.
The total time delay $T_d$ of all the neurons in a time delay network is given by $T_d = \sum_j d_j$.
Figure 4 shows the organization of time delay neurons as a functional building
block. There is a single input $Y_i$, which is delayed through different delay
neurons $Y_j$. The outputs of the delay neurons $Y_j$ are connected to the
output neurons $Y_k$. This building block retains the time domain information
in the $Y_j$ neurons. Using such building blocks, it is possible to generate
functions such as the convolution of the input signal, standard filters and
time series prediction. One such application, time series prediction, is shown
in Figure 5: the network of Figure 4 is trained to predict a complex time
series function.
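The way such a block can produce a convolution of the input signal is sketched below for a single linear output: each delayed copy of the input is multiplied by a weight and the products are summed. The function name, the linear output and the use of 0.0 for samples before the start of the signal are assumptions of this sketch; in the network described here the weights would be found by training.

```python
from collections import deque

def delay_block_response(signal, delays, weights, initial=0.0):
    """Weighted sum of delayed copies of one input signal.

    For each time t: out(t) = sum_j weights[j] * signal[t - delays[j]],
    with samples before t = 0 taken as `initial`.
    """
    size = max(delays) + 1
    history = deque([initial] * size, maxlen=size)
    out = []
    for x in signal:
        history.appendleft(x)  # history[d] is the input d steps ago
        out.append(sum(w * history[d] for d, w in zip(delays, weights)))
    return out
```

With delays 0 and 1 and both weights equal to 1, the output at each step is the sum of the current and the previous input, which is a simple two-tap filter.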
A functional neuron is a neuron with a predefined function (equation), which
may be equivalent to several neurons and interconnections. Trigonometric and
algebraic functions are studied; one of them is given in equation (6).
$$X_j = \Big( \sum_i W \, Y_i^2 \Big)^{0.5} \qquad (6)$$
The use of functional neurons drastically reduces the network size and
increases the speed of operation. These types of neurons are appropriate for
real-time applications such as modeling, image prediction and pattern
recognition. Functional neurons also help in building a mathematical model from
observed data, by studying the topology and interrelationship of the functional
neurons inside the network. Section V describes the results of circular
boundary mapping using the neuron activation function of equation (6).
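The net input of a functional neuron of the form of equation (6) can be computed as in the short sketch below. It assumes one weight per input connection, which equation (6) does not state explicitly, and the function name is illustrative.

```python
import math

def functional_neuron(weights, inputs):
    """Equation (6): X_j = sqrt(sum_i W_i * Y_i^2), one weight per input."""
    total = sum(w * y * y for w, y in zip(weights, inputs))
    if total < 0:
        raise ValueError("weighted sum of squares is negative")
    return math.sqrt(total)
```

With unit weights this is the Euclidean length of the input vector, which is why a single neuron of this type can stand in for the several sigmoid neurons otherwise needed to approximate a radial distance.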
The methods described above are integrated with a host network. For each
method, a suitable error propagation strategy is developed to reinforce the
components of the host network. For functions that are not differentiable, the
stochastic method described in Section II is used. The composite network is
simplified using the pruning algorithm described in the section below.
Figure 5. (a) A fully connected FFNN configuration with a time-delay functional block, trained for time series prediction. (b) The resultant network after training and pruning. (c) and (d) The input signal (top trace) and the predicted output (bottom trace) of the networks (a) and (b).
A simple but efficient technique is developed that reduces the size of the
network almost to the theoretical limit with a minimum number of trials. The
method is based on the simple fact that, in a trained network, useful
connections are stronger and unwanted connections are weaker. The effect is
further enhanced by slowly reducing the strength of all the interconnections
with time, an operation analogous to forgetting. When learning and forgetting
reach equilibrium, the optimum configuration of connections is stabilized. The
weakest connections are then gradually eliminated using a successive
approximation method, until the network's output error is less than or equal to
a desired error limit. As a result of the reduced connections, some neurons
will have no input or will have only a single input-output connection. These
neurons are eliminated using appropriate rules, and the connections are
rerouted. Figure 6 shows the flowchart of the pruning algorithm. Figures 5, 7,
8 and 9 show the results of pruning in different applications.
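The main steps of the pruning procedure can be sketched in Python as below. This is a simplified illustration under several assumptions: connections are stored in a dictionary keyed by (source, destination) pairs, `error_fn` is a caller-supplied function that evaluates the network error, and connections are removed one at a time in order of strength rather than by the successive approximation search mentioned above. Retraining between removals and the rerouting of single-connection neurons are omitted.

```python
def decay(weights, rate=0.001):
    """Forgetting step: shrink every connection strength slightly."""
    return {conn: w * (1.0 - rate) for conn, w in weights.items()}

def prune_weakest(weights, error_fn, e_max):
    """Remove the weakest connections while the error stays <= e_max.

    weights: dict mapping (source, destination) -> strength.
    error_fn: callable taking such a dict and returning the error.
    """
    kept = dict(weights)
    for conn in sorted(weights, key=lambda c: abs(weights[c])):
        trial = {c: w for c, w in kept.items() if c != conn}
        if error_fn(trial) <= e_max:
            kept = trial   # this connection was redundant
        else:
            break          # all remaining connections are stronger
    return kept

def drop_unfed_neurons(weights, input_names):
    """Remove connections that leave neurons with no remaining input."""
    kept = dict(weights)
    while True:
        fed = set(input_names) | {dst for _, dst in kept}
        trimmed = {c: w for c, w in kept.items() if c[0] in fed}
        if len(trimmed) == len(kept):
            return kept
        kept = trimmed
```

In use, `decay` would be applied during training, `prune_weakest` once training has settled, and `drop_unfed_neurons` afterwards to clear away hidden neurons that have lost all their inputs.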
Using the above techniques, several experiments were conducted to test the
feasibility of implanting non-neural functional blocks in a large, redundantly
connected host FFNN. In most cases, it was observed that the host network
adopts the implanted functional blocks. The implanted functional blocks
participate in solving the problem, and a large portion of the other network
elements becomes redundant. Hence, the network size reduces considerably after
pruning. Some of the cases studied are given below.
Experiment I: XOR problem
A fully connected multilayer FFNN is configured using two pretrained networks,
(h1, h2, h5, P) and (h3, h4, h6, Q). The network has four inputs A, B, C and D
and one output R, as shown in Figure 7(a). The network is trained and pruned as
explained in Section IV using the function
$$R = (A \oplus B) \ \text{OR} \ (C \oplus D).$$
After pruning, the networks corresponding to P and Q are automatically isolated
from each other. Also, the four redundant hidden neurons are eliminated, as
shown in Figure 7(b).
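The complete training set for this experiment is small enough to enumerate. The sketch below builds all 16 input patterns with their desired output; the names `target` and `training_set` are illustrative.

```python
from itertools import product

def target(a, b, c, d):
    """R = (A XOR B) OR (C XOR D) for binary inputs."""
    return (a ^ b) | (c ^ d)

# All 16 input patterns, each paired with the desired output R.
training_set = [((a, b, c, d), target(a, b, c, d))
                for a, b, c, d in product((0, 1), repeat=4)]
```

The output is 0 only when A equals B and C equals D, so 4 of the 16 patterns have R = 0 and the remaining 12 have R = 1.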
Figure 6. Network optimizing algorithm, where Emax is the maximum tolerable error.
Figure 7. (a) The implantation of the pretrained XOR neural networks (h1, h2, h5, P) and (h3, h4, h6, Q) in a host network. (b) The resultant network after pruning.
Figure 8. (a) In this example, let $\mathrm{rad} = ((X - 0.5)^2 + (Y - 0.5)^2)^{0.5}$; if rad > 0.3 then R = 0, else R = 1. X and Y are random analog inputs between 0 and 1. A special neuron S (see text) is implanted among the hidden neurons in the network. (b) The optimized network, in which all the hidden neurons are eliminated except neuron S.
Experiment II: Non-linear Boundary Mapping Problem
In this example, a two-dimensional image mapping problem is studied. A
two-layer network, as shown in Figure 8(a), is configured with two analog
inputs, thirteen hidden neurons and a binary output neuron. Twelve hidden
neurons have the conventional exponential sigmoid activation function, and only
one hidden neuron (S) uses the function of equation (7).
$$X_j = \sum_i \left( W_{ij} \, Y_i^2 \right) \quad \text{and} \quad Y_j = 1/(1 + e^{-X_j}) \qquad (7)$$
The network is first trained to map a circle and is then pruned. The pruning
result is shown in Figure 8(b). The resultant network has only one hidden
neuron and three connections. The only neuron (S) left in the hidden layer
after pruning is the one that was specially implanted with the activation
function of equation (7). The final network is also the minimum configuration
needed to map a circular boundary. The inputs and outputs of the networks of
Figures 8(a) and 8(b) are shown in Figures 9(a), 9(b) and 9(c).
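Training samples for the circle mapping can be generated from the definition given with Figure 8(a): the desired output is 1 inside a circle of radius 0.3 centred at (0.5, 0.5) and 0 outside. The function names, the sample count and the fixed random seed in this sketch are assumptions.

```python
import math
import random

def circle_label(x, y, centre=0.5, radius=0.3):
    """Desired output R: 1 inside the circle of Figure 8(a), else 0."""
    rad = math.sqrt((x - centre) ** 2 + (y - centre) ** 2)
    return 0 if rad > radius else 1

def make_samples(count, seed=0):
    """Random analog inputs X, Y in [0, 1) with their desired output."""
    rng = random.Random(seed)
    samples = []
    for _ in range(count):
        x, y = rng.random(), rng.random()
        samples.append(((x, y), circle_label(x, y)))
    return samples
```

Since the circle covers about 28 percent of the unit square, roughly that fraction of uniformly drawn samples carries the label 1.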
The simulation shows that a feed-forward neural network can adopt higher-order
mathematical building blocks during the learning operation. It also shows that
such building blocks could use probabilistic pulse-frequency-modulated signals
based on stochastic logic. Such an implant has characteristics similar to those
of biological neurons. These experiments are very primitive steps towards a
direct man-machine interface.
Figure 9. (a) Input training pattern for Figures 8(a) and 8(b). (b) Output response of the network of Figure 8(a). (c) Output response of the network of Figure 8(b).
[1] Alspector J., Allen R.B., Hu V. and Satyanarayana S., "Stochastic learning networks and their implementation", in D.Z. Anderson (Ed.), Proceedings of the IEEE Conference on Neural Information Processing Systems - Natural and Synthetic, New York: American Institute of Physics, 1988, pp. 9-21.
[2] Baum Eric B. and Haussler David, "What Size Net Gives Valid Generalization?", Neural Computation, January 1989.
[3] Karnin E.D., "A simple procedure for pruning back-propagation trained neural networks", IEEE Trans. Neural Networks, vol. 1, June 1990, pp. 239-244.
[4] Lippmann R.P., "An introduction to computing with Neural Nets", IEEE ASSP Magazine, April 1987, pp. 4-22.
[5] Mazumdar Himanshu S., "A multilayered feed forward neural network suitable for VLSI implementation", Microprocessors and Microsystems, vol. 19, no. 4, May 1995, pp. 231-234.
[6] Rawal Leena P. and Mazumdar Himanshu S., "A Neural Network Tool Box using C++", Computer Society of India, vol. 19, no. 2, August 1995, pp. 15-23.
[7] Rumelhart D.E., Hinton G.E. and Williams R.J., "Learning Representations by Back-Propagating Errors", Nature, vol. 323, no. 9, Oct. 1986, pp. 533-536.
Mazumdar Himanshu S. and Rawal Leena P., "Simulation of Implant of higher Functions in Randomly Connected Artificial Neural Network", published in the Abstracts proceedings, main papers XIVA. The International Conference on Cognitive Systems 1995, Dec. 15th & 16th, New Delhi, organized by the R & D Center, NIIT Ltd., INDIA.