1999 Spg | EE 380L Artificial Neural Systems | Unique # 14945
Problem 3
- ***In Stage 1 of NETLAB's RBF training, the clusters are
computed with 'errors' reported. This 'error' is actually the negative of the log
likelihood, i.e. -log(p(x1)*p(x2)*...*p(xN)) = -[log p(x1) + ... + log p(xN)]. Therefore do
not be alarmed if the 'error' becomes negative. We want to maximize the log likelihood
(equivalently, minimize the -ve log likelihood), so you should run the RBF training a few
times and choose the result with the lowest 'error' (most negative number) reported.
- ***Sometimes the cluster centers obtained in Stage 1
overlap, i.e. 2 (or more) centers may be very near one another. In this case you should
NOT set the sigma of these centers to the distance between them, as that will result in a
very tall/thin radial basis. Possible solutions are:
  - Recluster - redo Stage 1 with different (larger) _initial_ net.wi (these are set to 1
    by default)
  - Prune - keep just 1 of the overlapping centers and discard the rest (you will have to
    modify the net structure)
  - Find next nearest neighbor - set sigma to the 2nd (or the 3rd, 4th, etc., depending
    on how many overlapping centers you get) nearest distance
  - Minimum width - compute the distances from each center to all others, and replace any
    distance below a threshold by that threshold, e.g. all sigma must be at least 1.
- Do not use NNET's RBF because it is limited in options.
- Use NETLAB's RBF function instead. (NETLAB is installed on all of the Solaris machines
EXCEPT buffy.ece.utexas.edu; alternatively, you may download a copy of NETLAB.)
- NETLAB automatically fixes the width of all RBFs to the largest distance between two
bases.
- In this problem you must individually reset the width (sigma) of each basis to the
distance between itself and its NEAREST basis. After setting the widths, you will have to
recompute the linear weights (as shown in the code below)
- The following code is given as an example (tin = inputs, tout = target outputs, each
stored with one sample per column and transposed when passed to NETLAB; d = # input
dimensions, dy = # output dimensions, tsize = # of training samples):
%---------- Experiment parameters
sigma = 1;          % width of centers
nhidden = 20;       % # of centers
maxepochs = 50;     % for clustering (finding centers)
showerr = 1;        % whether to display the -ve log likelihood while clustering
%-------------------- RBF init
options(1) = showerr;
options(5) = 1;
options(14) = maxepochs;
net = rbf(d,nhidden,dy,'gaussian');
%-------------------- RBF train
net = rbftrain(net,options,tin',tout');
%----- reset RBF widths to sigma and recompute linear weights
%----- **** here sigma is set to all ones, but in your code you must
%----- evaluate the distance from each center, net.c(j,:), to the nearest
%----- center and set net.wi(j) equal to this distance
net.wi = sigma*ones(size(net.wi));
%---------- with the updated widths, we feed the inputs into the RBF and obtain
%           the new activations corresponding to each input vector. y here is
%           obtained using the old weights net.w2, net.b2, which were derived
%           from the old sigmas net.wi, and is therefore incorrect.
[y,act] = rbffwd(net,tin');
%---------- now we recompute the new W matrix comprising net.w2 and net.b2 by
%           doing a pseudo inverse using the new activations. The column of
%           ones corresponds to the bias input (constant activation), weighted
%           by an extra row in the W matrix (denoted here explicitly by net.b2)
temp = pinv([act ones(tsize,1)])*tout';
net.w2 = temp(1:nhidden,:);
net.b2 = temp(nhidden+1,:);
%-------------------- RBF test/validate
touthat = rbffwd(net,tin')';
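
The per-center widths required in this problem can be computed without any toolbox functions. The following is a minimal sketch, not NETLAB code: the matrix C, the variable minwidth and all numeric values are made up for illustration, with C laid out one center per row in the same way as net.c. It also shows the 'minimum width' and 'next nearest neighbor' workarounds for overlapping centers.

```matlab
% Hypothetical cluster centers, one per row (same layout as net.c).
C = [0     0;
     3     0;
     3     4;
     3.001 4];        % nearly overlaps the third center
minwidth = 1;         % illustrative threshold only, not a recommended value

n = size(C,1);
D = zeros(n);                       % pairwise Euclidean distances
for i = 1:n
    for j = 1:n
        D(i,j) = norm(C(i,:) - C(j,:));
    end
end
D(1:n+1:end) = Inf;                 % ignore each center's distance to itself

nearest = min(D, [], 2);            % distance to the nearest other center
widths  = max(nearest, minwidth);   % 'minimum width' workaround

Dsorted = sort(D, 2);               % ascending distances along each row
second  = Dsorted(:,2);             % 'next nearest neighbor' workaround
```

With a trained net, C would be net.c and the result would be assigned with something like net.wi = widths' before recomputing net.w2 and net.b2. Some NETLAB versions describe net.wi as holding squared widths, so it is worth checking rbffwd in your copy before deciding whether to store the distance or its square.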
- Note that there is no concept of stopping in the RBF; maxepochs specifies the number
of epochs (i.e. passes through the data) to iterate for clustering (Stage 1). Stage 2 is a
single-step process.
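
On the first note above: the 'error' can go negative because a probability density can exceed 1 when a cluster is narrow, which makes log p(x) positive. The following is a small sketch with made-up 1-D data and a made-up Gaussian (x, mu, sigma and the errs values are illustrative assumptions, not output from NETLAB). The last two lines show the bookkeeping for keeping the best of several runs.

```matlab
% Five made-up 1-D samples clustered tightly around 0.
x = [-0.02 -0.01 0 0.01 0.02];
mu = 0;
sigma = 0.05;        % narrow Gaussian, so the density near mu exceeds 1

p = exp(-(x - mu).^2 / (2*sigma^2)) / (sigma*sqrt(2*pi));   % p(x_i)
nll = -sum(log(p));  % negative log likelihood; negative for this data

% Made-up 'errors' from three separate training runs: keep the lowest one.
errs = [12.3 -4.1 -10.2];
[besterr, bestrun] = min(errs);     % selects the most negative 'error'
```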
Problem 5
- Use MATLAB NNET's selforg functions
- Sample code is given below
w = initsm(range,100);        % initialize 100 weight vectors whose min-max values
                              % are specified by range, in this case [-1 1]
m = nbman(10,10);             % initialize 10 x 10 grid structure
plotsm(w,m);                  % plot the initial SOM in 2-D
tp = [20 750 1];              % parameters: maximum of 750 epochs
w = trainsm(w,m,sdata,tp);    % train the SOM
plotsm(w,m);                  % plot the trained SOM in 2-D
- Note that plotsm only plots the first 2 dimensions of the weight vector w.
- In this problem you must
  - First train the SOM using ALL samples (forget about their labels), then after training
  - For each data sample, find the nearest neuron on the trained SOM and plot the sample on
    the location of its nearest neuron (perturbed a little), labeled with its corresponding
    class 'color'
- For example, let's say I have a trained 2x2 grid in 4-D space, with each neuron's weight
vector named w00, w01, w10, w11
  - Assume I have 5 labeled samples (each a 4-D vector), named x1 to x5, then
    - compute the distance from x1 to each of w00, w01, w10 and w11
    - find the closest neuron (smallest distance to x1), say w10
    - keep a record that x1 is nearest to w10
    - repeat for x2 to x5
  - Each sample x has a corresponding class label (in this case, it is Iris-xxxx), then
    - For the sample x1, since it is nearest to w10, we plot it on the 2-D location
      (1+dx,0+dy) as a Red '+' for class 1
    - The dx and dy are small random numbers (necessary for visualization because multiple
      points may belong to a neuron and we cannot see more than 1 point at the same location)
    - Similarly for x2, say it is nearest to w11, we plot a point on the 2-D location
      (1+dx,1+dy) as a Blue '*' (if it is of class 3).
    - Repeat for the other samples
- Important: In part (b) you are required to plot the sample points in grid/feature space
(NOT any 2-D projection of the original 4-D space as done in plotsm)
  - In grid space, the neurons are regularly spaced on a 10x10 grid
  - On this feature space (2-D), superimpose data points (with small perturbations) on
    their respective closest (in original space) neuron.
- The following is an artificial example of a plot in grid/feature space, where each
neuron is at a grid intersection. It shows 3 distinct clusters; in the Iris Data you may
have multiple neurons responding to one class of data:
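
Apart from the figure, the nearest-neuron bookkeeping from the 2x2 example above can be sketched in plain MATLAB. This is an illustrative sketch only: the weight vectors, the five samples, their labels, the gridpos table, the jitter size and the green 'o' marker for class 2 are all made up. The row order of gridpos is an assumption that has to match however your trained SOM numbers its neurons.

```matlab
% Hypothetical trained 2x2 SOM in 4-D: one weight vector per row.
W = [0 0 0 0;      % w00
     0 0 1 1;      % w01
     1 1 0 0;      % w10
     1 1 1 1];     % w11
% Grid-space location of each neuron, in the same row order as W.
gridpos = [0 0; 0 1; 1 0; 1 1];

% Five made-up labeled samples (one per row) and their classes (1..3).
X = [0.9 1.1 0.1 0.0;
     1.0 0.9 1.2 0.8;
     0.1 0.0 0.1 0.2;
     0.2 0.1 0.9 1.0;
     0.8 1.0 0.0 0.1];
labels = [1 3 2 2 1];

nsamples = size(X,1);
bmu = zeros(nsamples,1);            % index of the nearest neuron per sample
for k = 1:nsamples
    diffs = W - repmat(X(k,:), size(W,1), 1);
    d2 = sum(diffs.^2, 2);          % squared distance in the ORIGINAL 4-D space
    [~, bmu(k)] = min(d2);
end

% Place each sample at its neuron's grid location plus a small random (dx,dy).
jitter = 0.1*(rand(nsamples,2) - 0.5);
pts = gridpos(bmu,:) + jitter;

markers = {'r+', 'go', 'b*'};       % class 1, 2, 3
figure; hold on;
for k = 1:nsamples
    plot(pts(k,1), pts(k,2), markers{labels(k)});
end
axis([-0.5 1.5 -0.5 1.5]);
```

With these made-up numbers x1 lands on w10 near (1,0) as a red '+', and x2 lands on w11 near (1,1) as a blue '*', as in the description above. For the actual problem, W would hold the 100 trained weight vectors and gridpos the 100 positions on the 10x10 grid.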
