Final Exam, ECE 547 Fall 1998
This exam is take-home and open book, but obviously it is to be done
individually and not in teams, and you should not discuss the problems
with anyone else. It is due (by email or on paper) by
5pm Thursday, December 17.
Answer eight of the following nine questions (12.5 points each).
- What is the difference between Rosenblatt's Perceptron
Learning Rule and backpropagation applied to a single
linearity-followed-by-a-sigmoid unit? (Briefly, please!)
- What are the disadvantages (and symptoms) of using too many hidden
units when trying to train a network using a fixed training set? What
are the disadvantages (and symptoms) of using too few?
- If your error measure is the Kullback-Leibler divergence, and the
target distribution (over just four possibilities) is p=(0.1 0.1 0.4
0.4), which would have lower error, q1=(0.001 0.001 0.499 0.499) or
q2=(0.25 0.25 0.25 0.25)?
- Calculate the gradient dE/dw=(dE/dw1 dE/dw2 ...) of the ``vanilla
backpropagation network'' shown below, with a single input pattern (0.7
0) and a target output of 0.9. The extra incoming arrows are for
biases. Show all intermediate calculations.
- What good are hidden units? Why don't support vector machines need them?
- What does the Q in Q-learning stand for? What advantages does
Q-learning have over just estimating V(state), the value of individual
states? (Hint: off-policy.)
- When a Boltzmann Machine learns, what is it trying to do?
- Boltzmann Machines use stochastic binary units. Why?
- Describe three methods for helping networks generalize better.
For each give an example of where it might be appropriate.
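Since the exam is open book, a reference definition may help for the
Kullback-Leibler question: D(p||q) = sum_i p_i log(p_i / q_i). The
following Python sketch is just that formula (in nats, with variable
names of my choosing), applied to a toy distribution that is not the
one in the question:

```python
import math

def kl_divergence(p, q):
    """D(p || q) = sum_i p_i * log(p_i / q_i), in nats.

    Assumes p and q are discrete distributions over the same outcomes,
    with q_i > 0 wherever p_i > 0 (terms with p_i = 0 contribute 0).
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Sanity checks on a toy distribution (not the one in the question):
p = (0.5, 0.5)
print(kl_divergence(p, p))           # a distribution's divergence from itself is 0
print(kl_divergence(p, (0.9, 0.1)))  # positive whenever q differs from p
```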
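Also for reference, the chain rule through a single
linearity-followed-by-a-sigmoid unit (as in the first and fourth
questions) uses sigma'(net) = sigma(net)(1 - sigma(net)). The weights
and bias below are made up for illustration and do not correspond to
the network in the figure; only the mechanics of the gradient are
being shown:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Illustrative values only; not the weights from the figure.
w = [0.5, -0.3]   # weights
b = 0.1           # bias
x = [0.7, 0.0]    # input pattern
t = 0.9           # target output

# Forward pass: net input, then sigmoid.
net = sum(wi * xi for wi, xi in zip(w, x)) + b
y = sigmoid(net)

# Backward pass for E = 0.5 * (y - t)^2:
# dE/dnet = (y - t) * sigma'(net) = (y - t) * y * (1 - y)
delta = (y - t) * y * (1 - y)
grad_w = [delta * xi for xi in x]   # dE/dw_i = delta * x_i
grad_b = delta                      # the bias sees a constant input of 1
```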
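Finally, for the Q-learning question, the standard tabular update is
Q(s,a) <- Q(s,a) + alpha [r + gamma max_a' Q(s',a') - Q(s,a)]. A
minimal sketch with hypothetical states 'A', 'B' and actions 0, 1
(all names and numbers are mine, chosen only to show the backup):

```python
def q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """One tabular Q-learning backup.

    Q is a dict mapping (state, action) -> value. Off-policy: the max
    over next actions ignores which action the behavior policy will
    actually take in s_next.
    """
    best_next = max((v for (s2, _), v in Q.items() if s2 == s_next),
                    default=0.0)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])

# Toy example with made-up values:
Q = {('A', 0): 0.0, ('A', 1): 0.0, ('B', 0): 1.0, ('B', 1): 2.0}
q_update(Q, 'A', 0, r=0.5, s_next='B')
# Q[('A', 0)] is now 0.1 * (0.5 + 0.9 * 2.0) = 0.23
```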