In Bishop [1995], regularization for neural networks is called weight decay, and
the linear model of weight decay is called jitter. Jitter is equivalent to ridge
regression.
Poggio's PhD student Rifkin showed that the PSVMC is equivalent to an RLSC,
and that Fung and Mangasarian's two main contributions were 1) very fast ways
of computing the linear RLSC and 2) empirical evidence that RLSCs have
approximately the same classification accuracy as SVMCs on benchmark datasets.
It was also proved that the SVMC and the RLSC have the same generalization
bounds, which theoretically supports the prior empirical evidence (Rifkin
[2002]; Rifkin et al. [2003]). Agarwal [2002] showed that the PSVMC can be
transformed into classification using ridge regression; this is also supported
by the above-mentioned relation between ridge regression and regularization.
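
As a rough illustration of the equivalence described above, a linear RLSC can
be sketched as ridge regression on +1/-1 class labels followed by a sign
threshold. The function names, the toy data and the value of lam are
assumptions made for this example. Regularizing the bias together with the
weights is also an assumption, chosen to mirror the PSVM formulation as we
understand it.

```python
import numpy as np


def augment(X):
    """Append a constant column so the bias is learned with the weights."""
    return np.hstack([X, np.ones((X.shape[0], 1))])


def fit_rlsc(X, y, lam=1.0):
    """Linear RLSC sketch: ridge regression on +1/-1 labels.

    Solves the regularized normal equations (E^T E + lam * I) w = E^T y,
    where E is X with a bias column appended (assumption: the bias is
    regularized as well).
    """
    E = augment(X)
    A = E.T @ E + lam * np.eye(E.shape[1])
    return np.linalg.solve(A, E.T @ y)


def predict_rlsc(w, X):
    """Classify by the sign of the regression output."""
    return np.where(augment(X) @ w >= 0, 1, -1)


# Hypothetical toy data: two linearly separable groups.
X = np.array([[2.0, 2.0], [3.0, 3.0], [-2.0, -2.0], [-3.0, -3.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])

w = fit_rlsc(X, y, lam=0.1)
pred = predict_rlsc(w, X)
```

Only one small linear system is solved, with as many unknowns as there are
features plus one, which is in line with the fast linear RLSC computation
mentioned above.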
3.3.8 Our work on incremental PSVM classifiers
The incremental PSVMC proposed by Fung and Mangasarian [2002] showed
promising performance and efficient memory utilization, which makes it
suitable for web intelligence applications. But could it be further improved to
1. efficiently handle incremental classification with multiple categories,
2. support decremental learning more efficiently, and
3. be efficiently parallelizable in order to handle the very large
classification problems common in cyberspace services (e.g. clickstream
prediction on large web sites)?
To address requirement 1 we continued the development of the incremental
PSVMC (i.e. RLSC) algorithms proposed by Fung and Mangasarian [2002]. In
paper F (Tveit and Hetland [2003]) we proposed memoization to add efficient
support for incremental multicategory classification with the PSVMC.
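
The following is a minimal sketch of how such memoization might look for a
one-vs-rest linear RLSC. It is not the algorithm of paper F. It assumes that
the cached quantity is the matrix E^T E, which does not depend on the labels
and can therefore be shared by all categories, while only one right-hand-side
vector per category is kept separately. The class name, method names and toy
data are hypothetical. The forget method illustrates one simple way to
approach requirement 2 by subtracting a batch's contribution.

```python
import numpy as np


class IncrementalOneVsRestRLSC:
    """Hypothetical one-vs-rest linear RLSC updated batch by batch."""

    def __init__(self, n_features, classes, lam=1.0):
        d = n_features + 1  # one extra column for the bias
        self.classes = list(classes)
        self.lam = lam
        # E^T E is label independent, so it is stored once for all classes.
        self.gram = np.zeros((d, d))
        # One column E^T y_c per class c.
        self.rhs = np.zeros((d, len(self.classes)))

    def _augment(self, X):
        X = np.asarray(X, dtype=float)
        return np.hstack([X, np.ones((X.shape[0], 1))])

    def _targets(self, labels):
        # +1 where the label equals the class, -1 elsewhere.
        labels = np.asarray(labels)[:, None]
        return np.where(labels == np.asarray(self.classes)[None, :], 1.0, -1.0)

    def partial_fit(self, X, labels, sign=1.0):
        """Add a batch; the raw examples need not be stored afterwards."""
        E = self._augment(X)
        self.gram += sign * (E.T @ E)
        self.rhs += sign * (E.T @ self._targets(labels))

    def forget(self, X, labels):
        """Decremental step: subtract a previously added batch."""
        self.partial_fit(X, labels, sign=-1.0)

    def weights(self):
        # One solve with a shared left-hand side yields all class weights.
        A = self.gram + self.lam * np.eye(self.gram.shape[0])
        return np.linalg.solve(A, self.rhs)

    def predict(self, X):
        scores = self._augment(X) @ self.weights()
        return [self.classes[i] for i in scores.argmax(axis=1)]


# Hypothetical toy data: three well-separated categories in two batches.
X1 = np.array([[0.0, 5.0], [0.5, 5.5], [5.0, 0.0], [5.5, 0.5]])
y1 = [0, 0, 1, 1]
X2 = np.array([[-5.0, -5.0], [-5.5, -4.5]])
y2 = [2, 2]

clf = IncrementalOneVsRestRLSC(n_features=2, classes=[0, 1, 2], lam=0.1)
clf.partial_fit(X1, y1)
w_first = clf.weights()
clf.partial_fit(X2, y2)
w_incremental = clf.weights()

# Training on all data at once gives the same weights.
batch = IncrementalOneVsRestRLSC(n_features=2, classes=[0, 1, 2], lam=0.1)
batch.partial_fit(np.vstack([X1, X2]), y1 + y2)
w_batch = batch.weights()

pred = clf.predict([[0.0, 5.0], [5.0, 0.0], [-5.0, -5.0]])
```

In this sketch the memory footprint depends on the number of features and
categories rather than on the number of examples seen, and adding a category
costs only one extra right-hand-side column.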