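\[
\begin{bmatrix} w \\ \gamma \end{bmatrix}
= \left( \frac{I}{\nu} + \alpha \cdot E^{\top}E + E_i^{\top}E_i - \alpha^{W} \cdot E_d^{\top}E_d \right)^{-1}
\cdot \left( \alpha \cdot E^{\top}D + E_i^{\top}D_i - \alpha^{W} \cdot E_d^{\top}D_d \right),
\quad \alpha \in [0, 1]
\tag{6}
\]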
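To make the update in equation (6) concrete, the following NumPy sketch keeps two running sums, one for the weighted E'E terms and one for the weighted E'D terms, and solves the small linear system after each increment. It is a minimal illustration under stated assumptions, not the implementation used in the experiments: the helper names (augment, decayed_update, solve, predict) are hypothetical, the class labels are assumed to be a vector of +1/-1 values, E is assumed to be the input matrix with a column of -1 appended, and the retired increment is assumed to have been decayed W times before it is subtracted.

```python
import numpy as np


def augment(A):
    """Return E = [A, -e]: the inputs with a column of -1 appended (assumed layout)."""
    return np.hstack([A, -np.ones((A.shape[0], 1))])


def decayed_update(M, v, A_new, d_new, alpha, A_old=None, d_old=None, W=None):
    """Apply one decayed increment (hypothetical helper).

    M accumulates the weighted E'E terms, v the weighted E'D terms.
    """
    E_new = augment(A_new)
    # Decay everything seen so far, then add the new increment at full weight.
    M = alpha * M + E_new.T @ E_new
    v = alpha * v + E_new.T @ d_new
    if A_old is not None:
        # Assumption: the retired increment has been decayed W times by now,
        # so alpha**W times its original contribution is what remains of it.
        E_old = augment(A_old)
        M = M - alpha**W * (E_old.T @ E_old)
        v = v - alpha**W * (E_old.T @ d_old)
    return M, v


def solve(M, v, nu):
    """Solve (I/nu + M) x = v and split x into (w, gamma)."""
    x = np.linalg.solve(np.eye(M.shape[0]) / nu + M, v)
    return x[:-1], x[-1]


def predict(A, w, gamma):
    """Linear decision rule sign(x'w - gamma)."""
    return np.sign(A @ w - gamma)
```

Because only an (n+1) x (n+1) matrix and an (n+1)-vector are stored, the cost of each update in this sketch depends on the number of features and the size of the increment, not on the total number of examples seen so far. Setting alpha = 1 and omitting the retired increment reduces it to a purely incremental update.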
4 Related Work
Syed et al. presented an approach for handling concept drift with SVMs [2].
Their approach trains on the available data and keeps only the resulting
support vectors as a representation of that data; the next (exact) training
step then uses the new data together with the previous support vectors.
Klinkenberg and Joachims presented an SVM method based on window adjustment
for detecting and handling concept drift [13]. Cauwenberghs and Poggio proposed
an incremental and decremental SVM method based on a different approximation
than the one we use [6].
5 Empirical Results
In order to test and compare our suggested decremental PSVM learning approach
with the existing window-based approach, we created synthetic binary
classification data sets with simulated concept drift. The examples were
created by sampling feature values from a multivariate normal distribution
whose covariance matrix is the identity matrix, Σ = I, and whose mean vector µ
slides linearly from all +1 values to all -1 values for the positive class,
and vice versa for the negative class [14], as shown in Algorithm 1.
Algorithm 1 simConceptDrift(nFeat, nSteps, nExPerStep, start)
Require: nFeat, nSteps, nExPerStep ∈ ℕ and start ∈ ℝ
Ensure: Linear stochastic drift in nSteps from start to -start
1: center = [start, . . . , start] {vector of length nFeat}
2: origcenter = center
3: for all step in {0, . . . , nSteps - 1} do
4:   for all synthExampleCount in {0, . . . , nExPerStep - 1} do
5:     sample example from multivar.gauss.dist with µ = center and σ² = 1
6:   end for
7:   center = origcenter · (1 - 2 · (step + 1)/(nSteps - 1)) {concept drift}
8: end for
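Algorithm 1 translates almost line by line into NumPy. The sketch below is one possible rendering, not the code used for the experiments: the function name sim_concept_drift, the optional rng argument, and the choice to return the per-step centers alongside the sampled batches are assumptions made for illustration. It assumes nSteps is at least 2, since the drift formula divides by nSteps - 1.

```python
import numpy as np


def sim_concept_drift(n_feat, n_steps, n_ex_per_step, start, rng=None):
    """Sample n_steps batches whose mean drifts linearly from start to -start.

    Returns (centers, batches): the mean used at each step, and a list of
    arrays of shape (n_ex_per_step, n_feat). Requires n_steps >= 2.
    """
    rng = np.random.default_rng() if rng is None else rng
    orig_center = np.full(n_feat, float(start))
    center = orig_center.copy()
    centers, batches = [], []
    for step in range(n_steps):
        centers.append(center.copy())
        # Identity covariance: each feature is an independent unit-variance normal.
        batches.append(rng.normal(loc=center, scale=1.0,
                                  size=(n_ex_per_step, n_feat)))
        # Concept drift: slide the mean towards -start for the next step.
        center = orig_center * (1 - 2 * (step + 1) / (n_steps - 1))
    return np.array(centers), batches
```

Calling the function once with start = +1 and once with start = -1 would give the positive and negative class respectively, following the description in the text. Because the mean is updated after sampling, the last batch is drawn around exactly -start.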
5.1 Classification Accuracy
For the small concept drift test (20000 examples with 10 features and 40
increments of 500 examples, Figure 2(a)), a weight decay coefficient of 0.1
performs slightly better in terms of unlearning than a window size of W = 5,
and a weight