A R
m×n
, D {-1, +1}
m×1
, e = 1
m×1
Fung and Mangasarian [13] replaced the inequality constraint in (1) with an
equality constraint. This changed the binary classification problem, because the
points in Fig. 1(b) are no longer bounded by the planes, but are clustered around
them. By solving the equation for y and inserting the result into the expression
to be minimized, one gets the following unconstrained optimization problem:
min
(w,)R
n+1+m
f (w, ) =
2
D(Aw - e) - e
2
+
1
2
(w w +
2
)
(2)
Setting
f =
f
w
,
f
= 0 one gets:
w
X
=
A A +
I
-A e
-e A
1
+ m
-1
A De
-e De
=
I
+ E E
-1
A
-1
E De
B
(3)
E = [A - e], E R
m×(n+1)
Agarwal has showed that the Proximal SVM is directly transferable to a ridge
regression expression [14]. Fung and Mangasarian [6] later showed that (3) can
be rewritten to handle increments (E
i
, d
i
) and decrements (E
d
, d
d
), as shown
in (4). This decremental approach is based on time windows.
X =
w
=
I
+ E E + (E
i
) E
i
- (E
d
) E
d
-1
E d + (E
i
) d
i
- (E
d
) d
d
,
(4)
where d = De
.
3
Parallelization of the Incremental Proximal SVM
The incremental capabilities of Proximal SVM allows us to efficiently calculate
increments ((E
i
) E
i
, (E
i
) d
i
) in parallel.
The increments have the dimension numfeatures
2
, compared to numfeatures·numexamples
for E (feature dimension will also be limited due to the curse of dimensional-
ity), in practice is numfeatures
2
<< numfeatures · numexamples. The fact that
these increments are relatively small makes them network-wise cheap to send to
a parent node for accumulation into (E E, E De), before finally calculating X
on the top node.
128
Parallel and Incremental PSVM Classifiers