3.3
Classification
Ugly Duckling Theorem
According to the Ugly Duckling Theorem, as presented by Duda et al. [2001], there
exists no problem-independent way of determining the best set of features. In
other words, no feature extraction or discretization method is optimal for all
problems.
3.3.6
Classifiers for Web Intelligence
Systems and services on the Internet continuously generate large amounts of log
data; for example, web servers generate usage logs with click-stream data
representing the behavior of online users. Classifiers that are to handle such
data need to be scalable and incremental.
The process of predicting behavior based on web usage logs can be considered
a classification problem. If the behavior is relatively predictable, the
predictions can be used to personalize web content in advance according to the
user's preferences, or to pre-fetch web content into the user's browser cache
for improved browsing performance. Why is this a classification problem? The
next click to an HTML page or multimedia object in a user session can be seen as
a class, and the previously visited pages or multimedia objects can be seen as a
feature vector. To achieve high accuracy in next-click classification, it has
been shown to be beneficial to exploit the inherent graph structure of web sites
and to have one classifier per web page (paper I).
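The framing above can be illustrated with a minimal sketch. It is not the method of paper I: the class name PerPageNextClick, the helper session_to_examples, the history length and the simple frequency counting are all assumptions chosen for illustration. Each click in a session becomes one training example whose class is the next page and whose feature vector is the pages visited just before the current one, and a separate table of counts is kept per web page.

```python
from collections import Counter, defaultdict


def session_to_examples(session, history=2):
    """Yield (current_page, features, next_page) for each click in a session.

    `history` (an assumed parameter) is the number of earlier pages used
    as the feature vector.
    """
    for i in range(len(session) - 1):
        current = session[i]
        # Features: up to `history` pages visited before the current page.
        features = tuple(session[max(0, i - history):i])
        yield current, features, session[i + 1]


class PerPageNextClick:
    """Hypothetical frequency-based next-click predictor, one table per page."""

    def __init__(self):
        # page -> feature vector -> counts of the pages clicked next
        self.by_context = defaultdict(lambda: defaultdict(Counter))
        # page -> counts of the pages clicked next, ignoring history
        self.by_page = defaultdict(Counter)

    def train(self, session):
        for page, features, next_page in session_to_examples(session):
            self.by_context[page][features][next_page] += 1
            self.by_page[page][next_page] += 1

    def predict(self, page, features=()):
        # Prefer counts for the exact history; otherwise fall back to the
        # counts for the page alone.
        counts = self.by_context[page].get(features) or self.by_page.get(page)
        if not counts:
            return None
        return counts.most_common(1)[0][0]


model = PerPageNextClick()
model.train(["/", "/news", "/sports"])
model.train(["/", "/news", "/sports"])
model.train(["/about", "/news", "/weather"])
print(model.predict("/news", ("/",)))
print(model.predict("/news", ("/about",)))
```

Keeping one table per page means that a prediction for a given page only consults statistics collected at that page, which is one simple way of using the link structure of the site.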
Requirements for Classifiers in Web Intelligence
Classifiers for Web Intelligence purposes need to have the following properties:
Scalable - Handle large amounts of data relatively fast by efficiently utilizing
computational resources (memory and CPUs). Both the training and the
classification process should be scalable.
Incremental - Support incremental (on-line) training, since there is not enough
time to re-train on old data every time new data arrives.
Accurate - Provide (in general) high accuracy on the selected problem.
Decremental - Handle drifting concepts, since concepts/classes in Web
Intelligence applications are not likely to be static over time.
Handle non-orthogonal examples - Non-orthogonality (i.e. repeated occurrences
of training data) is not handled well by some classifiers. In Web
Intelligence applications it is likely to occur and must be handled properly
by the classifier algorithm.
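Several of these properties can be illustrated together in a small sketch. It is only one possible reading of the requirements, not a classifier from the source: the class WindowedCountClassifier, its window parameter and the choice of a sliding window as the forgetting mechanism are assumptions made for illustration. Training is incremental (one example at a time), old examples are removed again once they fall out of the window (decremental), and repeated examples simply increase a count.

```python
from collections import Counter, defaultdict, deque


class WindowedCountClassifier:
    """Hypothetical count-based classifier over a sliding window of examples."""

    def __init__(self, window=1000):
        self.window = window                 # assumed number of examples to keep
        self.recent = deque()                # examples currently in the window
        self.counts = defaultdict(Counter)   # features -> label counts

    def learn(self, features, label):
        # Incremental step: update the counts with a single new example.
        self.counts[features][label] += 1
        self.recent.append((features, label))
        if len(self.recent) > self.window:
            # Decremental step: forget the oldest example.
            self.forget(*self.recent.popleft())

    def forget(self, features, label):
        labels = self.counts[features]
        labels[label] -= 1
        if labels[label] <= 0:
            del labels[label]
        if not labels:
            del self.counts[features]

    def predict(self, features):
        labels = self.counts.get(features)
        return labels.most_common(1)[0][0] if labels else None


clf = WindowedCountClassifier(window=3)
for _ in range(3):
    clf.learn(("home",), "news")      # repeated examples are just counted
for _ in range(3):
    clf.learn(("home",), "sports")    # the concept drifts to a new class
print(clf.predict(("home",)))
```

Memory use is bounded by the window size, and each update touches a single counter, so neither training nor classification requires a pass over old data.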