background image
3.3
Dynamic Network Topology
3.3.1
Gnutella
Gnutella have a mainly static peer network topology, i.e.
peers can disconnect or connect to the network, but they
rarely switch places or routing tables after the initial con-
nection the network.
3.3.2
Proposed Approach
Since each peer is calculating the proximity on incoming
messages versus cached messages, the peer is able to monitor
the resemblance of traffic from different neighbour peers. So
if two neighbour peers send similar types of queries to a peer,
it can ask the peers if they wan't each others addresses, and
if both reply yes the peers establish a new connection. The
reason for doing this is to try to cluster similar peers in terms
of number of hops to improve performance and quality of
recommendations.
Mobile
Customer
Assistant
Agent
Assistant
Agent
Mobile
Customer
Assistant
Agent
Mobile
Customer
Mobile
Customer
New!
Assistant
Agent
Figure 3: Dynamic Network Topology
3.4
Ranking methods
When a potentially large set of voting vectors has been re-
turned to querying node they have to be ranked in order to
get the best recommendations on top of the list before com-
municating to the mobile user (similar to Internet search
engines). The straightforward solution would be to use col-
laborative filtering (rank based on the calculation proximity
measures and then the neighborhood), this may sound like
the same as performing traditional collaborative filtering,
but the difference is that the voting vectors to be processed
is only a relatively small subset of all voting vectors for all
users of the system, this is because of the filtering done by
peers that processed the query.
An alternative approach could be to return only the recom-
mendation and the proximity measure, e.g. if the querying
voting vector, represented as with item : vote elements, is
[10 : 0, 22 : 2, 37 : 8] and the matching vector is [10 : 0, 22 :
2, 24 : 9, 37 : 8] then only the recommendation 24 : 9 to-
gether with the proximity measure will be returned, this is
more efficient both regarding bandwidth usage and a useful
preprocessing that makes ranking into a simple sorting job
(of proximity measures).
3.5
Query Compression scheme
If voting vectors are sparse, it is more efficient to represent
them as hash tables than vectors for internal processing, but
to get more efficient compression when sending the vector
between peers one can take advantage of the binary inter-
polative compression algorithm [7].
The binary interpolative compression algorithm was devel-
oped to efficiently compress inverse files used in document
indeces in information retrieval, it utilizes the difference and
context information between sorted numeric values for com-
pression. By requiring the data to have certain properties,
it can at best compress some items to 0 bytes (depending
on position, value and neighbour values in the array). In
order to work, the algorithm requires the data to be: nu-
meric, sorted and in a known numeric range. The index
positions representing items in the vector (and hash table
keys) fulfill these requirements (e.g. ISBN numbers repre-
senting books), and can therefore gain from using binary
interpolative compression.
3.5.1
Compression example:
Original query vector represented using a hash table (item:vote
pairs)
q = {24 : 5, 37 : 8, 45 : 0}
(3)
for item 24 with rating 5, item 37 with rating 8, and item 45
with rating 0 respectively. The compressed representation
is created with
IP olComp([24, 37, 45], 3 - 1, 24, 45) : 580
(4)
(i.e. compressing the list of items, and appending the votes
for items after the separating colon)
3.6
Implementation
The system is planned to be implemented using the java
programming language.
3.6.1
Feasible Performance?
Critical issues for high performance in peer-to-peer networks
are efficient handling of requests (i.e. processing and routing
time), and network bandwidth and topology [8].
Calculating similarities using a proximity measure such as
the cosine (2) has linear algorithmic running time with re-
spect to the vector length |q|. Routing based on comparing
an incoming query vector q with the set of cached vectors C
has a run time of O(|C||q|), this can potentially be reduced
by ordering the vectors in C based on proximities (when-
ever there are few messages to handle for the peer) and use
binary search to find the closest match(es) for q in C. For
each similar vector v in C q is forwarded in v's direction (i.e.
the neighbour peer which v was received from).
Network bandwidth for each peer is not easily changed, but
the topology can be changed with a relatively low cost since
it involves a simple interaction with neighbour peers about
allowing them to be introduced to each other. However, in
order to avoid the peer-to-peer topology to become a com-
plete graph, the dynamic topology heuristic should be able
76
P2P-based Recommendations for M-Commerce

<< - < - > - >>