Recent years have brought a dramatic increase in the popularity of collaborative Web 2.0 sites. According to recent evaluations, this phenomenon accounts for a large share of Internet traffic and significantly increases the load on the end-servers of Web 2.0 sites. In this paper, we show how collaborative classifications extracted from Web 2.0-like sites can be leveraged in the design of a self-organizing peer-to-peer network in order to distribute data in a scalable manner while preserving high content locality. We propose Affinity P2P (AP2P), a novel cluster-based, locality-aware, self-organizing peer-to-peer network. AP2P self-organizes to improve content locality, using a novel affinity-based metric to estimate the distance between clusters of nodes that share similar content. Searches in AP2P are directed to the cluster of interest, where a logarithmic-time parallel flooding algorithm provides high recall, low latency, and low communication overhead. The order of clusters is periodically changed by a greedy cluster placement algorithm, which reorganizes clusters based on affinity in order to increase the locality of related content. Experimental and analytical results demonstrate that the locality-aware, cluster-based organization of content offers substantial benefits, achieving an average latency improvement of 45% and an increase of up to 12% in search recall.
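To make the idea of affinity-based greedy cluster placement concrete, the following is a minimal illustrative sketch and not the AP2P algorithm itself. It assumes, purely for illustration, that each cluster is described by a set of collaborative tags, that affinity can be approximated by the Jaccard similarity of two tag sets, and that clusters are placed in a linear order by repeatedly appending the unplaced cluster most similar to the last placed one. The names `affinity`, `greedy_placement`, and the example clusters are hypothetical.

```python
def affinity(tags_a, tags_b):
    """Jaccard similarity of two tag sets (an assumed stand-in for the affinity metric)."""
    if not tags_a and not tags_b:
        return 0.0
    return len(tags_a & tags_b) / len(tags_a | tags_b)


def greedy_placement(clusters):
    """Greedily order clusters so that similar clusters end up adjacent.

    clusters: dict mapping a cluster name to its set of tags.
    Returns a list of cluster names in placement order.
    """
    if not clusters:
        return []
    remaining = dict(clusters)
    # Start from the lexicographically first name so the result is deterministic.
    current = min(remaining)
    order = [current]
    current_tags = remaining.pop(current)
    while remaining:
        # Choose the unplaced cluster with the highest affinity to the last
        # placed cluster; iterating in sorted order breaks ties by name.
        best = max(sorted(remaining),
                   key=lambda name: affinity(current_tags, remaining[name]))
        order.append(best)
        current_tags = remaining.pop(best)
    return order


# Hypothetical clusters described by collaborative tags.
clusters = {
    "jazz": {"music", "jazz", "live"},
    "rock": {"music", "rock", "live"},
    "cooking": {"food", "recipes"},
    "baking": {"food", "recipes", "bread"},
}
order = greedy_placement(clusters)
print(order)
```

In this toy input the two food-related clusters and the two music-related clusters end up next to each other, which is the kind of locality of related content that a placement step of this sort aims for. A nearest-neighbour chain is only one possible greedy strategy, chosen here for brevity.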