Real datasets for file-sharing peer-to-peer systems

Shen Tat Goh*, Panos Kalnis, Spiridon Bakiras, Kian Lee Tan

*Corresponding author for this work

Research output: Contribution to journal, Conference article, peer-review

13 Scopus citations

Abstract

The fundamental drawback of unstructured peer-to-peer (P2P) networks is the flooding-based query processing protocol, which seriously limits their scalability. As a result, a significant amount of research has focused on designing efficient search protocols that reduce the overall communication cost. What is lacking, however, is real data on the exact content of users' libraries and the queries that these users ask. Trace-driven simulations generate more meaningful results and better illustrate the efficiency of a generic query processing protocol under a real-life scenario. Motivated by this fact, we developed a Gnutella-style probe and collected detailed data over a period of two months. The data involve around 4,500 users and contain the exact files shared by each user, together with any available metadata (e.g., artist for songs) and information about the nodes (e.g., connection speed). We also collected the queries initiated by these users. After filtering, the data were organized in XML format and are available to researchers. Here, we analyze this dataset and present its statistical characteristics. Additionally, as a case study, we employ it to evaluate two recently proposed P2P searching techniques.

Original language: English (US)
Pages (from-to): 201-213
Number of pages: 13
Journal: Lecture Notes in Computer Science
Volume: 3453
State: Published - 2005
Externally published: Yes
Event: 10th International Conference on Database Systems for Advanced Applications, DASFAA 2005 - Beijing, China
Duration: Apr 17 2005 to Apr 20 2005

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
