Abstract
Community detection in graphs is widely used in social and biological networks, and the stochastic block model is a powerful probabilistic tool for describing graphs with community structures. However, in the era of "big data," traditional inference algorithms for such a model are increasingly limited due to their high time complexity and poor scalability. In this paper, we propose a multi-stage maximum likelihood approach to recover the latent parameters of the stochastic block model, in time linear with respect to the number of edges. We also propose a parallel algorithm based on message passing. Our algorithm can overlap communication and computation, providing speedup without compromising accuracy as the number of processors grows. For example, to process a real-world graph with about 1.3 million nodes and 10 million edges, our algorithm requires about 6 seconds on 64 cores of a contemporary commodity Linux cluster. Experiments demonstrate that the algorithm can produce high quality results on both benchmark and real-world graphs. An example of finding more meaningful communities is illustrated consequently in comparison with a popular modularity maximization algorithm.
Original language | English (US) |
---|---|
Title of host publication | IJCAI 2015 - Proceedings of the 24th International Joint Conference on Artificial Intelligence |
Publisher | International Joint Conferences on Artificial Intelligence |
Pages | 2090-2096 |
Number of pages | 7 |
Volume | 2015-January |
ISBN (Electronic) | 9781577357384 |
State | Published - 2015 |
Event | 24th International Joint Conference on Artificial Intelligence, IJCAI 2015 - Buenos Aires, Argentina Duration: Jul 25 2015 → Jul 31 2015 |
Other
Other | 24th International Joint Conference on Artificial Intelligence, IJCAI 2015 |
---|---|
Country/Territory | Argentina |
City | Buenos Aires |
Period | 07/25/15 → 07/31/15 |
ASJC Scopus subject areas
- Artificial Intelligence