TY - GEN
T1 - A benchmark for betweenness centrality approximation algorithms on large graphs
AU - AlGhamdi, Ziyad
AU - Skiadopoulos, Spiros
AU - Jamour, Fuad
AU - Kalnis, Panos
N1 - Publisher Copyright:
© 2017 ACM.
PY - 2017/6/27
Y1 - 2017/6/27
N2 - Betweenness centrality quantifies the importance of graph nodes in a variety of applications including social, biological and communication networks. Its computation is very costly for large graphs; therefore, many approximate methods have been proposed. Given the lack of a golden standard, the accuracy of most approximate methods is evaluated on tiny graphs and is not guaranteed to be representative of realistic datasets that are orders of magnitude larger. In this paper, we develop BeBeCA, a benchmark for betweenness centrality approximation methods on large graphs. Specifically: (i) We generate a golden standard by deploying a parallel implementation of Brandes algorithm using 96,000 CPU cores on a supercomputer to compute exact betweenness centrality values for several large graphs with up to 126M edges. (ii) We propose an evaluation methodology to assess various aspects of approximation accuracy, such as average error and quality of node ranking. (iii) We survey a large number of existing approximation methods and compare their performance and accuracy using our benchmark. (iv) We publicly share our benchmark, which includes the golden standard exact betweenness centrality values together with the scripts that implement our evaluation methodology; for researchers to compare their own algorithms and practitioners to select the appropriate algorithm for their application and data.
AB - Betweenness centrality quantifies the importance of graph nodes in a variety of applications including social, biological and communication networks. Its computation is very costly for large graphs; therefore, many approximate methods have been proposed. Given the lack of a golden standard, the accuracy of most approximate methods is evaluated on tiny graphs and is not guaranteed to be representative of realistic datasets that are orders of magnitude larger. In this paper, we develop BeBeCA, a benchmark for betweenness centrality approximation methods on large graphs. Specifically: (i) We generate a golden standard by deploying a parallel implementation of Brandes algorithm using 96,000 CPU cores on a supercomputer to compute exact betweenness centrality values for several large graphs with up to 126M edges. (ii) We propose an evaluation methodology to assess various aspects of approximation accuracy, such as average error and quality of node ranking. (iii) We survey a large number of existing approximation methods and compare their performance and accuracy using our benchmark. (iv) We publicly share our benchmark, which includes the golden standard exact betweenness centrality values together with the scripts that implement our evaluation methodology; for researchers to compare their own algorithms and practitioners to select the appropriate algorithm for their application and data.
KW - Approximation algorithms
KW - Betweenness centrality
KW - Experimental evaluation
KW - Social networks
UR - http://www.scopus.com/inward/record.url?scp=85025709255&partnerID=8YFLogxK
U2 - 10.1145/3085504.3085510
DO - 10.1145/3085504.3085510
M3 - Conference contribution
AN - SCOPUS:85025709255
T3 - ACM International Conference Proceeding Series
BT - SSDBM 2017
PB - Association for Computing Machinery
T2 - 29th International Conference on Scientific and Statistical Database Management, SSDBM 2017
Y2 - 27 June 2017 through 29 June 2017
ER -