TY - JOUR
T1 - SNOD: a fast sampling method of exploring node orbit degrees for large graphs
AU - Wang, Pinghui
AU - Zhao, Junzhou
AU - Zhang, Xiangliang
AU - Tao, Jing
AU - Guan, Xiaohong
N1 - KAUST Repository Item: Exported on 2020-10-01
Acknowledgements: The authors wish to thank the anonymous reviewers for their helpful feedback. The research presented in this paper is supported in part by the National Key R&D Program of China (2018YFC0830500), the National Natural Science Foundation of China (U1301254, 61603290, 61602371), the Ministry of Education & China Mobile Research Fund (MCM20160311), the 111 International Collaboration Program of China, the China Postdoctoral Science Foundation (2015M582663), the Shenzhen Basic Research Grant (JCYJ20160229195940462, JCYJ20170816100819428), and the Natural Science Basic Research Plan in Shaanxi Province of China (2016JQ6034).
PY - 2018/12/13
Y1 - 2018/12/13
AB - Exploring small connected and induced subgraph patterns (CIS patterns, or graphlets) has recently attracted considerable attention. Despite recent efforts on computing how frequently a graphlet appears in a large graph (i.e., the total number of CISes isomorphic to the graphlet), little effort has been made to characterize a node’s graphlet orbit degree, i.e., the number of CISes isomorphic to the graphlet that touch the node at a particular orbit. This is an important fine-grained metric for analyzing complex networks, for example when learning the functions or roles of nodes in social and biological networks. Like global graphlet counting, computing node orbit degrees for a large graph is computationally intensive. Furthermore, previous methods of computing global graphlet counts are not suited to this problem. In this paper, we propose a novel sampling method, SNOD, to efficiently estimate node orbit degrees for large-scale graphs and quantify the error of our estimates. To the best of our knowledge, we are the first to study this problem and give a fast, scalable solution. We conduct experiments on a variety of real-world datasets and demonstrate that SNOD is several orders of magnitude faster than state-of-the-art enumeration methods for accurately estimating node orbit degrees for graphs with millions of edges.
UR - http://hdl.handle.net/10754/630298
UR - http://link.springer.com/article/10.1007/s10115-018-1301-z
UR - http://www.scopus.com/inward/record.url?scp=85058628619&partnerID=8YFLogxK
U2 - 10.1007/s10115-018-1301-z
DO - 10.1007/s10115-018-1301-z
M3 - Article
SN - 0219-1377
VL - 61
SP - 301
EP - 326
JO - Knowledge and Information Systems
JF - Knowledge and Information Systems
IS - 1
ER -