TY - JOUR
T1 - Incremental Frequent Subgraph Mining on Large Evolving Graphs
AU - Abdelhamid, Ehab
AU - Canim, Mustafa
AU - Sadoghi, Mohammad
AU - Bhatta, Bishwaranjan
AU - Chang, Yuan-Chi
AU - Kalnis, Panos
N1 - KAUST Repository Item: Exported on 2020-04-23
PY - 2017/8/22
Y1 - 2017/8/22
AB - Frequent subgraph mining is a core graph operation used in many domains, such as graph data management, knowledge exploration, bioinformatics, and security. Most existing techniques target static graphs. However, modern applications, such as social networks, utilize large evolving graphs. Mining these graphs using existing techniques is infeasible due to the high computational cost. In this paper, we propose IncGM+, a fast incremental approach for the continuous frequent subgraph mining problem on a single large evolving graph. We adapt the notion of “fringe” to the graph context, that is, the set of subgraphs on the border between frequent and infrequent subgraphs. IncGM+ maintains fringe subgraphs and exploits them to prune the search space. To boost efficiency, we propose an index structure that maintains selected embeddings with minimal memory overhead. These embeddings are used to avoid redundant, expensive subgraph isomorphism operations. Moreover, the proposed system supports batch updates. Using large real-world graphs, we experimentally verify that IncGM+ outperforms existing methods by up to three orders of magnitude, scales to much larger graphs, and consumes less memory.
UR - http://hdl.handle.net/10754/625837
UR - http://ieeexplore.ieee.org/document/8014497/
UR - http://www.scopus.com/inward/record.url?scp=85028516257&partnerID=8YFLogxK
U2 - 10.1109/TKDE.2017.2743075
DO - 10.1109/TKDE.2017.2743075
M3 - Article
SN - 1041-4347
VL - 29
SP - 2710
EP - 2723
JO - IEEE Transactions on Knowledge and Data Engineering
JF - IEEE Transactions on Knowledge and Data Engineering
IS - 12
ER -