TY - GEN
T1 - Pivoted subgraph isomorphism
T2 - 22nd International Conference on Extending Database Technology, EDBT 2019
AU - Abdelhamid, Ehab
AU - Abdelaziz, Ibrahim
AU - Khayyat, Zuhair
AU - Kalnis, Panos
N1 - Publisher Copyright:
© 2019 Copyright held by the owner/author(s).
PY - 2019
Y1 - 2019
N2 - Given query graph Q with pivot node v, Pivoted Subgraph Isomorphism (PSI) finds all distinct nodes in an input graph G that correspond to v in all matches of Q in G. PSI is a core operation in many applications such as frequent subgraph mining, protein functionality prediction and in-network node similarity. Existing applications implement PSI as an instance of the general subgraph isomorphism algorithm, which is expensive and under-optimized. As a result, these applications perform poorly and do not scale to large graphs. In this paper, we propose SmartPSI; a system to efficiently evaluate PSI queries. We develop two algorithms, called optimistic and pessimistic, each tailored for different instances of the problem. Based on machine learning, SmartPSI builds on-the-fly a classifier to decide which of the two algorithms is appropriate for evaluating each graph node. SmartPSI also implements a machine learning-based optimizer to generate low-cost execution plans. Our experimental evaluation with large-scale real graphs shows that SmartPSI outperforms existing approaches by up to two orders of magnitude and is able to process significantly larger graphs. Moreover, SmartPSI is shown to achieve up to 6 times performance improvement when it replaces standard subgraph isomorphism in the state-of-the-art distributed frequent subgraph mining system.
AB - Given query graph Q with pivot node v, Pivoted Subgraph Isomorphism (PSI) finds all distinct nodes in an input graph G that correspond to v in all matches of Q in G. PSI is a core operation in many applications such as frequent subgraph mining, protein functionality prediction and in-network node similarity. Existing applications implement PSI as an instance of the general subgraph isomorphism algorithm, which is expensive and under-optimized. As a result, these applications perform poorly and do not scale to large graphs. In this paper, we propose SmartPSI; a system to efficiently evaluate PSI queries. We develop two algorithms, called optimistic and pessimistic, each tailored for different instances of the problem. Based on machine learning, SmartPSI builds on-the-fly a classifier to decide which of the two algorithms is appropriate for evaluating each graph node. SmartPSI also implements a machine learning-based optimizer to generate low-cost execution plans. Our experimental evaluation with large-scale real graphs shows that SmartPSI outperforms existing approaches by up to two orders of magnitude and is able to process significantly larger graphs. Moreover, SmartPSI is shown to achieve up to 6 times performance improvement when it replaces standard subgraph isomorphism in the state-of-the-art distributed frequent subgraph mining system.
UR - http://www.scopus.com/inward/record.url?scp=85064887321&partnerID=8YFLogxK
U2 - 10.5441/002/edbt.2019.32
DO - 10.5441/002/edbt.2019.32
M3 - Conference contribution
AN - SCOPUS:85064887321
T3 - Advances in Database Technology - EDBT
SP - 361
EP - 372
BT - Advances in Database Technology - EDBT 2019
A2 - Binnig, Carsten
A2 - Fundulaki, Irini
A2 - Reinwald, Berthold
A2 - Kaoudi, Zoi
A2 - Galhardas, Helena
A2 - Herschel, Melanie
PB - OpenProceedings.org, University of Konstanz, University Library
Y2 - 26 March 2019 through 29 March 2019
ER -