TY - GEN
T1 - DeepAM: Deep semantic address representation for address matching
AU - Shan, Shuangli
AU - Li, Zhixu
AU - Yang, Qiang
AU - Liu, An
AU - Xu, Jiajie
AU - Chen, Zhigang
N1 - KAUST Repository Item: Exported on 2020-10-01
Acknowledgements: This research is partially supported by National Natural Science Foundation of China (Grant No. 61632016, 61572336, 61572335, 61772356), the Natural Science Research Project of Jiangsu Higher Education Institution (No. 17KJA520003, 18KJA520010), and the Open Program of Neusoft Corporation (No. SKLSAOP1801).
PY - 2019/07/18
Y1 - 2019/07/18
N2 - Address matching, which aims to identify addresses in address databases that refer to the same location, is a crucial task in location-based businesses such as take-out services and express delivery. The task is challenging because the address of a location can be expressed in many ways, especially in Chinese. Traditional address matching approaches, which rely on string similarities and learned matching rules, can hardly handle addresses with redundant, incomplete, or unusual expressions. In this paper, we propose to map every address into a fixed-size vector in a shared vector space using state-of-the-art deep sentence representation techniques and then measure the semantic similarity between addresses in this vector space. An attention mechanism is also applied to the model to highlight important features of addresses in their semantic representations. Finally, we propose a novel way to obtain rich contexts for addresses from the web through web search engines, which can strongly enrich the semantic meaning of addresses that can be learned. Our empirical study on two real-world address datasets demonstrates that our approach improves both the precision (by up to 5%) and the recall (by up to 8%) of existing state-of-the-art methods.
AB - Address matching, which aims to identify addresses in address databases that refer to the same location, is a crucial task in location-based businesses such as take-out services and express delivery. The task is challenging because the address of a location can be expressed in many ways, especially in Chinese. Traditional address matching approaches, which rely on string similarities and learned matching rules, can hardly handle addresses with redundant, incomplete, or unusual expressions. In this paper, we propose to map every address into a fixed-size vector in a shared vector space using state-of-the-art deep sentence representation techniques and then measure the semantic similarity between addresses in this vector space. An attention mechanism is also applied to the model to highlight important features of addresses in their semantic representations. Finally, we propose a novel way to obtain rich contexts for addresses from the web through web search engines, which can strongly enrich the semantic meaning of addresses that can be learned. Our empirical study on two real-world address datasets demonstrates that our approach improves both the precision (by up to 5%) and the recall (by up to 8%) of existing state-of-the-art methods.
UR - http://hdl.handle.net/10754/656842
UR - http://link.springer.com/10.1007/978-3-030-26072-9_4
UR - http://www.scopus.com/inward/record.url?scp=85070012035&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-26072-9_4
DO - 10.1007/978-3-030-26072-9_4
M3 - Conference contribution
SN - 9783030260712
SP - 45
EP - 60
BT - Web and Big Data
PB - Springer International Publishing
ER -