TY - GEN
T1 - TRIP: An interactive retrieving-inferring data imputation approach
AU - Li, Zhixu
AU - Qin, Lu
AU - Cheng, Hong
AU - Zhang, Xiangliang
AU - Zhou, Xiaofang
N1 - KAUST Repository Item: Exported on 2020-10-01
PY - 2016/6/25
Y1 - 2016/6/25
N2 - Data imputation aims at filling in missing attribute values in databases. Existing imputation approaches to non-quantitative string data can be roughly put into two categories: (1) inferring-based approaches [2], and (2) retrieving-based approaches [1]. Specifically, the inferring-based approaches find substitutes or estimations for the missing values from the complete part of the data set. However, they typically fall short in filling in unique missing attribute values which do not exist in the complete part of the data set [1]. The retrieving-based approaches resort to external resources for help by formulating proper web search queries to retrieve web pages containing the missing values from the Web, and then extracting the missing values from the retrieved web pages [1]. This web-based retrieving approach reaches a high imputation precision and recall, but on the other hand, issues a large number of web search queries, which brings a large overhead [1]. © 2016 IEEE.
AB - Data imputation aims at filling in missing attribute values in databases. Existing imputation approaches to non-quantitative string data can be roughly put into two categories: (1) inferring-based approaches [2], and (2) retrieving-based approaches [1]. Specifically, the inferring-based approaches find substitutes or estimations for the missing values from the complete part of the data set. However, they typically fall short in filling in unique missing attribute values which do not exist in the complete part of the data set [1]. The retrieving-based approaches resort to external resources for help by formulating proper web search queries to retrieve web pages containing the missing values from the Web, and then extracting the missing values from the retrieved web pages [1]. This web-based retrieving approach reaches a high imputation precision and recall, but on the other hand, issues a large number of web search queries, which brings a large overhead [1]. © 2016 IEEE.
UR - http://hdl.handle.net/10754/621293
UR - http://ieeexplore.ieee.org/document/7498375/
UR - http://www.scopus.com/inward/record.url?scp=84980325571&partnerID=8YFLogxK
U2 - 10.1109/ICDE.2016.7498375
DO - 10.1109/ICDE.2016.7498375
M3 - Conference contribution
SN - 9781509020201
SP - 1462
EP - 1463
BT - 2016 IEEE 32nd International Conference on Data Engineering (ICDE)
PB - Institute of Electrical and Electronics Engineers (IEEE)
ER -