TY - JOUR
T1 - KDE-Track: An Efficient Dynamic Density Estimator for Data Streams
AU - Qahtan, Abdulhakim Ali Ali
AU - Wang, Suojin
AU - Zhang, Xiangliang
N1 - KAUST Repository Item: Exported on 2020-10-01
PY - 2016/11/8
Y1 - 2016/11/8
N2 - Recent developments in sensors, global positioning system devices and smart phones have increased the availability of spatiotemporal data streams. Developing models for mining such streams is challenged by the huge amount of data that cannot be stored in the memory, the high arrival speed and the dynamic changes in the data distribution. Density estimation is an important technique in stream mining for a wide variety of applications. The construction of kernel density estimators is well studied and documented. However, existing techniques are either expensive or inaccurate and unable to capture the changes in the data distribution. In this paper, we present a method called KDE-Track to estimate the density of spatiotemporal data streams. KDE-Track can efficiently estimate the density function with linear time complexity using interpolation on a kernel model, which is incrementally updated upon the arrival of new samples from the stream. We also propose an accurate and efficient method for selecting the bandwidth value for the kernel density estimator, which increases its accuracy significantly. Both theoretical analysis and experimental validation show that KDE-Track outperforms a set of baseline methods on the estimation accuracy and computing time of complex density structures in data streams.
AB - Recent developments in sensors, global positioning system devices and smart phones have increased the availability of spatiotemporal data streams. Developing models for mining such streams is challenged by the huge amount of data that cannot be stored in the memory, the high arrival speed and the dynamic changes in the data distribution. Density estimation is an important technique in stream mining for a wide variety of applications. The construction of kernel density estimators is well studied and documented. However, existing techniques are either expensive or inaccurate and unable to capture the changes in the data distribution. In this paper, we present a method called KDE-Track to estimate the density of spatiotemporal data streams. KDE-Track can efficiently estimate the density function with linear time complexity using interpolation on a kernel model, which is incrementally updated upon the arrival of new samples from the stream. We also propose an accurate and efficient method for selecting the bandwidth value for the kernel density estimator, which increases its accuracy significantly. Both theoretical analysis and experimental validation show that KDE-Track outperforms a set of baseline methods on the estimation accuracy and computing time of complex density structures in data streams.
UR - http://hdl.handle.net/10754/621861
UR - http://ieeexplore.ieee.org/document/7738463/
UR - http://www.scopus.com/inward/record.url?scp=85012300533&partnerID=8YFLogxK
U2 - 10.1109/TKDE.2016.2626441
DO - 10.1109/TKDE.2016.2626441
M3 - Article
SN - 1041-4347
VL - 29
SP - 642
EP - 655
JO - IEEE Transactions on Knowledge and Data Engineering
JF - IEEE Transactions on Knowledge and Data Engineering
IS - 3
ER -