TY - CHAP
T1 - Error-Bounded Approximation of Data Stream: Methods and Theories
AU - Xie, Qing
AU - Pang, Chaoyi
AU - Zhou, Xiaofang
AU - Zhang, Xiangliang
AU - Deng, Ke
N1 - KAUST Repository Item: Exported on 2020-10-01
Acknowledgements: This work is partially supported by Natural Science Foundation of China (Grant No. 61602353), Natural Science Foundation of Hubei Province (Grant No. 2017CFB505), and the Fundamental Research Funds for the Central Universities (Grant No. WUT:2017IVA053 and WUT:2017IVB028).
PY - 2018/7/29
Y1 - 2018/7/29
N2 - Since the development of sensor network and Internet of Things, the volume of data is rapidly increasing and the streaming data has attracted much attention recently. To efficiently process and explore data streams, the compact data representation is playing an important role, since the data approximations other than the original data items are usually applied in many stream mining tasks, such as clustering, classification, and correlation analysis. In this chapter, we focus on the maximum error-bounded approximation of data stream, which represents the streaming data with constrained approximation error on each data point. There are two criteria for the approximation solution: self-adaption over time for varied error bound and real-time processing. We reviewed the existing data approximation techniques and summarized some essential theories such as optimization guarantee. Two optimal linear-time algorithms are introduced to construct error-bounded piecewise linear representation for data stream. One generates the line segments by data convex analysis, and the other one is based on the transformed space, which can be extended to a general model. We theoretically analyzed and compared these two different spaces, and proved the theoretical equivalence between them, as well as the two algorithms.
AB - Since the development of sensor network and Internet of Things, the volume of data is rapidly increasing and the streaming data has attracted much attention recently. To efficiently process and explore data streams, the compact data representation is playing an important role, since the data approximations other than the original data items are usually applied in many stream mining tasks, such as clustering, classification, and correlation analysis. In this chapter, we focus on the maximum error-bounded approximation of data stream, which represents the streaming data with constrained approximation error on each data point. There are two criteria for the approximation solution: self-adaption over time for varied error bound and real-time processing. We reviewed the existing data approximation techniques and summarized some essential theories such as optimization guarantee. Two optimal linear-time algorithms are introduced to construct error-bounded piecewise linear representation for data stream. One generates the line segments by data convex analysis, and the other one is based on the transformed space, which can be extended to a general model. We theoretically analyzed and compared these two different spaces, and proved the theoretical equivalence between them, as well as the two algorithms.
UR - http://hdl.handle.net/10754/628270
UR - https://link.springer.com/chapter/10.1007%2F978-3-319-89803-2_5
U2 - 10.1007/978-3-319-89803-2_5
DO - 10.1007/978-3-319-89803-2_5
M3 - Chapter
SN - 9783319898025
SP - 93
EP - 122
BT - Learning from Data Streams in Evolving Environments
PB - Springer Nature
ER -