Error-Bounded Approximation of Data Stream: Methods and Theories

Qing Xie, Chaoyi Pang, Xiaofang Zhou, Xiangliang Zhang, Ke Deng

Research output: Chapter in Book/Report/Conference proceedingChapter

Abstract

Since the development of sensor network and Internet of Things, the volume of data is rapidly increasing and the streaming data has attracted much attention recently. To efficiently process and explore data streams, the compact data representation is playing an important role, since the data approximations other than the original data items are usually applied in many stream mining tasks, such as clustering, classification, and correlation analysis. In this chapter, we focus on the maximum error-bounded approximation of data stream, which represents the streaming data with constrained approximation error on each data point. There are two criteria for the approximation solution: self-adaption over time for varied error bound and real-time processing. We reviewed the existing data approximation techniques and summarized some essential theories such as optimization guarantee. Two optimal linear-time algorithms are introduced to construct error-bounded piecewise linear representation for data stream. One generates the line segments by data convex analysis, and the other one is based on the transformed space, which can be extended to a general model. We theoretically analyzed and compared these two different spaces, and proved the theoretical equivalence between them, as well as the two algorithms.
Original languageEnglish (US)
Title of host publicationLearning from Data Streams in Evolving Environments
PublisherSpringer Nature
Pages93-122
Number of pages30
ISBN (Print)9783319898025
DOIs
StatePublished - Jul 29 2018

Fingerprint

Dive into the research topics of 'Error-Bounded Approximation of Data Stream: Methods and Theories'. Together they form a unique fingerprint.

Cite this