Frequent Sets Mining in Data Stream Environments

Frequent Sets Mining in Data Stream Environments

Author: 
Dang, Xuan Hong
Place: 
Hershey
Publisher: 
IGI Global
Date published: 
2008
Responsibility: 
Ng, Wee-Keong, jt.author
Ong, Kok-Leong, jt.author
Editor: 
Wang, John
Journal Title: 
Encyclopedia of Data Warehousing and Mining, Second Edition
Source: 
Encyclopedia of Data Warehousing and Mining, Second Edition
Subject: 
Abstract: 

In recent years, data streams have emerged as a new data type that has attracted much attention from the data mining community. They arise naturally in a number of applications (Brian et al., 2002), including financial service (stock ticker, financial monitoring), sensor networks (earth sensing satellites, astronomic observations), web tracking and personalization (webclick streams). These stream applications share three distinguishing characteristics that limit the applicability of most traditional mining algorithms (Minos et al., 2002; Pedro and Geoff, 2001): (1) the continuous arrival rate of the stream is high and unpredictable; (2) the volume of data is unbounded, making it impractical to store the entire content of the stream; (3) in terms of practical applicability, stream mining results are often expected to be closely approximated the exact results as well as to be available at any time. Consequently, the main challenge in mining data streams is to develop effective algorithms that support the processing of stream data in one-pass manner (preferably on-line) whilst operating under system resources limitations (e.g., memory space, CPU cycles or bandwidth). This chapter discusses the above challenge in the context of finding frequent sets from transactional data streams. The problems will be presented and some effective methods, both from deterministic and probabilistic approaches, are reviewed in details. The tradeoffs between memory space and accuracy of mining results are also discussed. Furthermore, the problems will be considered in three fundamental mining models for stream environments: landmark window, forgetful window and sliding window models.

CITATION: Dang, Xuan Hong. Frequent Sets Mining in Data Stream Environments edited by Wang, John . Hershey : IGI Global , 2008. Encyclopedia of Data Warehousing and Mining, Second Edition - Available at: https://library.au.int/frequent-sets-mining-data-stream-environments