In this paper we present SPADE, a new algorithm for fast discovery of sequential patterns. Existing solutions to this problem make repeated database scans and use complex hash structures that have poor locality. SPADE uses combinatorial properties to decompose the original problem into smaller sub-problems that can be solved independently in main memory using efficient lattice search techniques and simple join operations. All sequences are discovered in only three database scans. Experiments show that SPADE outperforms the best previous algorithm by a factor of two, and by an order of magnitude with some pre-processed data. It also scales linearly with the number of input sequences and with a number of other database parameters. Finally, we discuss how the results of sequence mining can be applied in a real application domain.
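The abstract does not spell out what the "simple join operations" look like. As a rough illustration only, the sketch below assumes a vertical layout in which each item keeps a list of (sequence id, event id) pairs, and joins two such lists to count how many input sequences contain one item followed later by another. The row format, the function names (`build_idlists`, `temporal_join`, `support`) and the toy data are all assumptions made for this example. It is not the paper's implementation and leaves out the lattice decomposition entirely.

```python
from collections import defaultdict


def build_idlists(rows):
    """Map each item to a sorted list of (sid, eid) pairs where it occurs.

    `rows` is an iterable of (sid, eid, items) tuples; this layout is an
    assumption made for the example, not a format taken from the paper.
    """
    idlists = defaultdict(list)
    for sid, eid, items in rows:
        for item in items:
            idlists[item].append((sid, eid))
    return {item: sorted(pairs) for item, pairs in idlists.items()}


def temporal_join(idlist_a, idlist_b):
    """Return the id-list for 'a, then later b' within the same sequence."""
    # Earliest event id of `a` in each sequence: `b` only has to come after it.
    first_a = {}
    for sid, eid in idlist_a:
        first_a[sid] = min(eid, first_a.get(sid, eid))
    return [(sid, eid) for sid, eid in idlist_b
            if sid in first_a and first_a[sid] < eid]


def support(idlist):
    """Number of distinct input sequences that appear in an id-list."""
    return len({sid for sid, _ in idlist})


# Hypothetical toy database: (sequence id, event id, items in that event).
rows = [
    (1, 10, {"A", "B"}),
    (1, 20, {"B"}),
    (1, 30, {"C"}),
    (2, 10, {"B"}),
    (2, 20, {"A"}),
    (3, 10, {"A"}),
    (3, 20, {"C"}),
]

idlists = build_idlists(rows)
a_then_b = temporal_join(idlists["A"], idlists["B"])
a_then_c = temporal_join(idlists["A"], idlists["C"])
print(support(a_then_b), support(a_then_c))
```

In this sketch the support of a two-item sequence is obtained from the two item lists alone, without rescanning the rows, which is the sense in which a join can stand in for a database scan.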
References in zbMATH (referenced in 89 articles, 1 standard article)
Showing results 81 to 89 of 89.
- Sun, Xingzhi; Orlowska, Maria E.; Zhou, Xiaofang: Finding event-oriented patterns in long temporal sequences (2003)
- Sy, Bon K.: Discovering association patterns based on mutual information (2003)
- Lin, Ming-Yen; Lee, Suh-Yin: Fast discovery of sequential patterns by memory indexing (2002)
- Lin, Ming-Yen; Lee, Suh-Yin; Wang, Sheng-Shun: DELISP: Efficient discovery of generalized sequential patterns by delimited pattern-growth technology (2002)
- Punin, John R.; Krishnamoorthy, Mukkai S.; Zaki, Mohammed J.: LOGML: Log markup language for web usage mining (2002)
- Berlekamp, Elwyn R.; Conway, John H.; Guy, Richard K.: Winning ways for your mathematical plays. Vol. 1. (2001)
- Höppner, Frank; Klawonn, Frank: Finding informative rules in interval sequences (2001)
- Lieverse, Paul; Van der Wolf, Pieter; Vissers, Kees; Deprettere, Ed: A methodology for architecture exploration of heterogeneous signal processing systems (2001)
- Zaki, Mohammed J.: SPADE: An efficient algorithm for mining frequent sequences (2001)