Generic Pattern Mining via Data Mining Template Library

Nilanjana De

Feng Gao

Paolo Palmerini

Nagender Parimi

Jeevan Pathuri

Benjarath Phoophakdee

Joe Urban

Mohammed J. Zaki

Frequent Pattern Mining (FPM) is a powerful paradigm for mining informative and useful patterns in massive, complex datasets. In this paper we propose the Data Mining Template Library (DMTL), a collection of generic containers and algorithms for data mining, together with persistency and database management classes. DMTL provides a systematic solution to a whole class of common FPM tasks, such as itemset, sequence, tree, and graph mining. DMTL is extensible and scalable, and delivers high performance for rapid response on massive datasets. A detailed set of experiments shows that DMTL is competitive with special-purpose algorithms designed for a particular pattern type, especially as database sizes increase.

Department of Computer Science, Rensselaer Polytechnic Institute, Troy, NY

cs-04-01