From Local Patterns to Global Models: The LeGo Approach to Data Mining

Crémilleux, Bruno and Fürnkranz, Johannes and Knobbe, Arno J. and Scholz, Martin (2007):
From Local Patterns to Global Models: The LeGo Approach to Data Mining.
[Online-Edition: http://www.ke.informatik.tu-darmstadt.de/publications/report...],
[Report]

Abstract

In this paper we present LeGo, a generic framework that utilizes existing local pattern mining techniques for global modeling in a variety of diverse data mining tasks. In the spirit of well-known KDD process models, our work identifies different phases within the data mining step, each of which is formulated in terms of different formal constraints. It starts with a phase of mining patterns that are individually promising. Later phases establish the context given by the global data mining task by selecting groups of diverse and highly informative patterns, which are finally combined into one or more global models that address the overall data mining task(s). The paper discusses the connection to various learning techniques, and illustrates that our framework is broad enough to cover and leverage frequent pattern mining, subgroup discovery, pattern teams, multi-view learning, and several other popular algorithms. The Safarii learning toolbox serves as a proof-of-concept of its high potential for practical data mining applications. Finally, we point out several challenging open research questions that naturally emerge in a constraint-based local-to-global pattern mining, selection, and combination framework.
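
The three phases named in the abstract (local pattern mining, pattern set selection, global modeling) can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the function names, the brute-force frequent itemset search, the greedy selection heuristic, and the lookup-table classifier are hypothetical and are taken neither from the report nor from the Safarii toolbox.

```python
from collections import Counter, defaultdict
from itertools import combinations


def mine_local_patterns(transactions, min_support, max_len=2):
    """Phase 1 (assumed): keep every itemset of up to max_len items that is frequent."""
    items = sorted({item for t in transactions for item in t})
    patterns = []
    for size in range(1, max_len + 1):
        for candidate in combinations(items, size):
            pattern = frozenset(candidate)
            support = sum(pattern <= t for t in transactions) / len(transactions)
            if support >= min_support:
                patterns.append(pattern)
    return patterns


def select_pattern_set(patterns, transactions, labels, k):
    """Phase 2 (assumed): pick up to k informative patterns with distinct covers."""
    base_rate = sum(labels) / len(labels)  # labels are assumed to be 0/1

    def cover(pattern):
        return tuple(pattern <= t for t in transactions)

    def quality(pattern):
        # Deviation of the positive rate inside the cover from the base rate.
        covered = [y for t, y in zip(transactions, labels) if pattern <= t]
        return abs(sum(covered) / len(covered) - base_rate) if covered else 0.0

    selected, seen_covers = [], set()
    for pattern in sorted(patterns, key=lambda p: -quality(p)):
        if cover(pattern) not in seen_covers:  # skip exact-cover duplicates
            selected.append(pattern)
            seen_covers.add(cover(pattern))
        if len(selected) == k:
            break
    return selected


def build_global_model(selected, transactions, labels):
    """Phase 3 (assumed): majority-vote lookup table over pattern features."""
    def encode(t):
        return tuple(p <= t for p in selected)

    votes = defaultdict(Counter)
    for t, y in zip(transactions, labels):
        votes[encode(t)][y] += 1
    default = Counter(labels).most_common(1)[0][0]

    def predict(t):
        key = encode(t)
        return votes[key].most_common(1)[0][0] if key in votes else default

    return predict


# Toy data: the label is 1 exactly when item "a" is present.
transactions = [frozenset(t) for t in ("ab", "abc", "ac", "bc", "c", "b")]
labels = [1, 1, 1, 0, 0, 0]
patterns = mine_local_patterns(transactions, min_support=0.3)
selected = select_pattern_set(patterns, transactions, labels, k=2)
predict = build_global_model(selected, transactions, labels)
print(selected, predict(frozenset("ac")), predict(frozenset("b")))
```

Each phase only consumes the output of the previous one, so any of the three functions could be swapped for a different miner, selection criterion, or model without touching the others.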

Item Type: Report
Published: 2007
Creators: Crémilleux, Bruno and Fürnkranz, Johannes and Knobbe, Arno J. and Scholz, Martin
Title: From Local Patterns to Global Models: The LeGo Approach to Data Mining
Language: English
Divisions: 20 Department of Computer Science
20 Department of Computer Science > Knowledge Engineering
Date Deposited: 24 Jun 2011 15:25
Official URL: http://www.ke.informatik.tu-darmstadt.de/publications/report...
Identification Number: TUD-KE-2007-06