Sulzmann, Jan-Nikolas (2019)
Rule Learning: From Local Patterns to Global Models.
Technische Universität Darmstadt
Dissertation, first publication
Abstract
In many areas of daily life (e.g. e-commerce or social networks), massive amounts of data are collected and stored in databases for future use. Although the specific information contained in the collected data may already be interesting, more general insights into the data would be more useful. A data analysis should clearly aim to discover such pieces of knowledge, but human inspection becomes less and less feasible as the databases grow more and more unmanageable. To this end, the KDD process (short for "Knowledge Discovery in Databases") provides the tools for semi-automatic data analysis. Data mining, the main component of the KDD process, searches the explicit facts for regularities that represent pieces of knowledge. Usually, these regularities are formulated either as local patterns, which describe only local characteristics of the data, or as global models, which explain the whole data. In our work, we concentrate on local patterns and global models that may be used to predict a feature of interest, or class attribute, for future and unknown data. Interestingly, predictive local patterns may be used to obtain global predictions in two ways. The integrative approach treats the local patterns as building blocks and uses them to build a global model. The decoding approach aggregates the predictions of the local patterns into a single global prediction. While both approaches are promising, the question of how local patterns may be employed for global modelling has not yet been answered satisfactorily. We therefore consider three important aspects of this question in this work.
The first aspect is how a set of local patterns may be employed to obtain optimal global predictions. The LeGo framework (an acronym for "from local patterns to global models", with the letters l, e, g and o taken from "local", "patterns", "global" and "models") provides an approach to answering this question. It divides the data mining process into three consecutive steps: local pattern discovery generates a set of local patterns, pattern set discovery selects a smaller subset of these patterns, and global modelling employs the reduced pattern set to build a global model. Many methods are available for each step, so we employ a selection of methods for each step and empirically evaluate their performance with respect to this first aspect.
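The three LeGo steps can be pictured with a minimal Python sketch. Everything in it is an illustrative assumption rather than the method evaluated in the thesis: the toy dataset, the function names, exhaustive enumeration of short rule bodies as local pattern discovery, a top-k precision filter as pattern set discovery, and an ordered decision list as the global model.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: (set of attribute=value items, class label).
DATA = [
    ({"outlook=sunny", "windy=no"}, "play"),
    ({"outlook=sunny", "windy=yes"}, "stay"),
    ({"outlook=rainy", "windy=yes"}, "stay"),
    ({"outlook=rainy", "windy=no"}, "play"),
    ({"outlook=overcast", "windy=no"}, "play"),
    ({"outlook=overcast", "windy=yes"}, "play"),
]


def discover_local_patterns(data, min_support=2, max_len=2):
    """Step 1: enumerate rules (body -> majority class of covered examples)."""
    items = sorted(set().union(*(x for x, _ in data)))
    rules = []
    for n in range(1, max_len + 1):
        for body in combinations(items, n):
            body = frozenset(body)
            covered = [y for x, y in data if body <= x]
            if len(covered) < min_support:
                continue  # too few covered examples to keep the pattern
            label, hits = Counter(covered).most_common(1)[0]
            rules.append((body, label, hits, len(covered)))
    return rules


def select_pattern_set(rules, k=3):
    """Step 2: keep the k rules with highest precision, then highest coverage."""
    return sorted(rules, key=lambda r: (r[2] / r[3], r[3]), reverse=True)[:k]


def build_global_model(selected, default):
    """Step 3: treat the selected rules as an ordered decision list."""
    def predict(x):
        for body, label, _, _ in selected:
            if body <= x:  # first rule whose body covers the example wins
                return label
        return default
    return predict


rules = discover_local_patterns(DATA, min_support=2)
selected = select_pattern_set(rules, k=3)
model = build_global_model(selected, default="play")
print([model(x) for x, _ in DATA])
```

Each step only consumes the output of the previous one, so any of the three functions could be swapped for a different method without touching the others, which is the kind of modularity the framework describes.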
The second aspect is how a set of local patterns may be utilised to obtain optimal class probabilities. Class probabilities are often more useful than a simple prediction because they may serve as a confidence measure for the prediction (e.g. in voting schemes). We divide this aspect into two subtasks: probability estimation and probability aggregation. Probability estimation calculates class probabilities given a single local pattern. For this task, we consider basic probability estimation methods and shrinkage, a technique for smoothing the basic probability estimates. Furthermore, we examine the effect of the local pattern discovery on the quality of the probability estimation. Probability aggregation decodes the probability estimates of multiple patterns into a single probability estimate. For this purpose, we evaluate the performance of a selection of aggregation methods.
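The two subtasks can be sketched as follows. The abstract does not name the estimators or aggregation methods used, so the choices here are assumptions made for illustration: relative frequency, the Laplace correction and the m-estimate as examples of basic estimators, and a plain average as one possible aggregation. The function names and class counts are hypothetical.

```python
def naive(counts):
    """Relative frequency of each class among the examples a rule covers."""
    total = sum(counts.values())  # assumes the rule covers at least one example
    return {c: n / total for c, n in counts.items()}


def laplace(counts):
    """Laplace correction: add one pseudo-example per class."""
    total = sum(counts.values())
    k = len(counts)
    return {c: (n + 1) / (total + k) for c, n in counts.items()}


def m_estimate(counts, priors, m=2.0):
    """m-estimate: add m pseudo-examples distributed by the class priors."""
    total = sum(counts.values())
    return {c: (n + m * priors[c]) / (total + m) for c, n in counts.items()}


def aggregate_mean(estimates):
    """Aggregate the estimates of several covering rules by averaging."""
    classes = estimates[0].keys()
    return {c: sum(e[c] for e in estimates) / len(estimates) for c in classes}


# Hypothetical class counts of the training examples covered by two rules.
rule_a = {"pos": 3, "neg": 0}
rule_b = {"pos": 1, "neg": 3}

print(naive(rule_a))
print(laplace(rule_a))
print(aggregate_mean([laplace(rule_a), laplace(rule_b)]))
```

For a rule that covers only three positive examples, the relative frequency assigns probability 1 to the positive class, whereas the smoothed estimators pull the estimate towards a uniform or prior distribution. This is the motivation for looking beyond the raw frequency when a pattern covers few examples.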
The third aspect is how a set of local patterns may be transformed into a compact and understandable model. Usually, local pattern sets are hard to interpret, and using them for prediction requires additional effort (e.g. voting schemes). Both issues may be solved at once if the local patterns are employed to obtain a global model. To this end, we introduce rule stacking, a novel approach to global modelling. Rule stacking extends the standard stacking approach in two respects: the meta data generation and the additional retransformation of the meta model. In this way, we obtain a compressed and interpretable global model that is directly applicable to future data.
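One possible reading of the two extensions is sketched below; it is an assumption for illustration, not the procedure defined in the thesis. Meta data generation is taken to mean encoding each example by which base rules cover it, and retransformation is taken to mean replacing each meta condition ("base rule i fires") with that rule's original conditions. The base rules, the hand-written meta rule and all function names are hypothetical, and the learning of the meta model is omitted.

```python
# Hypothetical base rule bodies over attribute=value items.
BASE_RULES = [
    frozenset({"a=1"}),
    frozenset({"b=1"}),
    frozenset({"a=1", "c=0"}),
]


def to_meta(x, base_rules):
    """Meta data generation: one boolean feature per base rule (covers x?)."""
    return tuple(body <= x for body in base_rules)


def meta_rule_fires(meta_rule, meta_example):
    """A meta rule is (indices of base rules that must all fire, label)."""
    indices, _ = meta_rule
    return all(meta_example[i] for i in indices)


def retransform(meta_rule, base_rules):
    """Replace each meta condition with the conditions of its base rule."""
    indices, label = meta_rule
    body = frozenset().union(*(base_rules[i] for i in indices))
    return body, label


# A hand-written meta rule standing in for a learned meta model.
meta_rule = ((0, 1), "pos")
body, label = retransform(meta_rule, BASE_RULES)

x = {"a=1", "b=1", "c=1"}
print(to_meta(x, BASE_RULES))
print(sorted(body), label)
print(meta_rule_fires(meta_rule, to_meta(x, BASE_RULES)), body <= x)
```

Under these assumptions the retransformed rule covers an example exactly when the meta rule fires on its meta representation, so the result can be applied to new data directly, without first evaluating the base rules.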
Item type: | Dissertation | ||||
---|---|---|---|---|---|
Published: | 2019 | ||||
Author(s): | Sulzmann, Jan-Nikolas | ||||
Type of entry: | First publication | ||||
Title: | Rule Learning: From Local Patterns to Global Models | ||||
Language: | English | ||||
Referees: | Fürnkranz, Prof. Dr. Johannes ; Kramer, Prof. Dr. Stefan | ||||
Date of publication: | 17 January 2019 | ||||
Place: | Darmstadt | ||||
Date of oral examination: | 20 April 2018 | ||||
URL / URN: | https://tuprints.ulb.tu-darmstadt.de/7387 | ||||
URN: | urn:nbn:de:tuda-tuprints-73879 | ||||
Dewey Decimal Classification (DDC) subject group: | 000 Generalities, computer science, information science > 004 Computer science | ||||
Department(s)/field(s): | 20 Department of Computer Science ; 20 Department of Computer Science > Knowledge Engineering | ||||
Date deposited: | 27 Jan 2019 20:55 | ||||
Last modified: | 27 Jan 2019 20:55 | ||||