Stark, Michael (2010)
On Knowledge Transfer in Object Class Recognition.
Technische Universität Darmstadt
Dissertation, primary publication
Abstract
In recent years, impressive results have been reported for the recognition of individual object classes, based on the combination of robust visual features with powerful statistical learning techniques. As a result, the simultaneous recognition of many object classes is coming into focus, posing challenges with respect to both model complexity and the need for increasing amounts of training data. Reusing once-acquired information in the context of related recognition tasks, effectively transferring knowledge between object classes, has been identified as a promising route towards scalable recognition. Besides increasing scalability, knowledge transfer has been shown to enable novel tasks, such as the recognition of object classes for which no training data are available, termed zero-shot recognition. In this case, the missing training data are compensated for by exploiting additional, complementary sources of knowledge, such as linguistic knowledge bases. Based on these encouraging prospects, this thesis explores four different dimensions of knowledge transfer in object class recognition. First, we investigate the role of visual features as a low-level representation of transferable knowledge. Based on an extensive evaluation of existing state-of-the-art local feature detectors and descriptors, we identify shape-based features in connection with powerful spatial models as a promising candidate representation. Building upon this result, we further introduce a novel flavor of local shape-based features, as well as a generic appearance descriptor based on shading artifacts. Second, we highlight the connection between knowledge transfer and generalization across basic-level object categories by recognizing objects according to potential functions or affordances. In particular, we demonstrate that visually distinct hints at affordances, modeled as collections of local shape features, can be shared and hence transferred between object classes. Third, we design shape-based object class models for knowledge transfer, representing object classes as spatially constrained assemblies of parts, including pairwise symmetry relations. These models are both compositional and incremental, allowing for knowledge transfer either on the level of entire object class models or restricted to a subset of model components. While knowledge transfer in these models has to be guided by manual supervision, we demonstrate the benefit of knowledge transfer for object class recognition when learning from scarce training data. Fourth, we demonstrate that exploiting additional sources of knowledge besides real-world training images can aid object class recognition, effectively transferring knowledge between different representations. In particular, we use linguistic knowledge bases in connection with semantic relatedness measures to automatically determine potential sources and targets of knowledge transfer for zero-shot recognition, and we show the successful learning of shape-based object class models from collections of 3D computer-aided design (CAD) models, without using any real-world training images of the object class of interest. In summary, this thesis achieves encouraging results with respect to four different dimensions of knowledge transfer, namely specialized visual feature representations, generalization across basic-level categories, compositional object class models, and the exploitation of additional sources of knowledge, confirming the benefits of knowledge transfer.
As a side effect, we obtain object class recognition results often superior to or on par with prior work.
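The abstract mentions using linguistic knowledge bases together with semantic relatedness measures to pick sources and targets of knowledge transfer for zero-shot recognition. The following minimal Python sketch only illustrates that general idea (it is not the thesis's actual method): it ranks candidate source classes for an unseen target class by WordNet path similarity via NLTK. The class names, the `rank_sources` helper, and the `top_k` parameter are hypothetical choices for illustration.

```python
# Minimal sketch: rank candidate source classes for zero-shot transfer by
# semantic relatedness, using WordNet path similarity from NLTK.
# Illustrative only; the thesis's relatedness measures and class sets may differ.
from nltk.corpus import wordnet as wn  # requires: nltk.download('wordnet')


def relatedness(class_a: str, class_b: str) -> float:
    """Maximum WordNet path similarity over the noun synsets of two class names."""
    best = 0.0
    for syn_a in wn.synsets(class_a, pos=wn.NOUN):
        for syn_b in wn.synsets(class_b, pos=wn.NOUN):
            sim = syn_a.path_similarity(syn_b)
            if sim is not None and sim > best:
                best = sim
    return best


def rank_sources(target: str, known_classes: list[str], top_k: int = 3) -> list[tuple[str, float]]:
    """Return the top_k known classes most semantically related to the unseen target class."""
    scored = [(cls, relatedness(target, cls)) for cls in known_classes]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    # Hypothetical example: transfer to the unseen class "giraffe" from the
    # most related classes for which trained models already exist.
    known = ["horse", "cow", "motorbike", "sofa", "zebra"]
    print(rank_sources("giraffe", known))
```

Such a ranking would only provide candidate sources; which model components or attributes are then actually transferred is a separate step in any zero-shot pipeline.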
| Item type: | Dissertation |
|---|---|
| Published: | 2010 |
| Author(s): | Stark, Michael |
| Type of entry: | Primary publication |
| Title: | On Knowledge Transfer in Object Class Recognition |
| Language: | English |
| Referees: | Goesele, Prof. Dr. Michael ; Hebert, Prof. Ph.D. Martial ; Schiele, Prof. Dr. Bernt |
| Date of publication: | 30 September 2010 |
| Place of publication: | Darmstadt |
| Date of oral examination: | 23 September 2010 |
| URL / URN: | https://tuprints.ulb.tu-darmstadt.de/2298 |
| URN: | urn:nbn:de:tuda-tuprints-22989 |
| Dewey Decimal Classification (DDC): | 000 Generalities, computer science, information science > 004 Computer science |
| Department(s)/Field(s): | 20 Department of Computer Science; 20 Department of Computer Science > Graphics, Capture and Massively Parallel Computing; 20 Department of Computer Science > Multimodal Interactive Systems |
| Date deposited: | 05 Oct 2010 12:15 |
| Last modified: | 26 May 2023 09:06 |