A Bayesian Approach to Policy Recognition and State Representation Learning

Šošić, A. ; Zoubir, A. M. ; Koeppl, H. (2018)
A Bayesian Approach to Policy Recognition and State Representation Learning.
In: IEEE Transactions on Pattern Analysis and Machine Intelligence, 40 (6)
doi: 10.1109/TPAMI.2017.2711024
Article, Bibliography

Abstract

Learning from demonstration (LfD) is the process of building behavioral models of a task from demonstrations provided by an expert. These models can be used, e.g., for system control by generalizing the expert demonstrations to previously unencountered situations. Most LfD methods, however, make strong assumptions about the expert behavior, e.g., they assume the existence of a deterministic optimal ground truth policy or require direct monitoring of the expert's controls, which limits their practical use as part of a general system identification framework. In this work, we consider the LfD problem in a more general setting where we allow for arbitrary stochastic expert policies, without reasoning about the optimality of the demonstrations. Following a Bayesian methodology, we model the full posterior distribution of possible expert controllers that explain the provided demonstration data. Moreover, we show that our methodology can be applied in a nonparametric context to infer the complexity of the state representation used by the expert, and to learn task-appropriate partitionings of the system state space.

Item type: Article
Published: 2018
Author(s): Šošić, A. ; Zoubir, A. M. ; Koeppl, H.
Type of entry: Bibliography
Title: A Bayesian Approach to Policy Recognition and State Representation Learning
Language: English
Date of publication: 1 June 2018
Journal or publication title: IEEE Transactions on Pattern Analysis and Machine Intelligence
Volume: 40
Issue number: 6
DOI: 10.1109/TPAMI.2017.2711024
URL / URN: https://ieeexplore.ieee.org/document/7937852
Divisions: 18 Department of Electrical Engineering and Information Technology
18 Department of Electrical Engineering and Information Technology > Institute for Telecommunications > Bioinspired Communication Systems
18 Department of Electrical Engineering and Information Technology > Institute for Telecommunications
18 Department of Electrical Engineering and Information Technology > Institute for Telecommunications > Signal Processing
DFG Collaborative Research Centres (incl. Transregio)
DFG Collaborative Research Centres (incl. Transregio) > Collaborative Research Centres
Central Institutions
Central Institutions > Centre for Cognitive Science (CCS)
DFG Collaborative Research Centres (incl. Transregio) > Collaborative Research Centres > CRC 1053: MAKI – Multi-Mechanisms Adaptation for the Future Internet
DFG Collaborative Research Centres (incl. Transregio) > Collaborative Research Centres > CRC 1053: MAKI – Multi-Mechanisms Adaptation for the Future Internet > C: Communication Mechanisms
DFG Collaborative Research Centres (incl. Transregio) > Collaborative Research Centres > CRC 1053: MAKI – Multi-Mechanisms Adaptation for the Future Internet > C: Communication Mechanisms > Subproject C3: Content-Centric View
Date deposited: 03 May 2016 16:59
Last modified: 15 Dec 2022 09:22