
Experimental study of multimodal representations for Frame Identification - How to find the right multimodal representations for this task?

Botschen, Teresa and Mousselly-Sergieh, Hatem and Gurevych, Iryna (2017):
Experimental study of multimodal representations for Frame Identification - How to find the right multimodal representations for this task?
In: Language-Learning-Logic Workshop (3L 2017), London, UK. [Conference or Workshop Item]

Abstract

Frame Identification (FrameId) is the first step in FrameNet Semantic Role Labeling, in which the correct frame is assigned to the predicate of a sentence. An automatic FrameId system takes the sentence and the predicate as input and predicts the correct frame. Current state-of-the-art FrameId systems are based on pretrained distributed word representations. For a wide range of tasks, multimodal approaches are reported to be superior to unimodal approaches when textual embeddings are enriched with information from other modalities, for instance images. To the best of our knowledge, multimodal approaches have not yet been investigated for FrameId, and we think the task deserves such an investigation given the success of pretrained multimodal representations as input representations for other tasks. We want to find out whether representations that are grounded in images can help to improve the performance of our FrameId system. We report on our preliminary investigations with pretrained multimodal embeddings for FrameId.
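
As a rough illustration of the setup described in the abstract (not the authors' implementation), the following Python sketch assumes hypothetical pretrained lookup tables text_emb and visual_emb and shows one common way of building a multimodal input for a FrameId classifier: concatenating the sentence context with the predicate's textual and image-grounded embeddings.

    # Minimal sketch (not the authors' code): building a multimodal input vector
    # for Frame Identification by concatenating pretrained textual embeddings
    # with an image-grounded embedding for the predicate. The lookup tables
    # text_emb and visual_emb are hypothetical placeholders for pretrained
    # unimodal and image-grounded embedding models.
    import numpy as np

    TEXT_DIM, VISUAL_DIM = 300, 128

    # Hypothetical pretrained lookup tables (word -> vector).
    text_emb = {"run": np.random.rand(TEXT_DIM)}
    visual_emb = {"run": np.random.rand(VISUAL_DIM)}

    def multimodal_input(sentence_tokens, predicate):
        """Return an input representation for a FrameId classifier:
        the sentence context (averaged textual embeddings) concatenated with
        the predicate's textual and image-grounded embeddings."""
        context = np.mean(
            [text_emb.get(t, np.zeros(TEXT_DIM)) for t in sentence_tokens], axis=0
        )
        pred_text = text_emb.get(predicate, np.zeros(TEXT_DIM))
        pred_visual = visual_emb.get(predicate, np.zeros(VISUAL_DIM))
        return np.concatenate([context, pred_text, pred_visual])

    # The resulting vector would then be fed to a frame classifier,
    # e.g. a softmax over the FrameNet frame inventory.
    vec = multimodal_input(["she", "likes", "to", "run"], "run")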

Item Type: Conference or Workshop Item
Published: 2017
Creators: Botschen, Teresa and Mousselly-Sergieh, Hatem and Gurevych, Iryna
Title: Experimental study of multimodal representations for Frame Identification - How to find the right multimodal representations for this task?
Language: English

Title of Book: Language-Learning-Logic Workshop (3L 2017)
Uncontrolled Keywords: AIPHES_area_c3
Divisions: 20 Department of Computer Science
20 Department of Computer Science > Ubiquitous Knowledge Processing
DFG-Graduiertenkollegs
DFG-Graduiertenkollegs > Research Training Group 1994 Adaptive Preparation of Information from Heterogeneous Sources
Event Location: London, UK
Date Deposited: 14 Sep 2017 07:56
Identification Number: TUD-CS-2017-0246