Towards Emerging Multimodal Cognitive Representations from Neural Self-Organization
IEEE-RAS International Conference on Humanoid Robots (Humanoids), Workshop on Towards Intelligent Social Robots: Current Advances in Cognitive Robotics
Jan 2015
The integration of multisensory information plays a
crucial role in autonomous robotics. In this work, we investigate
how robust multimodal representations can naturally develop in
a self-organized manner from co-occurring multisensory inputs.
We propose a hierarchical learning architecture with growing
self-organizing neural networks for learning human actions from
audiovisual inputs. Associative links between unimodal representations are incrementally learned by a semi-supervised algorithm
with bidirectional connectivity that takes into account the inherent
spatiotemporal dynamics of the input. Experiments on a dataset
of 10 full-body actions show that our architecture is able to
learn action-word mappings without the need to segment
training samples for ground-truth labelling. Instead, multimodal
representations of actions are obtained using the co-activation of
action features from video sequences and labels from automatic
speech recognition. Promising experimental results encourage the
extension of our architecture in several directions.
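The sketch below is a minimal, hypothetical illustration (not the authors' code) of the co-activation idea described in the abstract: two pre-trained unimodal prototype layers, standing in for growing self-organizing networks over video features and recognized speech labels, are linked by a Hebbian-style associative matrix that supports bidirectional retrieval. The names `visual_protos`, `word_protos`, and all dimensions are assumptions for illustration only.

```python
# Minimal sketch, assuming hypothetical pre-trained unimodal prototypes
# (e.g. from self-organizing networks trained on video features and on
# labels from automatic speech recognition). Not the authors' implementation.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical unimodal prototypes: rows are neurons, columns are feature dims.
visual_protos = rng.normal(size=(30, 64))   # visual action prototypes
word_protos = rng.normal(size=(10, 16))     # word/label prototypes

# Bidirectional associative weights between the two layers.
assoc = np.zeros((visual_protos.shape[0], word_protos.shape[0]))

def bmu(protos, x):
    """Index of the best-matching unit (smallest Euclidean distance)."""
    return int(np.argmin(np.linalg.norm(protos - x, axis=1)))

def associate(visual_input, word_input, lr=0.1):
    """Strengthen the link between units co-activated by co-occurring inputs."""
    i = bmu(visual_protos, visual_input)
    j = bmu(word_protos, word_input)
    assoc[i, j] += lr  # simple Hebbian increment

def label_for_action(visual_input):
    """Visual -> word retrieval: follow the strongest outgoing link."""
    i = bmu(visual_protos, visual_input)
    return int(np.argmax(assoc[i]))

def action_for_label(word_input):
    """Word -> visual retrieval: follow the strongest incoming link."""
    j = bmu(word_protos, word_input)
    return int(np.argmax(assoc[:, j]))

# Toy usage: co-occurring (video feature, speech label) pairs update the links,
# so no ground-truth segmentation of the training samples is required.
for _ in range(100):
    associate(rng.normal(size=64), rng.normal(size=16))
```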
@InProceedings{PWW15a,
author = {Parisi, German I. and Weber, Cornelius and Wermter, Stefan},
title = {Towards Emerging Multimodal Cognitive Representations from Neural Self-Organization},
booktitle = {IEEE-RAS International Conference on Humanoid Robots (Humanoids), Workshop on Towards Intelligent Social Robots: Current Advances in Cognitive Robotics},
year = {2015},
month = {Jan},
publisher = {IEEE},
doi = {}
}