Towards Emerging Multimodal Cognitive Representations from Neural Self-Organization
IEEE-RAS International Conference on Humanoid Robots (Humanoids), Workshop on Towards Intelligent Social Robots: Current Advances in Cognitive Robotics,
Jan 2015
The integration of multisensory information plays a
crucial role in autonomous robotics. In this work, we investigate
how robust multimodal representations can naturally develop in
a self-organized manner from co-occurring multisensory inputs.
We propose a hierarchical learning architecture with growing
self-organizing neural networks for learning human actions from
audiovisual inputs. Associative links between unimodal representations are
incrementally learned by a semi-supervised algorithm
with bidirectional connectivity that takes into account inherent
spatiotemporal dynamics of the input. Experiments on a dataset
of 10 full-body actions show that our architecture is able to
learn action-word mappings without the need to segment
training samples for ground-truth labelling. Instead, multimodal
representations of actions are obtained using the co-activation of
action features from video sequences and labels from automatic
speech recognition. Promising experimental results encourage the
extension of our architecture in several directions.
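
As a rough illustration of the co-activation idea described in the abstract, the sketch below counts how often an action prototype and a recognized word are active at the same time, and reads the resulting links in both directions. It is not the authors' algorithm: it leaves out the growing self-organizing networks, the hierarchy, and the spatiotemporal dynamics, and the class name `CoActivationLinks`, its methods, and the integer unit indices are all assumptions made for this example.

```python
from collections import defaultdict


class CoActivationLinks:
    """Hypothetical helper: associative links between action units and word labels.

    Assumes the action side has already been reduced to an integer index
    (e.g. the best-matching prototype of some unimodal network) and the
    audio side to a word string from a speech recognizer.
    """

    def __init__(self):
        # counts[action_unit][word] = number of observed co-activations
        self.counts = defaultdict(lambda: defaultdict(int))

    def update(self, action_unit, word):
        """Strengthen the link between a co-occurring action unit and word."""
        self.counts[action_unit][word] += 1

    def word_for(self, action_unit):
        """Action -> word: return the most frequently co-activated word, or None."""
        links = self.counts.get(action_unit)
        if not links:
            return None
        return max(links, key=links.get)

    def actions_for(self, word):
        """Word -> actions: return all action units ever co-activated with the word."""
        return sorted(
            unit for unit, links in self.counts.items() if links.get(word, 0) > 0
        )


if __name__ == "__main__":
    links = CoActivationLinks()
    # Made-up co-occurrences, for illustration only.
    for unit, word in [(0, "walk"), (0, "walk"), (0, "jump"), (1, "jump")]:
        links.update(unit, word)
    print(links.word_for(0))
    print(links.actions_for("jump"))
```

Counting co-occurrences is the simplest stand-in for incrementally learned links; it needs no segmented, hand-labelled samples, but it also ignores the order and timing of the inputs.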
@InProceedings{PWW15a,
  author    = {Parisi, German I. and Weber, Cornelius and Wermter, Stefan},
  title     = {Towards Emerging Multimodal Cognitive Representations from Neural Self-Organization},
  booktitle = {IEEE-RAS International Conference on Humanoid Robots (Humanoids), Workshop on Towards Intelligent Social Robots: Current Advances in Cognitive Robotics},
  year      = {2015},
  month     = {Jan},
  publisher = {IEEE},
}