Towards Emerging Multimodal Cognitive Representations from Neural Self-Organization

IEEE-RAS International Conference on Humanoid Robots (Humanoids), Workshop on Towards Intelligent Social Robots: Current Advances in Cognitive Robotics, Jan 2015
The integration of multisensory information plays a crucial role in autonomous robotics. In this work, we investigate how robust multimodal representations can naturally develop in a self-organized manner from co-occurring multisensory inputs. We propose a hierarchical learning architecture with growing self-organizing neural networks for learning human actions from audiovisual inputs. Associative links between unimodal representations are incrementally learned by a semi-supervised algorithm with bidirectional connectivity that takes into account the inherent spatiotemporal dynamics of the input. Experiments on a dataset of 10 full-body actions show that our architecture is able to learn action-word mappings without the need to segment training samples for ground-truth labelling. Instead, multimodal representations of actions are obtained from the co-activation of action features from video sequences and labels from automatic speech recognition. These promising experimental results encourage the extension of our architecture in several directions.

 

@InProceedings{PWW15a,
  author    = {Parisi, German I. and Weber, Cornelius and Wermter, Stefan},
  title     = {Towards Emerging Multimodal Cognitive Representations from Neural Self-Organization},
  booktitle = {IEEE-RAS International Conference on Humanoid Robots (Humanoids), Workshop on Towards Intelligent Social Robots: Current Advances in Cognitive Robotics},
  year      = {2015},
  month     = {Jan},
  publisher = {IEEE},
}