Towards Emerging Multimodal Cognitive Representations from Neural Self-Organization

IEEE-RAS International Conference on Humanoid Robots (Humanoids), Workshop on Towards Intelligent Social Robots: Current Advances in Cognitive Robotics, Jan 2015
The integration of multisensory information plays a crucial role in autonomous robotics. In this work, we investigate how robust multimodal representations can naturally develop in a self-organized manner from co-occurring multisensory inputs. We propose a hierarchical learning architecture with growing self-organizing neural networks for learning human actions from audiovisual inputs. Associative links between unimodal representations are incrementally learned by a semi-supervised algorithm with bidirectional connectivity that takes into account the inherent spatiotemporal dynamics of the input. Experiments on a dataset of 10 full-body actions show that our architecture is able to learn action-word mappings without the need to segment training samples for ground-truth labelling. Instead, multimodal representations of actions are obtained using the co-activation of action features from video sequences and labels from automatic speech recognition. Promising experimental results encourage the extension of our architecture in several directions.
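To make the co-activation idea concrete, the following is a minimal Python sketch, not the paper's algorithm: it uses two fixed-size map-like layers with simplified winner-take-all updates and Hebbian-like bidirectional links, whereas the actual architecture uses growing self-organizing networks and a semi-supervised learning rule. All names, sizes, and learning rates below are hypothetical.

import numpy as np

# Illustrative sketch only: two unimodal maps whose winners are linked
# by bidirectional associative connections strengthened on co-activation.
rng = np.random.default_rng(0)

N_VISUAL, N_AUDIO = 20, 10   # neurons per unimodal map (assumed sizes)
DIM_V, DIM_A = 6, 4          # feature dimensionalities (hypothetical)

visual_w = rng.random((N_VISUAL, DIM_V))   # prototype vectors, visual map
audio_w = rng.random((N_AUDIO, DIM_A))     # prototype vectors, auditory map

# Bidirectional associative link matrices between the two maps
assoc_va = np.zeros((N_VISUAL, N_AUDIO))
assoc_av = np.zeros((N_AUDIO, N_VISUAL))

def best_matching_unit(weights, x):
    """Index of the neuron whose prototype is closest to input x."""
    return int(np.argmin(np.linalg.norm(weights - x, axis=1)))

def train_step(x_visual, x_audio, lr=0.1, lr_assoc=0.05):
    """One co-occurrence: adapt both maps, strengthen cross-modal links."""
    bv = best_matching_unit(visual_w, x_visual)
    ba = best_matching_unit(audio_w, x_audio)
    # Move the winning prototypes towards the inputs (simplified update)
    visual_w[bv] += lr * (x_visual - visual_w[bv])
    audio_w[ba] += lr * (x_audio - audio_w[ba])
    # Hebbian-like update: links between co-activated units grow stronger
    assoc_va[bv, ba] += lr_assoc
    assoc_av[ba, bv] += lr_assoc

def recall_label(x_visual):
    """Auditory unit most strongly linked to the visual winner."""
    bv = best_matching_unit(visual_w, x_visual)
    return int(np.argmax(assoc_va[bv]))

# Toy usage: present co-occurring audiovisual pairs, then query one modality
for _ in range(100):
    train_step(rng.random(DIM_V), rng.random(DIM_A))
print(recall_label(rng.random(DIM_V)))

Because the links are bidirectional, a symmetric recall from the auditory map back to the visual map works the same way via assoc_av; in the paper this co-activation replaces manual segmentation and ground-truth labelling of training samples.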

@InProceedings{PWW15a,
  author    = {Parisi, German I. and Weber, Cornelius and Wermter, Stefan},
  title     = {Towards Emerging Multimodal Cognitive Representations from Neural Self-Organization},
  booktitle = {IEEE-RAS International Conference on Humanoid Robots (Humanoids), Workshop on Towards Intelligent Social Robots: Current Advances in Cognitive Robotics},
  year      = {2015},
  month     = {Jan},
  publisher = {IEEE}
}