Robot Sound Localisation Neural Network Inspired by the Inferior Colliculus

Jindong Liu, David Perez-Gonzales, Adrian Rees, Harry Erwin, Stefan Wermter

- Aug 2008

Modelling sound source localisation (SSL) for mobile robots is important for the development of human-machine interfaces and sensor-motor robot control. Present technologies for SSL, such as the cross-correlation method or the multiple-microphone array scheme, have limitations when the robot operates in a noisy environment. The mammalian auditory system provides an efficient SSL system that, in robotic terms, needs only a pair of microphones to feed information into the system. The inferior colliculus (IC), the midbrain nucleus of the auditory pathway, is a centre of convergence and integration of several brainstem pathways, including those devoted to sound localisation. In the MiCRAM project we are studying the functional structure of the IC and applying it to a mobile robot in order to improve its SSL performance in a cluttered environment. Our system uses a spiking neural network to model two ascending pathways to the IC: the ITD (Interaural Time Difference) pathway and the ILD (Interaural Level Difference) pathway. The results of the ITD and ILD calculations are finally merged in an IC model to take advantage of ITD in the low-frequency range and of ILD in the high-frequency range. The structure and computational procedures of the system are briefly as follows. To simulate the cochlea, the sound from two microphones placed on the sides of the robot head is filtered by a Gammatone filterbank and split into a number of frequency channels. Each channel is then encoded into a spike train. The encoding mechanism is based on biological evidence of phase-locked spikes feeding the MSO (medial superior olive). In modelling the ITD pathway, we assume that no delay line exists in the ITD processing for the ipsilateral ear, while multiple delay lines exist for the contralateral ear.
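The delay-line assumption above can be illustrated with a minimal sketch. It is not the authors' implementation: it replaces the spiking network with a plain coincidence counter, emits one phase-locked spike per cycle of a pure tone, and uses hypothetical names (`phase_locked_spikes`, `coincidence_count`, `estimate_itd`) and illustrative values (a 500 Hz tone, a 300 µs ITD, a 20 µs coincidence window).

```python
def phase_locked_spikes(freq_hz, duration_s, delay_s=0.0):
    """One spike per cycle of a pure tone, always at the same phase.

    Assumption: perfect phase locking with no jitter; delay_s shifts
    the whole train to mimic a later arrival at one ear.
    """
    period = 1.0 / freq_hz
    n_cycles = int(duration_s * freq_hz)
    return [k * period + delay_s for k in range(n_cycles)]


def coincidence_count(ipsi_spikes, contra_spikes, delay_s, window_s):
    """Count ipsilateral spikes that coincide with a delayed contralateral spike."""
    delayed = [t + delay_s for t in contra_spikes]
    return sum(
        1 for t in ipsi_spikes
        if any(abs(t - u) <= window_s for u in delayed)
    )


def estimate_itd(ipsi_spikes, contra_spikes, candidate_delays_s, window_s=20e-6):
    """Return the contralateral delay line with the most coincidences.

    The ipsilateral train is left undelayed, as assumed in the text.
    """
    best_delay, best_count = None, -1
    for delay in candidate_delays_s:
        count = coincidence_count(ipsi_spikes, contra_spikes, delay, window_s)
        if count > best_count:
            best_delay, best_count = delay, count
    return best_delay


# Illustrative setup: the sound reaches the contralateral ear 300 us earlier.
true_itd_s = 300e-6
ipsi = phase_locked_spikes(500.0, 0.05, delay_s=true_itd_s)
contra = phase_locked_spikes(500.0, 0.05)
delay_lines_s = [i * 100e-6 for i in range(8)]  # 0 to 700 us, hypothetical spacing
estimated_itd_s = estimate_itd(ipsi, contra, delay_lines_s)
```

Only the delay line that matches the true ITD brings the two spike trains into register, so it collects the coincidences. The candidate delays stay well below the 2 ms tone period here, which keeps the estimate unambiguous.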
The phase-locked spike trains from both ears pass through this delay line structure and feed into a spiking neural network, which uses a leaky integrate-and-fire model to simulate the sustained-regular cells involved in ITD processing. ILD spikes are calculated directly from the Gammatone filterbank output, based on the logarithmic ratio of the sound levels at the two ears. The ITD and ILD spikes are counted along both the frequency channels and the ITD/ILD channels. The azimuth angle of the sound source can then be estimated using Bayes' theorem or Dempster-Shafer theory. Experimental results on pure tones, clicks, noise and voice show that our model performs sound localisation that approaches biological performance. Our approach demonstrates a practical application of biologically based sound localisation for robots.
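As a rough illustration of the last two steps, the sketch below computes an ILD as a logarithmic level ratio and combines ITD and ILD evidence with Bayes' theorem. It rests on assumptions not stated in the abstract: the ratio is expressed in decibels (20 log10), the two cues are treated as conditionally independent, the prior over azimuth is uniform by default, and the likelihood tables are made-up numbers rather than measured data. The names `ild_db` and `bayes_posterior` are hypothetical.

```python
import math


def ild_db(left_level, right_level):
    """Interaural level difference as a log ratio, in dB (assumed convention)."""
    return 20.0 * math.log10(left_level / right_level)


def bayes_posterior(cue_likelihoods, prior=None):
    """Posterior over azimuth given one likelihood table per cue.

    cue_likelihoods: list of dicts mapping azimuth (degrees) to
    P(observed cue | azimuth). Cues are assumed independent.
    """
    azimuths = list(cue_likelihoods[0])
    if prior is None:
        prior = {a: 1.0 / len(azimuths) for a in azimuths}
    unnormalised = {}
    for a in azimuths:
        p = prior[a]
        for table in cue_likelihoods:
            p *= table[a]
        unnormalised[a] = p
    total = sum(unnormalised.values())
    return {a: p / total for a, p in unnormalised.items()}


# Made-up likelihoods for three candidate azimuths (degrees).
itd_likelihood = {-30: 0.2, 0: 0.5, 30: 0.3}
ild_likelihood = {-30: 0.1, 0: 0.3, 30: 0.6}

posterior = bayes_posterior([itd_likelihood, ild_likelihood])
estimated_azimuth = max(posterior, key=posterior.get)
```

In this toy case the ITD cue alone favours 0 degrees, but the stronger ILD evidence shifts the combined estimate to 30 degrees, which is the kind of cue merging the abstract describes. The Dempster-Shafer alternative is not shown.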

@Misc{LPREW08, 
 	 author =  {Liu, Jindong and Perez-Gonzales, David and Rees, Adrian and Erwin, Harry and Wermter, Stefan},  
 	 title = {Robot Sound Localisation Neural Network Inspired by the Inferior Colliculus}, 
 	 year = {2008},
 	 month = {Aug},
 	 doi = {}, 
 }