A Time-Based Self-Organising Model for Document Clustering

Chihli Hung, Stefan Wermter
Proceedings of the International Joint Conference on Neural Networks, pages 17--23, Jul 2004
Most current approaches to document clustering do not consider the non-stationary nature of real-world document collections. In this paper, we propose a new self-organising model for non-stationary environments, the dynamic adaptive self-organising hybrid (DASH) model. The DASH model runs continuously, since new document sets are formed consecutively for training while the old document set is still at the training stage. Knowledge learned from the old data set is adjusted to reflect the new data set, so the document clusters stay up to date. We test the performance of our model on the Reuters-RCV1 news corpus and obtain promising results in terms of classification accuracy and average quantization error.
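
For readers unfamiliar with the second evaluation criterion, the sketch below shows how average quantization error is commonly computed for a self-organising map: the mean distance between each input vector and its closest unit. This is the generic definition, not the DASH implementation, and the abstract does not state the exact formula used in the paper. The function name, the use of Euclidean distance, and the toy vectors are all assumptions made for illustration.

```python
import math

def average_quantization_error(doc_vectors, unit_weights):
    """Mean Euclidean distance from each document vector to its nearest unit.

    Assumption: generic SOM-style definition, not taken from the paper.
    """
    total = 0.0
    for doc in doc_vectors:
        # Distance to the best-matching unit (the closest weight vector).
        total += min(math.dist(doc, unit) for unit in unit_weights)
    return total / len(doc_vectors)

# Hypothetical toy data: three 2-D "document" vectors and two map units.
docs = [(0.0, 0.0), (1.0, 0.0), (0.0, 3.0)]
units = [(0.0, 0.0), (0.0, 2.0)]
aqe = average_quantization_error(docs, units)
```

A lower value means the units represent the documents more closely; in a non-stationary setting it would typically be recomputed as new document sets arrive.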

 

@InProceedings{HW04a, 
 	 author =  {Hung, Chihli and Wermter, Stefan},  
 	 title = {A Time-Based Self-Organising Model for Document Clustering}, 
 	 booktitle = {Proceedings of the International Joint Conference on Neural Networks},
 	 number = {},
 	 volume = {},
 	 pages = {17--23},
 	 year = {2004},
 	 month = {Jul},
 	 publisher = {IEEE},
 	 doi = {}, 
 }