Semantic subspace learning for text classification using hybrid intelligent techniques

Michael Philip Oakes, Stefan Wermter, Nandita Tripathi
International Journal of Hybrid Intelligent Systems, Volume 8, Number 2, pages 99--114, doi: 10.3233/HIS-2011-0137 - May 2011
A vast data repository such as the web contains many broad domains of data that are quite distinct from each other, e.g. medicine, education, sports and politics. Each of these domains constitutes a subspace of the data within which the documents are similar to each other but quite distinct from the documents in another subspace. The data within these domains is frequently further divided into many subcategories. In this paper we present a novel hybrid parallel architecture that uses different types of classifiers, trained on different subspaces, to improve text classification within these subspaces. The classifier to be used on a particular input, and the relevant feature subset to be extracted, are determined dynamically using maximum significance values. We use the conditional significance vector representation, which enhances the distinction between classes within the subspace. We further compare the performance of our hybrid architecture with that of a single-classifier system that learns over the full data space, and show that the hybrid architecture outperforms it by a large margin when tested with a variety of hybrid combinations on two different corpora. Our results show that this new hybrid architecture significantly boosts subspace classification accuracy and reduces learning time.
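The routing idea in the abstract can be illustrated with a small sketch. The abstract does not define the significance values, so the code below assumes one plausible reading: a word's significance for a label is the share of that word's training occurrences that fall under the label, and a document's significance vector is the average over its known words. The function names, the toy corpus, and the argmax stand-in for the per-subspace classifiers are all hypothetical; the paper itself trains different classifier types on each subspace.

```python
from collections import Counter, defaultdict

def word_significance(docs):
    """docs: iterable of (tokens, label).
    Returns {word: {label: share of the word's occurrences under that label}}.
    This definition is an assumption, not taken from the abstract."""
    counts = defaultdict(Counter)
    for tokens, label in docs:
        for tok in tokens:
            counts[tok][label] += 1
    return {w: {lab: n / sum(c.values()) for lab, n in c.items()}
            for w, c in counts.items()}

def document_significance(tokens, significance, labels):
    """Average the per-word significance over the words seen in training."""
    totals = dict.fromkeys(labels, 0.0)
    known = [t for t in tokens if t in significance]
    for tok in known:
        for lab, s in significance[tok].items():
            totals[lab] += s
    n = max(len(known), 1)  # avoid division by zero for unseen documents
    return {lab: v / n for lab, v in totals.items()}

# Hypothetical toy corpus: (tokens, broad domain, subcategory).
train = [
    ("doctor treats patient flu".split(), "medicine", "infection"),
    ("surgeon repairs patient knee".split(), "medicine", "surgery"),
    ("team wins football match".split(), "sports", "football"),
    ("player wins tennis match".split(), "sports", "tennis"),
]
domains = sorted({d for _, d, _ in train})

# Significance over the broad domains, used only to pick the subspace.
domain_sig = word_significance((t, d) for t, d, _ in train)

# Conditional significance: recomputed inside each subspace over its
# subcategories only, using just the documents of that subspace.
sub_sig = {d: word_significance((t, s) for t, dd, s in train if dd == d)
           for d in domains}
sub_labels = {d: sorted({s for _, dd, s in train if dd == d}) for d in domains}

def classify(tokens):
    """Pick the subspace by maximum significance, then classify within it."""
    scores = document_significance(tokens, domain_sig, domains)
    domain = max(scores, key=scores.get)
    # Stand-in for the subspace's trained classifier: argmax over the
    # conditional significance vector.
    cond = document_significance(tokens, sub_sig[domain], sub_labels[domain])
    return domain, max(cond, key=cond.get)

print(classify("patient has flu".split()))
print(classify("tennis player wins".split()))
```

In a fuller version, each entry of `sub_sig` would instead feed a separately trained classifier, so that only the classifier for the selected subspace runs on a given input.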

 

@Article{OWT11b,
  author    = {Oakes, Michael Philip and Wermter, Stefan and Tripathi, Nandita},
  title     = {Semantic subspace learning for text classification using hybrid intelligent techniques},
  journal   = {International Journal of Hybrid Intelligent Systems},
  number    = {2},
  volume    = {8},
  pages     = {99--114},
  year      = {2011},
  month     = {May},
  publisher = {IOS Press Amsterdam},
  doi       = {10.3233/HIS-2011-0137},
}