Neural Network-based Document Clustering using WordNet Ontologies
International Journal of Hybrid Intelligent Systems,
Volume 1,
pages 127--142,
2004
Three novel text vector representation approaches for neural-network-based document
clustering are proposed. The first is the extended significance vector model (ESVM), the second is the
hypernym significance vector model (HSVM) and the last is the hybrid vector space model (HyM).
ESVM extracts the relationship between words and their preferred classified labels. HSVM exploits a
semantic relationship from the WordNet ontology. A more general term, the hypernym, substitutes for
terms with similar concepts. This hypernym semantic relationship supplements the neural model in
document clustering. HyM combines a TFxIDF vector with a hypernym significance vector,
retaining the advantages and reducing the disadvantages of both unsupervised and supervised
vector representation approaches. According to our experiments, the self-organising map (SOM)
model based on the HyM text vector representation approach is able to improve classification accuracy
and to reduce the average quantization error (AQE) on 10,000 full-text articles.
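
Two ideas named in the abstract can be illustrated with a minimal sketch: replacing terms with a more general hypernym, and measuring the average quantization error (AQE) of a trained map. The sketch is not the authors' implementation. The hypernym table `TOY_HYPERNYMS` is a hypothetical stand-in for WordNet lookups, the function names are invented for illustration, and AQE is computed under the usual SOM assumption that it is the mean Euclidean distance between each input vector and its closest map unit.

```python
import math

# Hypothetical toy hypernym table (an assumption for illustration);
# a real system would obtain these relations from WordNet.
TOY_HYPERNYMS = {
    "dog": "animal",
    "cat": "animal",
    "car": "vehicle",
    "bus": "vehicle",
}


def substitute_hypernyms(tokens, hypernyms=TOY_HYPERNYMS):
    """Replace each token with its more general term when one is known."""
    return [hypernyms.get(tok, tok) for tok in tokens]


def average_quantization_error(vectors, units):
    """Mean Euclidean distance from each input vector to its nearest map unit."""
    total = 0.0
    for v in vectors:
        total += min(math.dist(v, u) for u in units)
    return total / len(vectors)


# Terms with similar concepts collapse onto the same, more general term.
tokens = substitute_hypernyms(["dog", "chases", "cat"])

# Two document vectors and two map units (toy values, not experimental data).
vectors = [(0.0, 0.0), (3.0, 4.0)]
units = [(0.0, 0.0), (3.0, 0.0)]
aqe = average_quantization_error(vectors, units)
```

A lower AQE means the map units lie closer to the document vectors they represent, which is the sense in which the abstract reports a reduction.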
@Article{HW04,
  author    = {Hung, Chihli and Wermter, Stefan},
  title     = {Neural Network-based Document Clustering using WordNet Ontologies},
  journal   = {International Journal of Hybrid Intelligent Systems},
  volume    = {1},
  pages     = {127--142},
  year      = {2004},
  publisher = {IOS Press},
}