Hybrid classifiers based on semantic data subspaces for two-level text categorization
International Journal of Hybrid Intelligent Systems,
Volume 10,
Number 1,
pages 33--41,
doi: 10.3233/HIS-130163
Mar 2013
Many organizations are nowadays keeping their data in the form of multi-level categories for easier manageability.
An example of this is the Reuters Corpus, which has news items categorized in a hierarchy of up to five levels. The volume
and diversity of documents available in such category hierarchies are also increasing daily. As such, it becomes difficult for a
traditional classifier to efficiently handle multi-level categorization of such a varied document space. In this paper, we present
hybrid classifiers involving various two-classifier and four-classifier combinations for two-level text categorization. We show that
the classification accuracy of the hybrid combination is better than the classification accuracies of all the corresponding single
classifiers. The constituent classifiers of the hybrid combination operate on different subspaces obtained by semantic separation
of data. Our experiments show that dividing a document space into different semantic subspaces increases the efficiency of such
hybrid classifier combinations. We further show that hierarchies with a larger number of categories at the first level benefit more
from this general hybrid architecture.
@Article{OW13,
  author    = {Oakes, Michael Philip and Wermter, Stefan},
  title     = {Hybrid classifiers based on semantic data subspaces for two-level text categorization},
  journal   = {International Journal of Hybrid Intelligent Systems},
  number    = {1},
  volume    = {10},
  pages     = {33--41},
  year      = {2013},
  month     = {Mar},
  publisher = {IOS Press Amsterdam},
  doi       = {10.3233/HIS-130163},
}