论文部分内容阅读
Ant-based text clustering is a promising technique that has attracted great research attention. This paper attempts to improve the standard ant-based text-clustering algorithm in two dimensions. On one hand, the ontology-based semantic similarity measure is used in conjunction with the traditional vector-space-model-based measure to provide more accurate assessment of the similarity between documents. On the other, the ant behavior model is modified to pursue better algorithmic performance.Especially, the ant movement rule is adjusted so as to direct a laden ant toward a dense area of the same type of items as the ants carrying item, and to direct an unladen ant toward an area that contains an item dissimilar with the surrounding items within its Moore neighborhood. Using WordNet as the base ontology for assessing the semantic similarity between documents, the proposed algorithm is tested with a sample set of documents excerpted from the Reuters-21578 corpus and the experiment results partly indicate that the proposed algorithm perform better than the standard ant-based text-clustering algorithm and the k-means algorithm.