A word-based classifier model was proposed for classifying spam short messages (SSMs). The concept of word-category weight was introduced to represent how strongly a word indicates the category to which an SSM belongs, and a method for calculating this weight was put forward. Based on the word-category weight, dimension reduction was carried out to obtain a set of feature words for classification. The Short message-Category Membership Value (SCMV) was defined to express the degree to which an SSM belongs to a category, and a classification algorithm was implemented by computing the SCMV and the SCMV density. To improve classification accuracy and make the word-category weights more reasonable, an iterative learning procedure was applied to the weights of the feature words. The experimental results show that the proposed model is superior to other classification methods in terms of classification performance and time complexity.