To address the poor performance of traditional text classification methods on short texts, which stems from sparse features and strong context dependency, this paper proposes a short text classification method based on chi-square features and the biterm topic model (BTM). First, chi-square features are extracted from the short texts. BTM is then used to model the short texts and obtain the corresponding document-topic probability features. Finally, the two kinds of features are fused, and an SVM classifier performs the classification. Experimental results show that the method achieves a higher Macro-F1 value than conventional classification methods and performs well on short text classification.
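The feature-extraction and fusion steps can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the standard term-class chi-square statistic, χ² = N(AD − BC)² / ((A+C)(B+D)(A+B)(C+D)), takes the maximum over classes as a term's score, and fuses features by plain concatenation. The toy corpus, the function names, and the document-topic probabilities are hypothetical. In particular, the topic vector is a hand-written placeholder standing in for BTM output, and the SVM step is left out.

```python
from collections import Counter

def chi_square_scores(docs, labels):
    """Score each term by its maximum chi-square statistic over all classes."""
    n = len(docs)
    doc_sets = [set(d) for d in docs]
    vocab = sorted(set().union(*doc_sets))
    class_count = Counter(labels)
    scores = {}
    for term in vocab:
        best = 0.0
        for cls in sorted(class_count):
            # a: docs in cls containing term; b: docs outside cls containing term
            a = sum(1 for s, y in zip(doc_sets, labels) if term in s and y == cls)
            b = sum(1 for s, y in zip(doc_sets, labels) if term in s and y != cls)
            c = class_count[cls] - a   # docs in cls without term
            d = n - a - b - c          # docs outside cls without term
            denom = (a + c) * (b + d) * (a + b) * (c + d)
            chi = n * (a * d - b * c) ** 2 / denom if denom else 0.0
            best = max(best, chi)
        scores[term] = best
    return scores

def select_terms(scores, k):
    """Keep the k highest-scoring terms; ties are broken alphabetically."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]

def fuse(doc, selected_terms, topic_probs):
    """Concatenate binary chi-square term features with document-topic probabilities."""
    tokens = set(doc)
    chi_part = [1.0 if t in tokens else 0.0 for t in selected_terms]
    return chi_part + list(topic_probs)

# Hypothetical toy corpus of tokenized short texts with class labels.
docs = [
    ["cheap", "flight", "deal"],
    ["flight", "delay", "news"],
    ["goal", "match", "win"],
    ["match", "score", "news"],
]
labels = ["travel", "travel", "sport", "sport"]

scores = chi_square_scores(docs, labels)
selected = select_terms(scores, 2)

# Placeholder document-topic probabilities (assumed, not produced by BTM here).
fused = fuse(docs[0], selected, [0.9, 0.1])
print(selected, fused)
```

In a full pipeline, one fused vector per document would form a row of the training matrix passed to an SVM classifier, and Macro-F1 would be computed on held-out predictions.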