User-profile-based ad targeting is widely used in brand display advertising and precision bidding advertising. However, existing techniques for profiling users from their search behavior suffer from high feature dimensionality and sparse feature matrices. To address this problem, this paper combines the chi-square test with a linear-kernel support vector machine (SVM): the search text is first preprocessed with the jieba word segmenter, the chi-square test is then used for feature selection, an SVM classifier is used to determine user attributes, and finally comparative experiments are carried out. The experiments show that the chi-square test effectively reduces the feature dimensionality and improves classification accuracy, and that on sparse matrices the SVM outperforms other commonly used text classification algorithms.
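The feature-selection step can be illustrated with a minimal sketch. The abstract does not say which chi-square formulation or implementation the authors used, so the code below assumes the common 2x2 contingency form for a term t and a class c, N(AD - BC)^2 / ((A+B)(C+D)(A+C)(B+D)), computed over document frequencies. The function names `chi_square_scores` and `select_top_k`, the toy queries, and the labels are all hypothetical; the tokens stand in for the output of a segmenter such as `jieba.lcut`.

```python
from collections import Counter


def chi_square_scores(docs, labels, target):
    """Score each term by a 2x2 chi-square statistic against class `target`.

    docs   : list of token lists (already segmented)
    labels : list of class labels, one per document
    """
    n = len(docs)
    n_pos = sum(1 for y in labels if y == target)
    n_neg = n - n_pos

    # Document frequency of each term inside and outside the target class.
    in_pos, in_neg = Counter(), Counter()
    for tokens, y in zip(docs, labels):
        bucket = in_pos if y == target else in_neg
        bucket.update(set(tokens))  # count each term once per document

    scores = {}
    for term in set(in_pos) | set(in_neg):
        a = in_pos[term]   # target-class docs containing the term
        b = in_neg[term]   # other docs containing the term
        c = n_pos - a      # target-class docs without the term
        d = n_neg - b      # other docs without the term
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        scores[term] = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores


def select_top_k(scores, k):
    """Return the k highest-scoring terms (ties broken alphabetically)."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


# Made-up, pre-segmented search queries with a made-up binary attribute.
docs = [
    ["lipstick", "review"],
    ["lipstick", "shade"],
    ["gpu", "review"],
    ["gpu", "driver"],
]
labels = ["female", "female", "male", "male"]

scores = chi_square_scores(docs, labels, target="female")
top_terms = select_top_k(scores, k=2)
print(top_terms)
```

In this toy data, "review" appears equally often in both classes and scores zero, so it is dropped, while the class-specific terms are kept. In a full pipeline the retained terms would form the reduced (still sparse) feature matrix passed to a linear SVM.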