Conference: International Conference on Information, Intelligence, Systems and Applications

Effect of different feature types on age based classification of short texts

Abstract

This study compares the effect of three feature types on age-based classification of short texts, averaging 85 words per author. Besides the widely used word and character n-grams, text readability features are proposed as an alternative. By readability features we mean relative ratios of text elements, such as characters per word and words per sentence. Support Vector Machines, Logistic Regression, and Bayesian algorithms were used to build models. The most effective features were readability features and character n-grams. The model generated by a Support Vector Machine with the combined feature set yielded an F-score of 0.968. An age prediction application was built using a model with readability features.
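
As an illustration of the readability features described in the abstract, the following minimal Python sketch computes two of the named ratios: characters per word and words per sentence. The abstract does not give the study's full feature list or its tokenization rules, so the function name `readability_features`, the regex-based word and sentence splitting, and the dictionary keys are all assumptions made for this example, not the authors' implementation.

```python
import re
from typing import Dict


def readability_features(text: str) -> Dict[str, float]:
    """Return simple relative ratios of text elements (hypothetical helper)."""
    # Assumption: a sentence ends at one or more of '.', '!' or '?'.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Assumption: a word is any run of word characters (letters, digits, '_').
    words = re.findall(r"\w+", text)

    if not words or not sentences:
        # Empty input: return zeros rather than dividing by zero.
        return {"chars_per_word": 0.0, "words_per_sentence": 0.0}

    total_chars = sum(len(w) for w in words)
    return {
        "chars_per_word": total_chars / len(words),
        "words_per_sentence": len(words) / len(sentences),
    }


if __name__ == "__main__":
    sample = "I like cats. Do you like them too?"
    print(readability_features(sample))
```

Ratios like these could then be passed as a numeric feature vector to a classifier such as a Support Vector Machine, alongside or instead of n-gram counts.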