Conference: International Conference on Data Mining
Supervised Machine Learning Approach for Gender Disambiguation from User Generated Unstructured (Text) Documents

Abstract

The rise of social media has led to explosive growth in the volume of data generated. The pervasive adoption of social media in our daily lives has opened many opportunities for researchers to study human behavior in depth. Many aspects of human behavior have been explored using social media data, for example detecting and monitoring mood states and forecasting sentiment. Another aspect of human behavior that attracts significant interest is the identification of author identity. Predicting author characteristics, preferences, and opinions helps answer many social science questions and supports many commercial applications, especially in e-commerce. Gender identification from author names and profile names, as done by Twitter and Google, is one example of the many advances in this area. Our work takes this further, with the ability to classify even anonymous users or authors. It focuses on disambiguating author gender through lexical choice, choice of syntactic structure, linguistic nuances, and textual meaning. The results of our research are promising and bear witness to the validity of the approach.