Supervised Machine Learning Approach for Gender Disambiguation from User Generated Unstructured (Text) Documents


Abstract

The rise of social media has led to explosive growth in the volume of data generated. The pervasive adoption of social media in our daily lives has opened many opportunities for researchers to study human behavior in depth. Many aspects of human behavior have been explored using social media data, for example detecting and monitoring mood states and forecasting sentiment. Another aspect of human behavior that attracts significant interest is the identification of author identity. Predicting author characteristics, preferences, and opinions helps answer many social science questions and supports many commercial applications, especially in e-commerce. Gender identification from author names and profile names, as done by Twitter and Google, is one example of the many advances in this area. Our work extends this capability to anonymous users or authors. It focuses on disambiguating author gender through lexical choice, choice of syntactic structure, linguistic nuances, and textual meaning. The results of our research are promising and bear witness to the validity of the approach.
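The abstract does not name the features or the classifier used, so the following is only a minimal sketch of what supervised classification from lexical choice could look like. It assumes word unigrams as features and a multinomial Naive Bayes model with add-one smoothing. The class name `NaiveBayesTextClassifier`, the toy training texts, and the placeholder labels `class_a` and `class_b` are all hypothetical and are not taken from the paper.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens (lexical features only)."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesTextClassifier:
    """Hypothetical multinomial Naive Bayes over word unigrams (not the authors' model)."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)      # documents per class
        self.word_counts = defaultdict(Counter)  # token counts per class
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        tokens = tokenize(text)
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # Log prior plus the sum of add-one-smoothed log likelihoods.
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            for tok in tokens:
                count = self.word_counts[label][tok]
                score += math.log((count + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy, made-up training data with placeholder labels (illustration only).
train_texts = [
    "the match was great and the team won",
    "the team lost the match again",
    "this recipe is lovely and the cake was great",
    "the cake recipe needs more sugar",
]
train_labels = ["class_a", "class_a", "class_b", "class_b"]

clf = NaiveBayesTextClassifier().fit(train_texts, train_labels)
print(clf.predict("the team won the match"))
```

Syntactic or semantic features of the kind the abstract mentions would need a parser or a richer text representation, which this sketch leaves out.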


Bibliographic record

Source: International Conference on Data Mining | 2015 | 7 pages
Authors: Amit Choudhary; Praveen Kumar; Sridhar Jeyaraman
Original format: PDF
Chinese Library Classification: TP274.2-53

