Inferring latent attributes of Twitter users with label regularization

Abstract

Inferring latent attributes of online users has many applications in public health, politics, and marketing. Most existing approaches rely on supervised learning algorithms, which require manual data annotation and therefore are costly to develop and adapt over time. In this paper, we propose a lightly supervised approach based on label regularization to infer the age, ethnicity, and political orientation of Twitter users. Our approach learns from a heterogeneous collection of soft constraints derived from Census demographics, trends in baby names, and Twitter accounts that are emblematic of class labels. To counteract the imprecision of such constraints, we compare several constraint selection algorithms that optimize classification accuracy on a tuning set. We find that using no user-annotated data, our approach is within 2% of a fully supervised baseline for three of four tasks. Using a small set of labeled data for tuning further improves accuracy on all tasks.

Bibliographic details

Source
Conference on the North American Chapter of the Association for Computational Linguistics: Human Language Technologies | 2015 | pp. 185-195 | 11 pages
Authors
Ehsan Mohammady Ardehaly; Aron Culotta
Original format: PDF

Similar literature

1. Latent Attribute Inference of Users in Social Media with Very Small Labeled Dataset [J]. Ding XIAO, Rui WANG, Lingling WU. IEICE Transactions on Information and Systems, 2016, No. 10.
2. Inferring the home locations of Twitter users based on the spatiotemporal clustering of Twitter data [J]. Lin Jie, Cromley Robert G. Transactions in GIS: TG, 2018, No. 1.
3. Exploring multiple evidence to infer users' location in Twitter [J]. Rodrigues Erica, Assuncao Renato, Pappa Gisele L. Neurocomputing, 2016, No. JANa1.
4. Inferring latent attributes of Twitter users with label regularization [C]. Ehsan Mohammady Ardehaly, Aron Culotta. Conference on the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2015.
5. Predicting latent demographic attributes of Twitter users [D]. Frolov, Georgiy. 2016.
6. The gene patent controversy on Twitter: a case study of Twitter users’ responses to the CHEO lawsuit against Long QT gene patents [O]. Li Du, Kalina Kamenova, Timothy Caulfield. 2015.
7. Using county demographics to infer attributes of Twitter users [O]. Ehsan Mohammady, Aron Culotta. 2014.
