A Density-Based Re-ranking Technique for Active Learning for Data Annotations



Abstract

Uncertainty sampling is one of the popular techniques of active learning for data annotation, but it often runs into problems when outliers are selected. To solve this problem, this paper proposes a density-based re-ranking technique, in which a density measure is adopted to determine whether an unlabeled example is an outlier. The motivation is to prefer examples that are not only the most informative in terms of the uncertainty measure but also the most representative in terms of the density measure. Experimental results of active learning for word sense disambiguation and text classification tasks on six real-world evaluation data sets show that the proposed density-based re-ranking technique can improve uncertainty sampling.
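The abstract does not state which uncertainty measure, which density measure, or how many candidates are re-ranked, so the following is only a minimal sketch under stated assumptions rather than the authors' implementation. It assumes entropy of the classifier's predicted class distribution as the uncertainty measure and the average cosine similarity to the k most similar pool examples as the density measure. The function names and the parameters `n_candidates` and `k` are hypothetical and chosen for illustration only.

```python
import math


def entropy(probs):
    """Shannon entropy of a predicted class distribution (assumed uncertainty measure)."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def cosine(u, v):
    """Cosine similarity between two dense feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def knn_density(index, vectors, k):
    """Assumed density measure: mean cosine similarity to the k most similar examples."""
    sims = sorted(
        (cosine(vectors[index], v) for j, v in enumerate(vectors) if j != index),
        reverse=True,
    )
    top = sims[:k]
    return sum(top) / len(top) if top else 0.0


def rerank_by_density(vectors, probs, n_candidates, k):
    """Return the index of the unlabeled example to annotate next.

    Step 1: keep the n_candidates most uncertain examples.
    Step 2: among those candidates, pick the one with the highest density.
    """
    by_uncertainty = sorted(
        range(len(vectors)), key=lambda i: entropy(probs[i]), reverse=True
    )
    candidates = by_uncertainty[:n_candidates]
    return max(candidates, key=lambda i: knn_density(i, vectors, k))


# Toy pool: three examples in a tight cluster plus one isolated outlier (index 3).
pool_vectors = [[1.0, 0.1], [1.0, 0.2], [0.9, 0.1], [-1.0, 1.0]]
# Made-up classifier posteriors; the outlier happens to be the most uncertain.
pool_probs = [[0.55, 0.45], [0.9, 0.1], [0.8, 0.2], [0.5, 0.5]]

# Plain uncertainty sampling picks the single most uncertain example.
plain_choice = max(range(len(pool_vectors)), key=lambda i: entropy(pool_probs[i]))
# Re-ranking the two most uncertain candidates by density avoids the outlier.
reranked_choice = rerank_by_density(pool_vectors, pool_probs, n_candidates=2, k=2)
```

In this toy pool, plain uncertainty sampling selects the isolated example, while re-ranking the two most uncertain candidates by density selects an example from the cluster instead. That is the behaviour the abstract motivates: informative in terms of uncertainty, yet representative in terms of density.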


Bibliographic record

Source
Computer Processing of Oriental Languages: Language Technology for the Knowledge-based Economy | 2009 | pp. 1-10 | 10 pages
Conference location: Hong Kong (CN)
Authors
Jingbo Zhu; Huizhen Wang; Benjamin K. Tsou;
Author affiliations

Natural Language Processing Laboratory, Northeastern University, Shenyang, P.R. China;

Natural Language Processing Laboratory, Northeastern University, Shenyang, P.R. China;

Language Information Sciences Research Centre, City University of Hong Kong;

Original format: PDF
Text language: English
Chinese Library Classification: Programming languages, algorithmic languages
Keywords
active learning; uncertainty sampling; density-based re-ranking; data annotation; text classification; word sense disambiguation;


Similar literature


1. Active learning and data manipulation techniques for generating training examples in meta-learning [J]. Sousa Arthur F. M., Prudencio Ricardo B. C., Ludermir Teresa B., Neurocomputing. 2016, issue juna19
2. Improving active learning by data balance to reduce annotation efforts [J]. Lei Han, Wang Shuai, Zheng Dezhi. 2019, issue 23
3. Active Learning With Sampling by Uncertainty and Density for Data Annotations [J]. Zhu J., Wang H., Tsou B.K., IEEE Transactions on Audio, Speech, and Language Processing. 2010, issue 6
4. A Density-Based Re-ranking Technique for Active Learning for Data Annotations [C]. Jingbo Zhu, Huizhen Wang, Benjamin K. Tsou, International Conference on Computer Processing of Oriental Languages. 2009
5. Genome Annotation Using Data Mining Techniques [D]. Zhang, En. 2010
6. Combining active learning and semi-supervised learning techniques to extract protein interaction sentences [O]. Min Song, Hwanjo Yu, Wook-Shin Han. 2011
7. Application of Machine Learning Techniques to the Re-ranking of Search Results [O]. Martin Buchholz, Dirk Pflüger, Josiah Poon. 2014
