Word sense disambiguation by learning decision trees from unlabeled data

Park SB.; Zhang BT.; Kim YT.

Abstract

In this paper we describe a machine learning approach to word sense disambiguation that uses unlabeled data. Our method is based on selective sampling with committees of decision trees. The committee members are trained on a small set of labeled examples, which are then augmented by a large number of unlabeled examples. Using unlabeled examples is important because obtaining labeled data is expensive and time-consuming, while it is easy and inexpensive to collect a large number of unlabeled examples. The idea behind this approach is that the labels of unlabeled examples can be estimated by using committees. Using additional unlabeled examples, therefore, improves the performance of word sense disambiguation and minimizes the cost of manual labeling. The effectiveness of this approach was examined on a raw corpus of one million words. Using unlabeled data, we achieved an accuracy improvement of up to 20.2%. [References: 40]
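The committee idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes depth-one decision trees (stumps) over bag-of-words contexts as a stand-in for full decision trees, bootstrap resampling to make the committee members differ, and a unanimous vote as the criterion for accepting an estimated label. The names `train_stump`, `train_committee`, and `self_label` are hypothetical, and the seed set of labeled examples is assumed to be non-empty.

```python
import random
from collections import Counter


def train_stump(examples):
    """Fit a depth-one decision tree on (context_word_set, sense) pairs.

    The stump tests for the single context word whose presence/absence
    split makes the fewest errors on the given examples.
    """
    default = Counter(s for _, s in examples).most_common(1)[0][0]
    vocab = sorted({w for ctx, _ in examples for w in ctx})
    best = None  # (errors, word, sense_if_present, sense_if_absent)
    for word in vocab:
        present = Counter(s for ctx, s in examples if word in ctx)
        absent = Counter(s for ctx, s in examples if word not in ctx)
        yes = present.most_common(1)[0][0] if present else default
        no = absent.most_common(1)[0][0] if absent else default
        errors = sum(1 for ctx, s in examples
                     if (yes if word in ctx else no) != s)
        if best is None or errors < best[0]:
            best = (errors, word, yes, no)
    if best is None:  # no context words at all: always predict the majority sense
        return lambda ctx: default
    _, word, yes, no = best
    return lambda ctx: yes if word in ctx else no


def train_committee(labeled, size, rng):
    """Train `size` stumps, each on a bootstrap resample of the labeled set."""
    committee = []
    for _ in range(size):
        sample = [rng.choice(labeled) for _ in labeled]
        committee.append(train_stump(sample))
    return committee


def self_label(labeled, unlabeled, size=5, rounds=3, seed=0):
    """Grow the labeled set with unlabeled contexts the committee agrees on.

    Assumption of this sketch: an estimated label is accepted only when
    every committee member votes for the same sense; all other contexts
    stay in the unlabeled pool.
    """
    rng = random.Random(seed)
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(rounds):
        committee = train_committee(labeled, size, rng)
        remaining = []
        for ctx in pool:
            votes = Counter(member(ctx) for member in committee)
            sense, count = votes.most_common(1)[0]
            if count == size:
                labeled.append((ctx, sense))
            else:
                remaining.append(ctx)
        pool = remaining
    return labeled, pool


if __name__ == "__main__":
    # Toy seed set for the ambiguous word "bank" (illustrative data only).
    seed_examples = [
        ({"river", "water"}, "shore"),
        ({"money", "loan"}, "finance"),
        ({"river", "boat"}, "shore"),
        ({"loan", "rate"}, "finance"),
    ]
    grown, left = self_label(seed_examples, [{"river"}, {"loan", "bank"}])
    print(len(grown), "labeled;", len(left), "still unlabeled")
```

Requiring unanimity is a deliberately conservative choice for the sketch; a majority threshold or a manual-labeling step for contested examples would be equally plausible readings of "selective sampling".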

Bibliographic record

Source
Applied Intelligence: The International Journal of Artificial Intelligence, Neural Networks, and Complex Problem-Solving Technologies | 2003, Issue 2 | 12 pages
Authors
Park SB.; Zhang BT.; Kim YT.
Original format: PDF
Language: English
Chinese Library Classification: Automation technology, computer technology
Keywords
Word sense disambiguation; Learning from unlabeled examples; Selective sampling; Committee learning; Decision tree; Algorithm; Selection

Similar literature

1. Word sense disambiguation by learning decision trees from unlabeled data [J]. Park SB., Zhang BT., Kim YT. Applied Intelligence: The International Journal of Artificial Intelligence, Neural Networks, and Complex Problem-Solving Technologies. 2003, Issue 1-2
2. Semi-supervised Learning with Induced Word Senses for State of the Art Word Sense Disambiguation [J]. Başkaya Osman, Jurgens David. The Journal of Artificial Intelligence Research. 2016, Issue 10
3. Semi-supervised Learning with Induced Word Senses for State of the Art Word Sense Disambiguation [J]. Baskaya Osman, Jurgens David. The Journal of Artificial Intelligence Research. 2016
4. Word Sense Disambiguation by Learning from Unlabeled Data [C]. Seong-Bae Park, Byoung-Tak Zhang, Yung Taek Kim. 38th Annual Meeting of the Association for Computational Linguistics, Oct 1-8, 2000, Hong Kong. 2000
5. Semantics and result disambiguation for keyword search on tree data [D]. Aksoy, Cem. 2016
6. Interactive medical word sense disambiguation through informed learning [O]. Yue Wang, Kai Zheng, Hua Xu. 2018
7. Semi-Supervised Japanese Word Sense Disambiguation Based on Two-Stage Classification of Unlabeled Data and Ensemble Learning [O]. Tatsukuni Inoue, Hiroaki Saito. 2011
