Conference: Natural Language Processing Pacific Rim Symposium

Taggers for Unknown Words using Decision Tree and Lazy Learning


Abstract

This paper describes methods for tagging unknown words using decision tree induction and lazy learning as post-processes of morphological analysis. Unknown words are words that cannot be found in the dictionary of a natural language processing system. They are often named entities such as the names of persons, organizations, or locations, and they play an important role in information extraction applications. In this paper, we apply automated taggers that learn from training data prepared by hand. The taggers are based on a decision tree generated by C4.5 and on a lazy learning algorithm. The attributes used for tagging are the notation of the unknown word and the parts of speech of neighboring words; the tag set into which these words are classified consists of noun and several named-entity categories. In open tests, the decision tree tagger gave 65% correct answers and the lazy learning tagger gave 70%. We compare the decision tree tagger and the lazy learning tagger to show the advantages and disadvantages of each.
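As a rough illustration of the lazy learning side only, the following minimal sketch stores labeled instances and tags a new unknown word by majority vote among its nearest stored neighbors. The abstract does not specify the algorithm's details, so the overlap distance, the value of `k`, the feature values (`"kanji"`, `"katakana"`, and so on), the tag names, and the toy training data are all assumptions made for illustration; the C4.5 decision tree tagger is not shown.

```python
from collections import Counter

# Hypothetical toy instances: (notation, previous POS, next POS) -> tag.
# Feature values and tags are illustrative, not taken from the paper's data.
TRAIN = [
    (("katakana", "noun", "particle"), "noun"),
    (("kanji", "noun", "suffix"), "person"),
    (("kanji", "prefix", "suffix"), "person"),
    (("alphabet", "noun", "noun"), "organization"),
    (("alphabet", "particle", "noun"), "organization"),
    (("kanji", "particle", "particle"), "location"),
    (("kanji", "verb", "particle"), "location"),
]


def overlap_distance(a, b):
    """Count the attribute positions where two instances disagree."""
    return sum(x != y for x, y in zip(a, b))


def knn_tag(features, train=TRAIN, k=3):
    """Tag an unknown word by majority vote among its k nearest stored instances."""
    # Lazy learning: no model is built in advance; all work happens at query time.
    nearest = sorted(train, key=lambda item: overlap_distance(features, item[0]))[:k]
    votes = Counter(tag for _, tag in nearest)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    print(knn_tag(("kanji", "noun", "suffix")))
    print(knn_tag(("alphabet", "noun", "noun")))
```

Because the features are symbolic, the sketch uses a simple mismatch count instead of a numeric distance; a real tagger would likely weight attributes differently.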
