Foreign-language conference: 6th Workshop on Ontologies and Lexical Resources

Automatic Discovery of Feature Sets for Dependency Parsing

Abstract

This paper describes a search procedure to discover optimal feature sets for dependency parsers. The search applies to the shift–reduce algorithm, and the feature sets are extracted from the parser configuration. The initial feature set is limited to the first word in the input queue. The procedure then uses a set of rules founded on the assumption that topological neighbors of significant features in the dependency graph may also make a significant contribution. The search can be fully automated, and the level of greediness is adjusted through the number of features examined at each iteration of the discovery procedure. Using our automated feature discovery on two corpora, the Swedish corpus in CoNLL-X and the English corpus in CoNLL 2008, and a single parser system, we could reach results comparable to or better than the best scores reported in these evaluations. The CoNLL 2008 test set contains, in addition to a Wall Street Journal (WSJ) section, an out-of-domain sample from the Brown corpus. With sets of 15 features, we obtained a labeled attachment score of 84.21 for Swedish, 88.11 on the WSJ test set, and 81.33 on the Brown test set.
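
The search described in the abstract can be illustrated with a minimal Python sketch. It is one possible reading of the abstract, not the authors' implementation. The feature names (`queue0.word`, `stack0.pos`, ...), the neighbor table `TOY_NEIGHBORS`, the scoring function `toy_evaluate`, and the parameter `beam_width` are all hypothetical. In a real setting, `evaluate` would train a parser with the given feature set and return its labeled attachment score on a development set; here it is replaced by made-up integer weights so that the example runs on its own.

```python
from typing import Callable, Dict, FrozenSet, List, Tuple


def discover_features(
    initial: str,
    neighbors: Dict[str, List[str]],
    evaluate: Callable[[FrozenSet[str]], float],
    max_features: int,
    beam_width: int,
) -> Tuple[List[str], float]:
    """Greedy forward search that only considers neighbors of selected features.

    beam_width caps how many candidates are examined per iteration
    (an assumed interpretation of the "level of greediness").
    """
    selected = [initial]
    best_score = evaluate(frozenset(selected))
    while len(selected) < max_features:
        # Collect unselected neighbors, most recently added features first.
        candidates: List[str] = []
        for feature in reversed(selected):
            for candidate in neighbors.get(feature, []):
                if candidate not in selected and candidate not in candidates:
                    candidates.append(candidate)
        candidates = candidates[:beam_width]
        if not candidates:
            break
        # Score each candidate when added to the current set; keep the best.
        score, best = max(
            (evaluate(frozenset(selected + [c])), c) for c in candidates
        )
        if score <= best_score:
            break  # no examined candidate improves the score
        selected.append(best)
        best_score = score
    return selected, best_score


# Hypothetical neighbor rules between parser-configuration features.
TOY_NEIGHBORS: Dict[str, List[str]] = {
    "queue0.word": ["queue0.pos", "queue1.word", "stack0.word"],
    "queue0.pos": ["queue1.pos"],
    "queue1.word": ["queue1.pos"],
    "stack0.word": ["stack0.pos", "stack1.word"],
}

# Made-up contributions; a stand-in for training and scoring a parser.
TOY_WEIGHTS: Dict[str, int] = {
    "queue0.word": 50, "queue0.pos": 20, "stack0.word": 15,
    "stack0.pos": 10, "queue1.word": 0, "queue1.pos": 1, "stack1.word": -2,
}


def toy_evaluate(features: FrozenSet[str]) -> float:
    return sum(TOY_WEIGHTS[f] for f in features)


FEATURES, SCORE = discover_features(
    "queue0.word", TOY_NEIGHBORS, toy_evaluate, max_features=4, beam_width=3
)
print(FEATURES, SCORE)
```

The search starts from the first word of the input queue, as in the abstract, and grows the set one feature at a time. A smaller `beam_width` makes each iteration cheaper but may miss useful features that were never examined.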
