International Joint Conference on Natural Language Processing

Data-Oriented Parsing and the Penn Chinese Treebank

Abstract

We investigate parsing the Penn Chinese Treebank using a Data-Oriented Parsing (DOP) approach. DOP is an experience-based approach to natural language parsing. Most published research in the DOP framework uses phrase-structure trees (PS-trees) as its representation schema. The main drawbacks of the DOP approach concern efficiency. We incorporate recent advances in DOP parsing techniques into a novel DOP parser that generates a compact representation of all subtrees that can be derived from any full parse tree. We compare our work to previous work on parsing the Penn Chinese Treebank, and provide both a quantitative and a qualitative evaluation. While our results in terms of Precision and Recall are slightly below those published in related research, our approach requires no manual encoding of head rules, nor does it require a development phase per se. We also note that certain constructions that were problematic in this previous work can be handled correctly by our DOP parser. Finally, we observe that the 'DOP Hypothesis' is confirmed for parsing the Penn Chinese Treebank.
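
To illustrate why efficiency and a compact representation matter here, the following minimal sketch counts the subtrees (fragments) that can be derived from a single parse tree under the standard DOP1-style definition, in which each nonterminal child is either cut off at the frontier or expanded further. It is not the parser described in the abstract. The tuple-based tree encoding, the function names, and the toy English tree are all assumptions made for illustration.

```python
# A tree is either a word (str) or a tuple: (label, child, child, ...).
# This encoding is a hypothetical one chosen for the sketch.

def count_fragments(node):
    """Count DOP1-style fragments rooted at this node."""
    if isinstance(node, str):
        return 0  # a bare word roots no fragment of its own
    _label, *children = node
    total = 1
    for child in children:
        # A nonterminal child is either left as a frontier node (1 way)
        # or expanded by any fragment rooted at it. A word child has
        # count 0, so its factor is 1: the word is always included.
        total *= 1 + count_fragments(child)
    return total


def count_all_fragments(node):
    """Count fragments rooted at every nonterminal node of the tree."""
    if isinstance(node, str):
        return 0
    _label, *children = node
    return count_fragments(node) + sum(count_all_fragments(c) for c in children)


# Toy tree (illustrative only, not drawn from the Penn Chinese Treebank).
tree = ("S",
        ("NP", "John"),
        ("VP", ("V", "likes"), ("NP", "Mary")))

print(count_fragments(tree))      # fragments rooted at S
print(count_all_fragments(tree))  # fragments rooted at any node
```

Because the count at each node is a product over its children, the number of fragments grows exponentially with tree size, which is why enumerating them explicitly is impractical for treebank-sized data.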