
Chinese Text Chunking Using Divide-Conquer Model


Abstract

The traditional approach to Chinese text chunking identifies all phrase types with a single model and a single feature set. This single-model design has two known limitations: the same types of features are not suitable for all phrases, and data sparseness may also result. In this paper, a divide-and-conquer model is proposed and applied to the identification of Chinese phrases. The model divides the chunking task into several sub-tasks according to the sensitive features of each phrase type and identifies the different phrases in parallel. A two-stage conflict-decreasing strategy is then used to synthesize each sub-task's answer. In tests on the Chinese Penn Treebank, the F-score of Chinese chunking with the multi-agent strategy reaches 95.82%, which is higher than the best previously reported result.
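The overall scheme in the abstract, sub-task recognizers proposing phrase spans in parallel and a conflict-resolution stage merging their answers, can be sketched as follows. This is a minimal illustrative sketch, not the authors' actual implementation: the span format, the toy sub-task outputs, and the greedy two-stage resolution (rank by confidence and length, then accept non-overlapping spans) are all assumptions for exposition.

```python
# Hypothetical sketch of divide-and-conquer chunking with conflict resolution.
# Each phrase type would have its own recognizer (a sub-task); their
# overlapping span proposals are reconciled in two stages.
from typing import List, Tuple

# A proposed chunk: (start, end, label, confidence), end exclusive.
Span = Tuple[int, int, str, float]

def resolve_conflicts(proposals: List[Span]) -> List[Span]:
    """Two-stage greedy resolution (an assumed strategy, not the paper's):
    stage 1 ranks proposals by confidence, then by span length;
    stage 2 accepts a span only if it overlaps no already-accepted span."""
    accepted: List[Span] = []
    for s in sorted(proposals, key=lambda x: (-x[3], -(x[1] - x[0]))):
        if all(s[1] <= a[0] or s[0] >= a[1] for a in accepted):
            accepted.append(s)
    return sorted(accepted, key=lambda x: x[0])

# Toy outputs of two sub-task recognizers over one sentence of 5 tokens:
np_spans = [(0, 2, "NP", 0.9), (3, 5, "NP", 0.6)]  # noun-phrase sub-task
vp_spans = [(2, 4, "VP", 0.8)]                     # verb-phrase sub-task

chunks = resolve_conflicts(np_spans + vp_spans)
# The low-confidence NP (3, 5) conflicts with the accepted VP (2, 4)
# and is dropped, leaving a consistent, non-overlapping chunking.
```

The key design point mirrored here is that disagreement between sub-tasks is expected by construction, so the merge step, rather than any single model, is what guarantees a consistent final chunking.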
