Conference: International Workshop on Computational Processing of the Portuguese Language

Towards a Statistical-Enriched Corpus Containing Portuguese Collocations in Use: Reviewing Possible Extraction Tools

Abstract

Collocations pose a major challenge for natural language processing tasks, from machine translation to summarization. With the goal of building a corpus of collocations enriched with statistical information about them, this paper surveys four tools for extracting collocations. These tools allow us to collect sentences containing collocations and to gather statistics on this particular type of co-occurrence, such as Mutual Information and log-likelihood values.
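
The abstract names Mutual Information and log-likelihood as the statistics of interest but does not say how the surveyed tools compute them. As a rough illustration only, the sketch below uses the common textbook definitions for adjacent word pairs: pointwise mutual information in base 2, and the log-likelihood ratio (G²) over a 2x2 contingency table. The function names, the restriction to adjacent bigrams, and the toy Portuguese sentence are assumptions made for this example and are not taken from the paper.

```python
import math
from collections import Counter


def pmi(k11, left_total, right_total, n):
    """Pointwise mutual information (base 2) of a bigram.

    k11: count of the bigram; left_total / right_total: how often the
    first / second word fills the first / second slot; n: total bigrams.
    """
    return math.log2(k11 * n / (left_total * right_total))


def log_likelihood(k11, k12, k21, k22):
    """Log-likelihood ratio (G^2) for a 2x2 contingency table."""
    n = k11 + k12 + k21 + k22
    rows = (k11 + k12, k21 + k22)
    cols = (k11 + k21, k12 + k22)
    observed = ((k11, k12), (k21, k22))
    total = 0.0
    for i in range(2):
        for j in range(2):
            k = observed[i][j]
            if k > 0:  # a zero cell contributes nothing to the sum
                expected = rows[i] * cols[j] / n
                total += k * math.log(k / expected)
    return 2.0 * total


def score_bigram(tokens, w1, w2):
    """Return (PMI, G^2) for the adjacent pair (w1, w2) in a token list."""
    bigrams = Counter(zip(tokens, tokens[1:]))
    left = Counter(tokens[:-1])   # words seen in the first slot
    right = Counter(tokens[1:])   # words seen in the second slot
    n = len(tokens) - 1           # number of adjacent bigrams
    k11 = bigrams[(w1, w2)]
    if k11 == 0:
        raise ValueError("bigram does not occur in the token list")
    k12 = left[w1] - k11          # w1 followed by something else
    k21 = right[w2] - k11         # w2 preceded by something else
    k22 = n - k11 - k12 - k21     # bigrams involving neither slot
    return pmi(k11, left[w1], right[w2], n), log_likelihood(k11, k12, k21, k22)


if __name__ == "__main__":
    # Toy sentence, invented for illustration.
    toy = "o primeiro ministro disse que o primeiro ministro vai falar".split()
    print(score_bigram(toy, "primeiro", "ministro"))
```

Higher values of either score suggest that the two words co-occur more often than chance would predict. On very small counts PMI tends to overrate rare pairs, which is one reason log-likelihood is often reported alongside it.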