Journal: Computational Linguistics
Data-Intensive Text Processing with MapReduce, by Jimmy Lin and Chris Dyer

Abstract

The world has been blessed by the ever-growing World Wide Web and the associated vast amount of information available through commercial search engines. Many subfields of computational linguistics, such as speech recognition, machine translation, summarization, coreference resolution, question answering, and word sense disambiguation, are playing increasingly important roles in information extraction from the Web. At the same time, the Web itself is providing more and more data for computational linguists to study. For example, in the past decade, the amount of text available for speech recognition and machine translation research has increased by several orders of magnitude. Corpora consisting of hundreds of millions of words are no longer uncommon. This clearly poses serious computational challenges for traditional text processing approaches that run on a single computer. As a result, efficient distributed computing has become more crucial than ever.
