Conference: FTRA international conference on mobile, ubiquitous, and intelligent computing

Fast Big Textual Data Parsing in Distributed and Parallel Computing Environment

Abstract

The rapid development of scientific and technical fields has led to an enormous number of published scientific and technical articles. Systems have also been proposed that provide users with useful information extracted from these articles. Such systems must extract information from a massive number of documents quickly and reliably. However, legacy parsers such as the Stanford parser and Enju cannot handle a large number of documents, because they analyze a wide context range of each sentence during parsing and therefore take a long time to run. In this paper, we report on the development of a parser based on MapReduce, a distributed and parallel programming model. Our parser achieved about nineteen times better performance than one of the state-of-the-art legacy parsers.
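
The abstract does not describe the parser's internals, so the following is only a minimal sketch of how a MapReduce-style parsing job might be organized, not the authors' implementation. It assumes a map step that parses each document sentence by sentence and a reduce step that regroups the parses by document. The names `toy_parse`, `map_document`, `reduce_parses`, and `parse_corpus` are hypothetical, and `toy_parse` is a trivial tokenizer standing in for a real parser. A thread pool is used only to keep the sketch self-contained; an actual speedup would need separate processes or a cluster running a real MapReduce framework.

```python
import re
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def toy_parse(sentence):
    """Hypothetical stand-in for a real sentence parser: it only tokenizes."""
    return re.findall(r"\w+", sentence)


def map_document(item):
    """Map step: emit (doc_id, (sentence_index, parse)) for each sentence."""
    doc_id, text = item
    # Naive sentence splitting on whitespace that follows ., ! or ?
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    return [(doc_id, (i, toy_parse(s))) for i, s in enumerate(sentences)]


def reduce_parses(pairs):
    """Reduce step: group parses by document and restore sentence order."""
    grouped = defaultdict(list)
    for doc_id, indexed_parse in pairs:
        grouped[doc_id].append(indexed_parse)
    return {
        doc_id: [parse for _, parse in sorted(items, key=lambda p: p[0])]
        for doc_id, items in grouped.items()
    }


def parse_corpus(docs, workers=4):
    """Run the map step concurrently over documents, then reduce the results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        mapped = pool.map(map_document, docs.items())
        pairs = [pair for chunk in mapped for pair in chunk]
    return reduce_parses(pairs)
```

Calling `parse_corpus({"a": "Parsers are slow. MapReduce helps!"})` returns a dictionary keyed by document id, with one token list per sentence. Because each document is mapped independently, the same structure can be spread across many workers without the map tasks needing to communicate.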
