Fast Big Textual Data Parsing in Distributed and Parallel Computing Environment


Abstract

Scientific and technical articles are currently being published in tremendous numbers, driven by rapid development in scientific and technical fields. Systems are also being proposed that give users useful information by extracting it from these articles. Such systems must extract information from a massive number of documents quickly and reliably. However, legacy parsers such as Stanford and Enju cannot handle a large number of documents, because they analyze a wide context range of each sentence during parsing and therefore take a long time to run. In this paper, we therefore report on the development of a parser based on MapReduce, a distributed and parallel programming model. Our parser achieved about nineteen times better performance than one of the state-of-the-art legacy parsers.

Bibliographic details

Source
FTRA international conference on mobile, ubiquitous, and intelligent computing, 2014, 5 pages
Authors
Jung-Ho Um; Chang-Hoo Jeong; Sung-Pil Choi; Seungwoo Lee; Hanmin Jung
Original format: PDF
Chinese Library Classification: Computer networks
Keywords
distributed and parallel computing; big textual data; parsing; MapReduce

Similar documents

1. Distributed and Parallel Big Textual Data Parsing for Social Sensor Network [J]. Jung-Ho Um, Chang-Hoo Jeong, Sung-Pil Choi, International Journal of Distributed Sensor Networks. 2013, No. 3
2. Fast distributed and parallel pre-processing on massive satellite data using grid computing [J]. Wongoo Lee, Yunsoo Choi, Kangryul Shon, Journal of Central South University (English Edition). 2014, No. 10
3. Data placement in massively distributed environments for fast parallel mining of frequent itemsets [J]. Salah Saber, Akbarinia Reza, Masseglia Florent, Knowledge and Information Systems. 2017, No. 1
4. Fast Big Textual Data Parsing in Distributed and Parallel Computing Environment [C]. Jung-Ho Um, Chang-Hoo Jeong, Sung-Pil Choi, FTRA international conference on mobile, ubiquitous, and intelligent computing. 2014
5. Scalable parallel computing on clouds: Efficient and scalable architectures to perform pleasingly parallel, MapReduce and iterative data intensive computations on cloud environments [D]. Gunarathne, Thilina. 2014
6. Using RxNorm for Cross-institutional Formulary Data Normalization Within a Distributed Grid-computing Environment [O]. Rob Wynden, Nick Anderson, Marco Casale, 2011
7. Distributed and Parallel Big Textual Data Parsing for Social Sensor Network [O]. Jung-Ho Um, Chang-Hoo Jeong, Sung-Pil Choi, 2013
8. Implementation of a Pseudo-Bending Seismic Travel-Time Calculator in a Distributed Parallel Computing Environment [R]. Ballard, S., Young, C., Hipp, J., 2008
