Extraction Research about Parallelization of Named Entity Based on Hadoop Platform


Abstract

As the era of big data approaches, data is becoming more and more important. Faced with such massive volumes of data, quickly identifying and extracting the content in a field that users are interested in is an urgent problem. To identify that content, we use the NLPIR Chinese word segmentation framework to segment the text and tag parts of speech, and we recognize named entities from the part-of-speech tags. For extraction, we use Hadoop, a parallel cluster platform built on the MapReduce framework for big data: the Hadoop Distributed File System (HDFS) provides efficient data access, and Map and Reduce tasks extract the named-entity information. The job extracts the required information from the interactive encyclopedia and stores it in the knowledge base. This work thus implements parallel extraction of named-entity information on the Hadoop platform.
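
The pipeline described in the abstract can be illustrated with a small, single-process sketch of the Map and Reduce steps. This is not the authors' implementation: it assumes the input has already been segmented and tagged into `word/tag` tokens, and it assumes the ICTCLAS/NLPIR tag prefixes `nr` (person), `ns` (place) and `nt` (organization) mark named entities. The names `ENTITY_TAGS`, `map_entities`, `reduce_counts` and `run_job` are hypothetical.

```python
from itertools import groupby

# Assumption: ICTCLAS/NLPIR-style tag prefixes that mark named entities.
ENTITY_TAGS = {"nr": "PERSON", "ns": "PLACE", "nt": "ORGANIZATION"}


def map_entities(tagged_line):
    """Map step: emit ((entity, type), 1) for every named-entity token in a line."""
    for token in tagged_line.split():
        word, sep, tag = token.rpartition("/")
        if not sep or not word:
            continue  # skip tokens that are not a word/tag pair
        for prefix, label in ENTITY_TAGS.items():
            if tag.startswith(prefix):
                yield (word, label), 1
                break


def reduce_counts(key, values):
    """Reduce step: sum the counts emitted for one (entity, type) key."""
    return key, sum(values)


def run_job(lines):
    """Simulate the shuffle locally: sort mapper output, group by key, then reduce."""
    pairs = sorted(kv for line in lines for kv in map_entities(line))
    grouped = groupby(pairs, key=lambda kv: kv[0])
    return dict(reduce_counts(key, (v for _, v in group)) for key, group in grouped)


if __name__ == "__main__":
    sample = ["张三/nr 在/p 北京/ns 工作/v", "北京/ns 大学/n"]
    for (entity, label), count in run_job(sample).items():
        print(entity, label, count)
```

On a real cluster the sort and grouping would be done by Hadoop's shuffle phase rather than by `run_job`, with the input lines read from HDFS; the mapper and reducer logic would otherwise stay the same in spirit.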


Bibliographic details

Source: International Conference on Advanced Design and Manufacturing Engineering, 2013, 4 pages
Authors: Quan Shi; Zhendong Yang; Lu Xu
Original format: PDF
Chinese Library Classification: TH16-53
Keywords: Hadoop; Chinese word segmentation; Named entity recognition; Named entity extraction

