Journal: IEEE Transactions on Knowledge and Data Engineering

Open Relation Extraction for Chinese Noun Phrases



Abstract

Relation Extraction (RE) aims to harvest relational facts from text. Most existing research targets knowledge acquisition from sentences, where subject-verb-object structures are usually treated as signals that a relation exists. In contrast, relational facts expressed within noun phrases are highly implicit. Previous work mostly relies on human-compiled assertions and textual patterns in English to address noun phrase-based RE. For Chinese, the corresponding task is non-trivial because Chinese is a highly analytic language with flexible expressions. Additionally, noun phrases tend to be grammatically incomplete, and explicit mentions of predicates are often missing. In this article, we present an unsupervised Noun Phrase-based Open RE system for the Chinese language (NPORE), which employs a three-layer data-driven architecture. The system contains three components: a Modifier-sensitive Phrase Segmenter, a Candidate Relation Generator, and a Missing Relation Predicate Detector. It integrates a graph clique mining algorithm to chunk Chinese noun phrases, taking into account how relations are expressed. We further propose a probabilistic method with knowledge priors and a hypergraph-based random walk process to detect missing relation predicates. Experiments over Chinese Wikipedia show that NPORE outperforms state-of-the-art systems, extracting 55.2 percent more relations than the most competitive baseline at a comparable precision of 95.4 percent.
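The abstract does not spell out how the hypergraph-based random walk is defined, so the following is only a minimal sketch of a generic random walk with restart on a weighted hypergraph, not the NPORE procedure itself. It uses the common transition rule: from a vertex, pick an incident hyperedge with probability proportional to its weight, then move to a vertex of that hyperedge chosen uniformly. The function name `hypergraph_random_walk`, the restart probability, and the toy vertices and hyperedges (`ctx1`, `capital_of`, and so on) are all hypothetical and chosen purely for illustration.

```python
from collections import defaultdict


def hypergraph_random_walk(hyperedges, weights, seed, restart=0.15, iters=100):
    """Random walk with restart on a weighted hypergraph (generic sketch).

    hyperedges: dict mapping hyperedge name -> set of vertices
    weights:    dict mapping hyperedge name -> positive weight
    seed:       vertex the walk starts from and restarts at
    Returns a dict mapping vertex -> visiting probability.
    """
    vertices = sorted({v for members in hyperedges.values() for v in members})
    if seed not in vertices:
        raise ValueError("seed vertex does not occur in any hyperedge")

    # Index the hyperedges incident to each vertex.
    incident = defaultdict(list)
    for edge, members in hyperedges.items():
        for v in members:
            incident[v].append(edge)

    # Start with all probability mass on the seed vertex.
    score = {v: 1.0 if v == seed else 0.0 for v in vertices}
    for _ in range(iters):
        # A fraction `restart` of the mass jumps back to the seed each step.
        nxt = {v: (restart if v == seed else 0.0) for v in vertices}
        for v, mass in score.items():
            if mass == 0.0:
                continue
            total_w = sum(weights[e] for e in incident[v])
            for e in incident[v]:
                # Choose hyperedge e proportionally to its weight, then
                # spread the mass uniformly over the vertices in e.
                share = (1 - restart) * mass * (weights[e] / total_w)
                share /= len(hyperedges[e])
                for u in hyperedges[e]:
                    nxt[u] += share
        score = nxt
    return score


# Hypothetical toy data: each hyperedge groups an entity-pair type with the
# predicates observed alongside it; weights stand in for co-occurrence counts.
hyperedges = {
    "ctx1": {"country-city", "capital_of", "located_in"},
    "ctx2": {"country-city", "located_in"},
    "ctx3": {"person-city", "born_in", "located_in"},
}
weights = {"ctx1": 2.0, "ctx2": 1.0, "ctx3": 1.0}

scores = hypergraph_random_walk(hyperedges, weights, seed="country-city")
predicates = ["capital_of", "located_in", "born_in"]
ranked = sorted(predicates, key=lambda p: scores[p], reverse=True)
print(ranked)
```

In this toy setting, predicates that share hyperedges with the seed accumulate more probability mass than predicates reachable only through intermediate vertices, which is the intuition behind using such a walk to rank candidate predicates for a noun phrase that lacks one.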
