Discovering Compound and Proper Nouns

Abstract

Identifying appropriate text tokens (words or word sequences that represent concepts) is one of the most important tasks in text preprocessing and can strongly influence the final results of text analysis. In this paper, we introduce a new approach to discovering compound nouns, including proper compound nouns. Our approach combines data mining methods with shallow lexical analysis. We propose a simple pattern language for specifying the grammatical patterns that extracted compound nouns must satisfy. The method requires words to be annotated with part-of-speech tags and is, to that extent, language-dependent. Building on the GSP data mining algorithm, we propose a modification, T-GSP, for extracting frequent text patterns, in particular frequent word sequences that satisfy given grammatical rules. The resulting sequences are treated as candidates for compound nouns. Experiments show that the method achieves very high quality.
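
The general idea described in the abstract (level-wise mining of frequent word sequences, constrained by a part-of-speech pattern) can be illustrated with a minimal sketch. This is not the authors' T-GSP algorithm or their pattern language, neither of which is specified in this record. It assumes contiguous word sequences, sentence-level support counting, a regular expression over tags as a stand-in for the pattern language, and a hypothetical toy corpus with an assumed tagset (DET, ADJ, NOUN, PROPN, VERB). The names `CORPUS`, `PATTERN`, `frequent_sequences`, and `compound_noun_candidates` are illustrative only.

```python
import re
from collections import defaultdict

# Hypothetical toy corpus: each sentence is a list of (word, POS tag) pairs.
CORPUS = [
    [("the", "DET"), ("data", "NOUN"), ("mining", "NOUN"),
     ("method", "NOUN"), ("works", "VERB")],
    [("a", "DET"), ("data", "NOUN"), ("mining", "NOUN"),
     ("algorithm", "NOUN"), ("runs", "VERB")],
    [("Warsaw", "PROPN"), ("University", "PROPN"), ("uses", "VERB"),
     ("data", "NOUN"), ("mining", "NOUN")],
    [("students", "NOUN"), ("visit", "VERB"),
     ("Warsaw", "PROPN"), ("University", "PROPN")],
]

# Assumed grammatical pattern over space-separated tags:
# optional adjectives, then at least two nouns or proper nouns.
PATTERN = re.compile(r"(ADJ )*((NOUN|PROPN) ){1,}(NOUN|PROPN)")


def frequent_sequences(corpus, min_support=2, max_len=4):
    """Mine contiguous (word, tag) sequences level by level.

    Support is the number of sentences containing the sequence.
    A length-k candidate is counted only if its length-(k-1) prefix
    and suffix were frequent (Apriori-style pruning, as in GSP).
    """
    frequent = {}
    previous = None  # frequent sequences of length k-1
    for k in range(1, max_len + 1):
        support = defaultdict(int)
        for sentence in corpus:
            seen = set()  # count each sequence once per sentence
            for i in range(len(sentence) - k + 1):
                gram = tuple(sentence[i:i + k])
                if previous is not None and (
                    gram[:-1] not in previous or gram[1:] not in previous
                ):
                    continue
                seen.add(gram)
            for gram in seen:
                support[gram] += 1
        current = {g: s for g, s in support.items() if s >= min_support}
        if not current:
            break
        frequent.update(current)
        previous = current
    return frequent


def compound_noun_candidates(corpus, min_support=2, max_len=4):
    """Keep frequent sequences whose tag string matches PATTERN."""
    result = {}
    for gram, support in frequent_sequences(corpus, min_support, max_len).items():
        tags = " ".join(tag for _, tag in gram)
        if PATTERN.fullmatch(tags):
            result[" ".join(word for word, _ in gram)] = support
    return result


if __name__ == "__main__":
    for phrase, support in compound_noun_candidates(CORPUS).items():
        print(phrase, support)
```

In this sketch the grammatical filter is applied after mining for simplicity; the abstract suggests that T-GSP takes the grammatical rules into account during extraction, which would prune the search space earlier.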

Bibliographic record

Source
International Conference on Rough Sets and Intelligent Systems Paradigms (RSEISP 2007); June 28-30, 2007; Warsaw (PL) | 2007 | pp. 505-515 | 11 pages
Conference location: Warsaw (PL)
Authors
Grzegorz Protaziuk; Marzena Kryszkiewicz; Henryk Rybinski; Alexandre Delteil;
Affiliations

ICS, Warsaw University of Technology;

France Telecom R&D;

Original format: PDF
Language: English
Chinese Library Classification: Computing technology, computer technology;
Keywords
multiword terms; compound nouns; proper nouns; frequent word sequences; frequent text patterns; text mining;
