Non-parametric Bayesian Segmentation of Japanese Noun Phrases

Abstract

A key factor of high quality word segmentation for Japanese is a high-coverage dictionary, but it is costly to manually build such a lexical resource. Although external lexical resources for human readers are potentially good knowledge sources, they have not been utilized due to differences in segmentation criteria. To supplement a morphological dictionary with these resources, we propose a new task of Japanese noun phrase segmentation. We apply non-parametric Bayesian language models to segment each noun phrase in these resources according to the statistical behavior of its supposed constituents in text. For inference, we propose a novel block sampling procedure named hybrid type-based sampling, which has the ability to directly escape a local optimum that is not too distant from the global optimum. Experiments show that the proposed method efficiently corrects the initial segmentation given by a morphological analyzer.
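The idea of segmenting a noun phrase "according to the statistical behavior of its supposed constituents in text" can be illustrated with a small sketch. The abstract does not specify the model, so everything here is an assumption: a Dirichlet process unigram model with the standard Chinese restaurant process predictive rule, a hypothetical base measure (`base_prob`), made-up toy counts, and exhaustive enumeration of candidate splits in place of the paper's hybrid type-based sampling, which is not reproduced here.

```python
from collections import Counter
from itertools import combinations


def segmentations(phrase):
    """Yield every split of `phrase` into contiguous, non-empty pieces."""
    n = len(phrase)
    for k in range(n):  # k = number of internal boundaries
        for cuts in combinations(range(1, n), k):
            bounds = (0, *cuts, n)
            yield [phrase[a:b] for a, b in zip(bounds, bounds[1:])]


def base_prob(word, n_chars=3000, p_stop=0.5):
    """Hypothetical base measure: uniform characters, geometric length."""
    length = len(word)
    return (1 - p_stop) ** (length - 1) * p_stop * (1 / n_chars) ** length


def score(words, counts, alpha=1.0):
    """Probability of `words` under a DP unigram model, given prior counts.

    Each word is drawn with probability (n_w + alpha * P0(w)) / (n + alpha),
    and the counts are updated after every draw.
    """
    counts = Counter(counts)  # work on a copy
    total = sum(counts.values())
    prob = 1.0
    for w in words:
        prob *= (counts[w] + alpha * base_prob(w)) / (total + alpha)
        counts[w] += 1
        total += 1
    return prob


# Toy counts standing in for word frequencies observed in text (invented).
toy_counts = {"自然": 5, "言語": 6, "処理": 4}
phrase = "自然言語処理"

best = max(segmentations(phrase), key=lambda s: score(s, toy_counts))
print(best)  # prints ['自然', '言語', '処理']
```

Splits whose pieces are frequent in the text counts score far higher than splits that introduce unseen strings, since unseen strings fall back on the small base-measure probability. Exhaustive enumeration is only feasible for short phrases, which is one reason sampling-based inference is used in practice.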

Bibliographic details

Source: Conference on Empirical Methods in Natural Language Processing, 2011, 11 pages
Authors: Yugo Murawaki; Sadao Kurohashi
Original format: PDF
Chinese Library Classification: Programming; software engineering

Similar documents

1. N'-ellipsis and the structure of noun phrases in Chinese and Japanese [J]. Mamoru Saito, T.-H. Jonah Lin, and Keiko Murasugi. Journal of East Asian Linguistics. 2008, no. 3
2. An MEG Study of Temporal Characteristics of Semantic Integration in Japanese Noun Phrases [J]. Hirohisa KIGUCHI, Nobuhiko ASAKURA. IEICE Transactions on Information and Systems. 2008, no. 6
3. Extraction of candidates having semantic relations of Japanese noun phrases "NP₁ 'no' NP₂" by dependency structures [J]. Shosaku Tanaka, Yoichi Tomiura, Toru Hitaka. 電子情報通信学会技術研究報告. 言語理解とコミュニケーション. Natural Language Understanding and Models of Communication. 2001, no. 351
4. Non-parametric Bayesian Segmentation of Japanese Noun Phrases [C]. Yugo Murawaki, Sadao Kurohashi. Conference on Empirical Methods in Natural Language Processing; EMNLP 2011. 2012
5. INTERNAL NOUN PHRASE RELATIONS OF COMPLEX NOUN PHRASES AND THE CONCEPT OF "AGENCY." [D]. HIGGINS, SUSAN GAYLE. 1973
6. Semantic distribution study of noun-noun compounds in the Japanese CT clinical reports [O]. Naoki Nishimoto, Satoshi Terae, Guoqian Jiang. 2006
7. Type Compound Noun Phrases in Japanese [O]. 山森良枝. 2005
Machine translation: <plural indefinite noun phrase + verbal noun> compound noun phrases in Japanese
