Unsupervised Word Segmentation in Context

Abstract

This paper extends existing word segmentation models to take non-linguistic context into account. It improves the token F-score of top-performing segmentation models by 2.5% on a 27k-utterance dataset. We posit that word segmentation is easier in context because the learner is not trying to access irrelevant lexical items. We use topics from a Latent Dirichlet Allocation model as a proxy for "activities" contexts to label the Providence corpus. We present Adaptor Grammar models that use these context labels, and we study their performance with and without context annotations at test time.

Bibliographic details

Source
International Conference on Computational Linguistics | 2014 | pp. 2326-2334 | 9 pages
Authors
Gabriel Synnaeve; Isabelle Dautriche; Benjamin Boerschinger; Mark Johnson; Emmanuel Dupoux
Original format: PDF

Similar literature

1. The Left and Right Context of a Word: Overlapping Chinese Syllable Word Segmentation with Minimal Context [J]. Mike Tian-Jian Jiang, Tsung-Hsien Lee, Wen-Lian Hsu. ACM Transactions on Asian Language Information Processing. 2013, No. 1
2. Unsupervised Word Segmentation and Lexicon Discovery Using Acoustic Word Embeddings [J]. Herman Kamper, Aren Jansen, Sharon Goldwater. IEEE/ACM Transactions on Audio, Speech, and Language Processing. 2016, No. 4
3. Unsupervised similarity-based word sense disambiguation using context vectors and sentential word importance [J]. Franz Kurfess. Computing Reviews. 2013, No. 2
4. Unsupervised Word Segmentation in Context [C]. Gabriel Synnaeve, Isabelle Dautriche, Benjamin Boerschinger, International Conference on Computational Linguistics. 2014
5. Cross-Context Statistical Word Segmentation in Infancy [D]. Antovich, Dylan Matthew. 2019
6. Joint retinal layer and fluid segmentation in OCT scans of eyes with severe macular edema using unsupervised representation and auto-context [O]. Alessio Montuoro, Sebastian M. Waldstein, Bianca S. Gerendas, 2017
7. Modelling function words improves unsupervised word segmentation [O]. Mark Johnson, Anne Christophe, Katherine Demuth, 2015
