Interpolating Between Types and Tokens by Estimating Power-Law Generators


Abstract

Standard statistical models of language fail to capture one of the most striking properties of natural languages: the power-law distribution in the frequencies of word tokens. We present a framework for developing statistical models that generically produce power-laws, augmenting standard generative models with an adaptor that produces the appropriate pattern of token frequencies. We show that taking a particular stochastic process - the Pitman-Yor process - as an adaptor justifies the appearance of type frequencies in formal analyses of natural language, and improves the performance of a model for unsupervised learning of morphology.
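
The generator-plus-adaptor idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the standard two-parameter Chinese restaurant process view of the Pitman-Yor process: with discount a and concentration b, the next token joins existing table k with probability (n_k - a)/(n + b) and opens a new table with probability (K*a + b)/(n + b), where n tokens are already seated at K tables. The names `pitman_yor_adaptor` and `uniform_generator`, the 1000-word lexicon, and the parameter values are hypothetical choices made for illustration.

```python
import random
from collections import Counter


def uniform_generator(rng):
    """Hypothetical generator: draws a word type uniformly from a 1000-word lexicon."""
    return rng.randrange(1000)


def pitman_yor_adaptor(n_tokens, discount, concentration, generator, rng):
    """Seat n_tokens tokens using the two-parameter Chinese restaurant process.

    Each new table is labeled with a word type drawn from `generator`;
    every token seated at a table reuses that table's label.
    Returns (tokens, table_labels, table_counts).
    """
    if not (0 <= discount < 1 and concentration > -discount):
        raise ValueError("need 0 <= discount < 1 and concentration > -discount")
    table_counts = []  # number of tokens seated at each table
    table_labels = []  # word type assigned to each table by the generator
    tokens = []
    for _ in range(n_tokens):
        if not table_counts:
            k = 0  # the first token always opens a new table
        else:
            # Unnormalized weights; together they sum to n + concentration.
            new_weight = len(table_counts) * discount + concentration
            weights = [c - discount for c in table_counts] + [new_weight]
            k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(table_counts):
            table_counts.append(1)
            table_labels.append(generator(rng))
        else:
            table_counts[k] += 1
        tokens.append(table_labels[k])
    return tokens, table_labels, table_counts


if __name__ == "__main__":
    rng = random.Random(0)
    tokens, labels, counts = pitman_yor_adaptor(2000, 0.9, 1.0, uniform_generator, rng)
    print("tables:", len(counts), "distinct types:", len(set(tokens)))
    print("most frequent types:", Counter(tokens).most_common(5))
```

In this sketch the generator is consulted once per table rather than once per token, so the number of tables controls how much the generator sees of the data. A larger discount tends to open more tables, which is one way to picture the interpolation between type-like and token-like counts that the title refers to.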


Bibliographic record

Source
Annual Conference on Neural Information Processing Systems (NIPS); December 5-10, 2005; British Columbia (CA) | 2005 | pp. 459-466 | 8 pages
Conference location: British Columbia (CA)
Authors
Sharon Goldwater; Thomas L. Griffiths; Mark Johnson
Author affiliation

Department of Cognitive and Linguistic Sciences, Brown University, Providence, RI 02912, USA

Original format: PDF
Language: English
Chinese Library Classification: Information processing
