A Preliminary Study on Probabilistic Models for Chinese Abbreviations

Abstract

Chinese abbreviations are widely used in the modern Chinese texts. They are a special form of unknown words, including many named entities. This results in difficulty for correct Chinese processing. In this study, the Chinese abbreviation problem is regarded as an error recovery problem in which the suspect root words are the "errors" to be recovered from a set of candidates. Such a problem is mapped to an HMM-based generation model for both abbreviation identification and root word recovery, and is integrated as part of a unified word segmentation model when the input extends to a complete sentence. Two major experiments are conducted to test the abbreviation models. In the first experiment, an attempt is made to guess the abbreviations of the root words. An accuracy rate of 72% is observed. In contrast, a second experiment is conducted to guess the root words from abbreviations. Some submodels could achieve as high as 51% accuracy with the simple HMM-based model. Some quantitative observations against heuristic abbreviation knowledge about Chinese are also observed.
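
To illustrate the idea of recovering root words from an abbreviation with an HMM, the following is a minimal sketch, not the authors' implementation. It assumes each abbreviation character is an observation emitted by one hidden root word and decodes the most probable root-word sequence with the standard Viterbi algorithm. The function name `viterbi_recover`, the candidate lists, the smoothing floor, and every probability value are hypothetical toy choices made for this example; the abstract does not specify the model's parameters.

```python
import math

def viterbi_recover(abbrev, candidates, emit, trans, start):
    """Return (root_word_path, log_prob) for an abbreviation string.

    abbrev:     abbreviation characters, treated as HMM observations
    candidates: dict char -> list of candidate root words (hidden states)
    emit:       dict (word, char) -> P(char | word)
    trans:      dict (prev_word, word) -> P(word | prev_word)
    start:      dict word -> P(word) at the first position
    """
    floor = 1e-12  # hypothetical floor for events missing from the tables

    def lp(table, key):
        return math.log(table.get(key, floor))

    # best[w] = (log-prob of the best path ending in w, that path)
    best = {
        w: (lp(start, w) + lp(emit, (w, abbrev[0])), [w])
        for w in candidates[abbrev[0]]
    }
    for ch in abbrev[1:]:
        nxt = {}
        for w in candidates[ch]:
            # Pick the best predecessor for w by transition-extended score.
            score, path = max(
                ((s + lp(trans, (p, w)), pth) for p, (s, pth) in best.items()),
                key=lambda t: t[0],
            )
            nxt[w] = (score + lp(emit, (w, ch)), path + [w])
        best = nxt
    score, path = max(best.values(), key=lambda t: t[0])
    return path, score

# Toy data (all values are invented for illustration only).
candidates = {"台": ["台灣", "台北"], "大": ["大學", "大陸"]}
start = {"台灣": 0.6, "台北": 0.4}
trans = {
    ("台灣", "大學"): 0.7, ("台灣", "大陸"): 0.3,
    ("台北", "大學"): 0.5, ("台北", "大陸"): 0.5,
}
emit = {
    ("台灣", "台"): 0.9, ("台北", "台"): 0.8,
    ("大學", "大"): 0.9, ("大陸", "大"): 0.6,
}

path, score = viterbi_recover("台大", candidates, emit, trans, start)
print(path, score)
```

With these toy numbers the decoder selects 台灣 followed by 大學 as the root words of 台大. The reverse task mentioned in the abstract (guessing an abbreviation from root words) would need a different search over candidate characters and is not covered by this sketch.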

Bibliographic record

Source
42nd Annual Meeting of the Association for Computational Linguistics | 2004 | pp. 1-8 | 8 pages
Authors
Jing-Shin Chang; Yu-Tso Lai
Original format: PDF
Chinese Library Classification: computer software

