Isarn Dharma word segmentation

Abstract

This paper presents Isarn Dharma word segmentation based on the Isarn Dharma writing system and a dictionary. The input text is first segmented into a sequence of Isarn Dharma Character Clusters (IDCCs), where each IDCC is a group of characters that the Isarn Dharma writing system treats as inseparable. The IDCC sequence is then matched against the dictionary with the IDCC longest matching algorithm to find the most suitable word segmentation. Grouping rules are then applied to group adjacent remaining IDCCs that do not match any Isarn word in the dictionary. To evaluate the proposed technique, Isarn literature, Jataka, legend, and Buddha foretell texts were used as test data, and the proposed method (IDCC longest matching combined with the grouping rules) was compared with longest matching and IDCC longest matching. The experimental results show F-measures of 80.15%, 85.06%, and 86.07% for longest matching, IDCC longest matching, and the proposed method, respectively.
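The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the cluster-formation rules or the grouping rules, so the sketch assumes the text has already been split into clusters (Latin placeholder strings stand in for IDCCs), and it assumes the simplest possible grouping rule, namely merging each run of adjacent unmatched clusters into a single unknown token. The function name `segment` and the toy dictionary are hypothetical.

```python
from typing import List, Set


def segment(clusters: List[str], dictionary: Set[str]) -> List[str]:
    """Greedy longest matching over clusters, then grouping of leftovers.

    `clusters` is a pre-computed sequence of inseparable units (stand-ins
    for IDCCs). `dictionary` holds known words as plain strings.
    """
    words: List[str] = []
    unknown: List[str] = []  # run of adjacent clusters with no dictionary match
    i = 0
    while i < len(clusters):
        match_end = None
        # Try the longest span of clusters starting at i first.
        for j in range(len(clusters), i, -1):
            if "".join(clusters[i:j]) in dictionary:
                match_end = j
                break
        if match_end is None:
            # No word starts here: hold the cluster for grouping.
            unknown.append(clusters[i])
            i += 1
        else:
            # Assumed grouping rule: flush the unmatched run as one token.
            if unknown:
                words.append("".join(unknown))
                unknown = []
            words.append("".join(clusters[i:match_end]))
            i = match_end
    if unknown:
        words.append("".join(unknown))
    return words


if __name__ == "__main__":
    toy_dictionary = {"kala", "ka", "ma"}  # hypothetical entries
    toy_clusters = ["ka", "la", "ma", "xo", "yu", "ka"]
    print(segment(toy_clusters, toy_dictionary))
    # ['kala', 'ma', 'xoyu', 'ka']
```

Matching on whole clusters rather than single characters means a candidate word boundary can never fall inside an inseparable unit, which is the motivation the abstract gives for segmenting into IDCCs before the dictionary lookup.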

Bibliographic record

Source: International Conference on Control, Automation and Information Sciences, 2013, pp. 53-57 (5 pages)
Authors: Somsap S.; Seresangtakul P.
Original format: PDF
