Conference: International Conference on Intelligent Computing and Control Systems

Depicting a Neural Model for Lemmatization and POS Tagging of Words from Palaeographic Stone Inscriptions

Abstract

Lemmatization, which strips inflectional endings and returns the base form of a word, is an essential step before part-of-speech (POS) tagging in morphological analysis. POS tagging assigns each word in a text to a grammatical category. Because Tamil words are heavily inflected and combined, lemmatization and the classification and prediction of POS tags are difficult. Automated tools are rare even for modern Tamil, and statistical methods and techniques are lacking for palaeographic Tamil, such as texts from stone inscriptions, where words are combined, stacked, overlapped, and compounded without being split into morphemes or lemmas. The proposed work overcomes the complexity of splitting and classifying ancient words by designing a neural model for POS tag classification and prediction of words from an 11th-century palaeographic stone inscription script. A Bi-LSTM model with a word-vector embedding layer is trained for POS tagging; it classifies words into tags and predicts the tags of words in any new script it is given, assigning syntactic tags efficiently. The proposed model achieves 96.43% accuracy in comparison with existing work in the field.
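
The data flow the abstract describes (word embedding, a bidirectional LSTM, then a per-word tag decision) can be illustrated with a minimal sketch. This is not the authors' implementation: the layer sizes, the placeholder tokens, and the three-tag set are all assumptions made for illustration, the weights are random and untrained, and only the forward pass is shown, using the Python standard library.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class LSTMCell:
    """One LSTM direction; each gate's weights act on the concatenation [x, h]."""

    def __init__(self, input_dim, hidden_dim, rng):
        self.hidden_dim = hidden_dim
        # One weight matrix and bias per gate: input, forget, output, candidate.
        self.w = {g: rand_matrix(hidden_dim, input_dim + hidden_dim, rng) for g in "ifoc"}
        self.b = {g: [0.0] * hidden_dim for g in "ifoc"}

    def step(self, x, h, c):
        z = x + h  # list concatenation of input vector and previous hidden state
        pre = {g: [a + b for a, b in zip(matvec(self.w[g], z), self.b[g])] for g in "ifoc"}
        i = [sigmoid(v) for v in pre["i"]]
        f = [sigmoid(v) for v in pre["f"]]
        o = [sigmoid(v) for v in pre["o"]]
        cand = [math.tanh(v) for v in pre["c"]]
        c_new = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, cand)]
        h_new = [ov * math.tanh(cv) for ov, cv in zip(o, c_new)]
        return h_new, c_new

    def run(self, inputs):
        h = [0.0] * self.hidden_dim
        c = [0.0] * self.hidden_dim
        outputs = []
        for x in inputs:
            h, c = self.step(x, h, c)
            outputs.append(h)
        return outputs


class BiLSTMTagger:
    """Embedding -> forward and backward LSTM -> linear scores per tag (untrained)."""

    def __init__(self, vocab, tags, embed_dim=8, hidden_dim=6, seed=0):
        rng = random.Random(seed)
        self.word_to_id = {w: i + 1 for i, w in enumerate(vocab)}  # id 0 = unknown word
        self.tags = tags
        self.embedding = rand_matrix(len(vocab) + 1, embed_dim, rng)
        self.fwd = LSTMCell(embed_dim, hidden_dim, rng)
        self.bwd = LSTMCell(embed_dim, hidden_dim, rng)
        self.out = rand_matrix(len(tags), 2 * hidden_dim, rng)

    def tag_scores(self, words):
        vectors = [self.embedding[self.word_to_id.get(w, 0)] for w in words]
        left = self.fwd.run(vectors)               # reads the sentence left to right
        right = self.bwd.run(vectors[::-1])[::-1]  # reads right to left, then realigns
        return [matvec(self.out, l + r) for l, r in zip(left, right)]

    def predict(self, words):
        result = []
        for scores in self.tag_scores(words):
            best = max(range(len(scores)), key=scores.__getitem__)
            result.append(self.tags[best])
        return result


# Hypothetical vocabulary and tag set; real inscription tokens would replace these.
tagger = BiLSTMTagger(vocab=["w1", "w2", "w3"], tags=["NOUN", "VERB", "ADJ"])
print(tagger.predict(["w1", "w3", "unseen", "w2"]))
```

Because each word's score vector is computed from both the left-to-right and the right-to-left hidden states, the tag decision can draw on context from both sides of the word. With random weights the printed tags are arbitrary; a trained model would learn the embedding, gate, and output weights from tagged text.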