Automating Transliteration of Cuneiform from Parallel Lines with Sparse Data

Abstract

Cuneiform tablets are among the oldest textual artifacts, and the surviving corpus is comparable in extent to the texts written in Latin or ancient Greek. The Cuneiform Commentaries Project (CCP) at Yale University provides tracings of cuneiform tablets with annotated transliterations and translations. As part of our work on computational analysis of cuneiform script using 3D acquisition and word spotting, we present a first approach to automatically learning transliterations of cuneiform tablets from a corpus of parallel lines. Each parallel line pairs manually drawn cuneiform characters with their transliteration into an alphanumeric code. Since the cuneiform script is available only as raster data, we segment lines with a projection profile, extract Histogram of Oriented Gradients (HOG) features, detect outliers caused by tablet damage, and align the features with the transliteration. We apply methods from part-of-speech tagging to learn a correspondence between features and transliteration tokens. We evaluate point-wise classification with K-Nearest Neighbors (KNN) and a Support Vector Machine (SVM), and sequence classification with a Hidden Markov Model (HMM) and a Structured Support Vector Machine (SVM-HMM). We conclude that the sparsity of the data, inconsistent labeling, and the variety of tracing styles do not currently allow fully automated transliteration with the presented approach. Nevertheless, automated learning of transliterations remains highly relevant, because manual annotation at scale is not viable given how few experts can transcribe cuneiform tablets.
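
The line-segmentation step named in the abstract can be illustrated with a minimal sketch. It assumes a binarized tracing stored as a list of rows with 1 for ink and 0 for background; the function names, the `min_ink` threshold, and the toy image are hypothetical and are not taken from the paper, whose actual implementation is not described on this page.

```python
from typing import List, Tuple


def row_profile(image: List[List[int]]) -> List[int]:
    """Horizontal projection profile: count of ink pixels in each row."""
    return [sum(row) for row in image]


def segment_lines(image: List[List[int]], min_ink: int = 1) -> List[Tuple[int, int]]:
    """Return (start, end) row ranges, end exclusive, for each text line.

    A text line is taken to be a maximal run of consecutive rows whose
    ink count is at least `min_ink` (an assumed, illustrative threshold).
    """
    lines: List[Tuple[int, int]] = []
    start = None
    for y, ink in enumerate(row_profile(image)):
        if ink >= min_ink and start is None:
            start = y                      # a new line begins at this row
        elif ink < min_ink and start is not None:
            lines.append((start, y))       # the line ended on the previous row
            start = None
    if start is not None:                  # the last line reaches the bottom edge
        lines.append((start, len(image)))
    return lines


# Toy binary image (hypothetical data): two text lines separated by blank rows.
page = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [1, 1, 0, 1, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 1, 0, 1, 1],
    [0, 0, 0, 0, 0],
]

print(segment_lines(page))  # [(1, 3), (5, 6)]
```

On real tracings the profile is rarely exactly zero between lines, so a smoothed profile or a higher `min_ink` would likely be needed; the later stages (HOG extraction, outlier detection, alignment, classification) are outside the scope of this sketch.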


Bibliographic details

Source
IAPR International Conference on Document Analysis and Recognition | 2017 | pp. 615-620 | 6 pages
Authors
Bartosz Bogacz; Maximilian Klingmann; Hubert Mara
Original format
PDF
Keywords
Hidden Markov models; Feature extraction; Support vector machines; Task analysis; Image segmentation; Microsoft Windows; Histograms

Similar literature

1. The Parallel Image Processing Environment (PIPE): automated parallelization of satellite data analyses [J]. James J. Simpson, Timothy J. McIntire, Jared S. Berg, Concurrency and Computation. 2007, No. 1
2. Basker: Parallel sparse LU factorization utilizing hierarchical parallelism and data layouts [J]. Booth Joshua D., Ellingwood Nathan D., Thornquist Heidi K., Parallel Computing. 2017, No. octa
3. The Method Of Parallel Recognition And Parallel Optimization Based On Data Dependence With Sparse Matrix [J]. Navid Bazrkar, Payam Porkar, International Journal of Scientific & Technology Research. 2014, No. 7
4. Automating Transliteration of Cuneiform from Parallel Lines with Sparse Data [C]. Bartosz Bogacz, Maximilian Klingmann, Hubert Mara, IAPR International Conference on Document Analysis and Recognition. 2017
5. Automatic Sparse Computation Parallelization by Utilizing Domain-Specific Knowledge in Data Dependence Analysis [D]. Soltan Mohammadi, Mahdi. 2020
6. EST2uni: an open parallel tool for automated EST analysis and database creation with a data mining web interface and microarray expression data integration [O]. Javier Forment, Francisco Gilabert, Antonio Robles. 2008
7. Automating Wavefront Parallelization for Sparse Matrix Computations [O]. Anand Venkat, Mahdi Soltan Mohammadi, Jongsoo Park. 2016
8. Exploiting Data Sparsity in Parallel Matrix Powers Computations [R]. Knight, N., Carson, E., Demmel, J. 2013
