Source: Archives (journal)

THE DATING OF UNDATED MEDIEVAL CHARTERS

Abstract

Approximately 95% of all English charters issued between the Conquest in 1066 and the beginning of the reign of Edward II in 1307 bear no date. One of the major objectives of the DEEDS Project (DEEDS is an acronym for Documents of Early England Data Set) at the University of Toronto has been to estimate the dates of these undated documents automatically. This paper describes a World Wide Web user-interface toolkit for dating undated English charters, together with the two computationally intensive dating methodologies that underlie it: the Maximum Prevalence method and a distance-based method. The Maximum Prevalence method, the more accurate of the two, analyzes changes in the pattern of word and phrase usage, as derived from a carefully selected collection of thousands of dated documents that have been electronically transcribed and stored in the DEEDS corpus. Beyond the dating of documents, the toolkit has features for visualizing this pattern of change, which makes it useful to historians, archivists and linguists alike. The distance-based method computes a weighted sum of the dates of the documents in the DEEDS collection. The weights are determined by the similarity between the undated document and each dated document in the collection: the higher the similarity, the higher the weight, and the lower the similarity, the lower the weight. The performance of each dating method is reported on a test set, where the average absolute errors for the Maximum Prevalence and the distance-based methods are 7.6 and 12.5 years, respectively. A 'leave-one-out' cross-validation experiment performed on the more than 12,000 documents in the test set confirms the accuracy of the methodology. The strengths and weaknesses of each dating method are discussed. In addition, a full description of the DEEDS corpus from England and continental Europe is provided, including the kinds of metadata that have been compiled from it.
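The distance-based idea described in the abstract, a similarity-weighted sum of known dates evaluated by leave-one-out cross-validation, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the use of word bigrams as features, cosine similarity as the similarity measure, normalizing the weights so they sum to one, and the function names `shingles`, `cosine`, `estimate_date` and `leave_one_out_mae`. The abstract does not specify how DEEDS measures similarity or derives its weights.

```python
import math
from collections import Counter


def shingles(text, n=2):
    """Count the word n-grams in a text (assumed feature: word bigrams)."""
    words = text.lower().split()
    return Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))


def cosine(a, b):
    """Cosine similarity between two n-gram count vectors (assumed measure)."""
    dot = sum(count * b[gram] for gram, count in a.items() if gram in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def estimate_date(text, dated, n=2):
    """Estimate a year as the similarity-weighted average of known years.

    `dated` is a list of (text, year) pairs. More similar documents get
    higher weight; weights are normalized to sum to one (an assumption).
    """
    query = shingles(text, n)
    weighted = [(cosine(query, shingles(t, n)), year) for t, year in dated]
    total = sum(w for w, _ in weighted)
    if total == 0:
        # No overlap with any dated document: fall back to the plain mean.
        return sum(year for _, year in dated) / len(dated)
    return sum(w * year for w, year in weighted) / total


def leave_one_out_mae(dated, n=2):
    """Mean absolute error when each dated document is held out in turn."""
    errors = []
    for i, (text, year) in enumerate(dated):
        rest = dated[:i] + dated[i + 1:]
        errors.append(abs(estimate_date(text, rest, n) - year))
    return sum(errors) / len(errors)
```

Calling `estimate_date(charter_text, dated_corpus)` returns a fractional year, and `leave_one_out_mae(dated_corpus)` gives an average absolute error in years, the same kind of figure the abstract reports for its two methods. A toy corpus will not reproduce the reported errors.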