Conference: Australasian Joint Conference on Artificial Intelligence

Understanding People Relationship: Analysis of Digitised Historical Newspaper Articles

Abstract

The study of historical persons and their relationships gives an insight into the lives of people and the way society functioned in early times. Such information concerning Australian history can be gleaned from Trove's digitised collection of historical newspapers (1803-1954). This research aims to mine Trove's articles using closed and maximal association rule mining, along with visualization tools, to discover, conceptualize and understand the type, size and complexity of the notable relationships that existed between persons in historical Australia. Before the articles could be mined, they needed vigorous cleaning. Given the data's source, type and extraction methods, word-error rates were estimated at 50-75%. Pre-processing efforts were aimed at reducing errors originating from optical character recognition (OCR), natural language processing and some co-referencing, both within and between articles. Only after cleaning were the datasets able to return interesting associations at higher support thresholds.
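
As an illustration of the closed and maximal itemsets mentioned in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes each article has already been reduced to the set of person names it mentions. The names, the toy data and the helper functions (`frequent_itemsets`, `closed_itemsets`, `maximal_itemsets`) are all hypothetical, and the brute-force enumeration is only suitable for tiny inputs.

```python
from itertools import combinations

# Hypothetical toy data: each set holds the person names co-mentioned in one article.
articles = [
    {"Smith", "Jones", "Brown"},
    {"Smith", "Jones", "Brown"},
    {"Smith", "Jones"},
    {"Jones", "Brown"},
    {"Smith", "Taylor"},
]


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support count} for every itemset meeting min_support.

    Brute-force, level-by-level enumeration (illustrative only).
    """
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        found = False
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            count = sum(1 for t in transactions if candidate <= t)
            if count >= min_support:
                result[candidate] = count
                found = True
        if not found:
            # Support is anti-monotone: if no itemset of this size is
            # frequent, no larger itemset can be frequent either.
            break
    return result


def closed_itemsets(frequent):
    """Keep itemsets that have no proper superset with the same support."""
    return {
        s: c for s, c in frequent.items()
        if not any(s < t and c == frequent[t] for t in frequent)
    }


def maximal_itemsets(frequent):
    """Keep itemsets that have no frequent proper superset at all."""
    return {
        s: c for s, c in frequent.items()
        if not any(s < t for t in frequent)
    }


frequent = frequent_itemsets(articles, min_support=2)
closed = closed_itemsets(frequent)
maximal = maximal_itemsets(frequent)

for name, sets in (("frequent", frequent), ("closed", closed), ("maximal", maximal)):
    print(name, sorted((sorted(s), c) for s, c in sets.items()))
```

Every maximal itemset is also closed, so the maximal sets give the most compact summary of co-mentioned groups, while the closed sets additionally preserve the exact support of every frequent itemset.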