Journal: Information Technology

Using standoff properties for marking-up historical documents in the humanities


Abstract

Markup in the form of tags is often embedded into documents to describe formatting structures and other features, as in HTML on the Web. But in the humanities, the use of embedded markup for the transcription of historical documents leads to problems in the representation of overlapping features, and subjective variation in the use of different markup tags for the same features compromises interoperability of the transcriptions. "Standoff" techniques, in which the markup and the text it describes are stored separately, can help alleviate these problems. "Standoff properties" is a technique for recording textual properties that do not conform to a context-free grammar, and can freely overlap. This allows a divide-and-conquer approach to markup, whereby sets of markup properties can record different aspects of a text, which can then be recombined as needed. Despite these advantages, standoff techniques are usually considered impractical when both the underlying text and its markup are subject to change. To circumvent this problem, this paper describes a practical algorithm for updating a set of standoff markup properties separately from the text.
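
To make the idea concrete, here is a minimal sketch of standoff properties in Python. It is not the algorithm described in the paper, which the abstract does not spell out. The `Property` class, the `covered` and `shift` helpers, and the clamping rules for edits are all assumptions made for illustration. The sketch shows two independently kept property sets that overlap in a way embedded tags could not nest, their recombination, and a naive offset adjustment after the underlying text is edited.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Property:
    """A hypothetical standoff property: a named range over plain text."""
    name: str
    offset: int  # index of the first covered character
    length: int  # number of covered characters

    @property
    def end(self) -> int:
        return self.offset + self.length


def covered(text, props):
    """Return (name, covered substring) for each property, in input order."""
    return [(p.name, text[p.offset:p.end]) for p in props]


def shift(props, pos, removed, inserted):
    """Naively re-anchor properties after an edit to the text.

    The edit replaces `removed` characters starting at `pos` with
    `inserted` new characters. Boundaries that fall inside the removed
    range are clamped to `pos` (an assumed policy, not the paper's).
    """
    delta = inserted - removed
    edit_end = pos + removed
    result = []
    for p in props:
        start, end = p.offset, p.end
        if start >= edit_end:
            start += delta      # entirely after the edit: move with the text
        elif start > pos:
            start = pos         # began inside the removed range: clamp
        if end <= pos:
            pass                # entirely before the edit: unchanged
        elif end >= edit_end:
            end += delta        # ends after the edit: move with the text
        else:
            end = pos           # ended inside the removed range: clamp
        result.append(Property(p.name, start, max(0, end - start)))
    return result


text = "The quick brown fox"

# Two property sets kept separately; their ranges overlap without nesting.
formatting = [Property("emphasis", 4, 11)]  # "quick brown"
layout = [Property("line", 10, 9)]          # "brown fox"

# Recombine the sets as needed.
combined = formatting + layout

# Edit the text (insert "very " at index 4) and update the markup separately.
new_text = text[:4] + "very " + text[4:]
updated = shift(combined, pos=4, removed=0, inserted=5)
```

Because the text carries no tags, each set can be maintained on its own and merged by simple concatenation. A production approach would also need a policy for properties whose range is deleted entirely, which this sketch merely reduces to zero length.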
