COMPUTER-PROGRAM PRODUCTS AND METHODS FOR ANNOTATING AMBIGUOUS TERMS OF ELECTRONIC TEXT DOCUMENTS
Abstract
Computer-program products and methods for automatically annotating terms, such as ambiguous terms, in an electronic text document are disclosed. In one embodiment, a method of annotating a text document includes determining, by a computing device, a term of interest within the text document. The method further includes searching a data structure including incongruous term pairs (tx, tt) determined from a controlled vocabulary for the term of interest appearing as a term tt, wherein the term tt is a linguistic head of a term tx of the incongruous term pairs (tx, tt). The method further includes annotating the term of interest with a meaning provided by the controlled vocabulary only if a term tx of the incongruous term pairs (tx, tt) associated with the term of interest in the data structure is not present within a predetermined textual distance of the term of interest in the text document.
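The annotation rule described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the sample pair ("hot dog", "dog"), the vocabulary entry, the function names, the single-token restriction on the term of interest, and the choice to measure textual distance in tokens are all assumptions made for illustration. A term with no incongruous pairs on record is annotated here, which is one reading of the "only if" condition.

```python
import re

# Hypothetical data structure: head term t_t -> terms t_x whose linguistic
# head is t_t but whose meaning is incongruous with it.
INCONGRUOUS_PAIRS = {"dog": {"hot dog"}}

# Hypothetical controlled vocabulary: term -> meaning.
VOCABULARY = {"dog": "domesticated canine (Canis familiaris)"}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def contains_phrase(window, phrase):
    """Return True if the token list `phrase` occurs contiguously in `window`."""
    n = len(phrase)
    return any(window[i:i + n] == phrase for i in range(len(window) - n + 1))


def annotate(text, term, max_distance=3):
    """Return (token_index, meaning) for each annotated occurrence of `term`.

    An occurrence is skipped when any associated t_x lies entirely within
    `max_distance` tokens on either side of it (assumed distance measure).
    `term` is assumed to be a single token.
    """
    meaning = VOCABULARY.get(term)
    if meaning is None:
        return []  # the vocabulary offers no meaning to annotate with
    tokens = tokenize(text)
    blockers = [tokenize(tx) for tx in INCONGRUOUS_PAIRS.get(term, ())]
    annotations = []
    for i, tok in enumerate(tokens):
        if tok != term:
            continue
        window = tokens[max(0, i - max_distance): i + max_distance + 1]
        if any(contains_phrase(window, tx) for tx in blockers):
            continue  # an incongruous t_x is nearby, so do not annotate
        annotations.append((i, meaning))
    return annotations


sentence = "She ate a hot dog while walking her dog in the park."
print(annotate(sentence, "dog", max_distance=2))
```

In this sketch the first "dog" is left unannotated because it sits inside "hot dog", while the second is annotated when the window is small. Widening the window far enough to reach "hot dog" suppresses the second annotation as well, which shows how the predetermined textual distance trades recall for caution.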