Constructing Comparable Corpora with Universal Similarity Measure
Abstract
The invention describes a system and method for creating a comparable corpus by: obtaining a set of source documents containing text; constructing language-independent semantic structures for at least one sentence of each text in the source documents; determining universal similarity measures for groups of the source documents by comparing the constructed language-independent semantic structures of their texts; identifying sets of similar documents based on the determined universal similarity measures; and creating the comparable corpus based on the identified sets of similar documents.
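The steps named in the abstract can be illustrated with a minimal Python sketch. It is not the patented method: the abstract does not say how the semantic structures are built or how the similarity measure is computed, so everything here is an assumption made for illustration. A hypothetical `concept_map` (word to language-neutral concept id) stands in for the semantic analysis, Jaccard overlap stands in for the universal similarity measure, and the function names and the `threshold` parameter are invented.

```python
from itertools import combinations


def semantic_structure(sentence, concept_map):
    """Stand-in for a language-independent semantic structure:
    the set of concept ids that the sentence's words map to (assumption)."""
    words = sentence.lower().split()
    return frozenset(concept_map[w] for w in words if w in concept_map)


def document_structures(text, concept_map):
    """Collect the concept ids of every sentence in one document."""
    concepts = set()
    for sentence in text.split("."):
        concepts |= semantic_structure(sentence, concept_map)
    return concepts


def similarity(a, b):
    """Jaccard overlap, used only as a placeholder similarity measure."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def build_comparable_corpus(docs, concept_map, threshold=0.5):
    """Group documents whose pairwise similarity reaches the threshold.

    docs maps a document name to its text. Returns a list of sets of
    names; documents with no sufficiently similar partner are left out.
    """
    structures = {
        name: document_structures(text, concept_map)
        for name, text in docs.items()
    }

    # Union-find over document names to merge similar pairs into sets.
    parent = {name: name for name in docs}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(docs, 2):
        if similarity(structures[a], structures[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in docs:
        groups.setdefault(find(name), set()).add(name)
    return [group for group in groups.values() if len(group) > 1]
```

Because the comparison runs on concept ids rather than surface words, an English and a German document about the same subject can land in the same set. A real system would need an actual semantic parser in place of the dictionary lookup.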