TEXT MINING FOR AUTOMATICALLY DETERMINING SEMANTIC RELATEDNESS
Abstract
Described herein is an approach for automatically determining the semantic relatedness of documents to semantic concepts. A first text mining analysis extracts a set of reference concepts from reference documents. A second text mining analysis extracts a set of test concepts from test documents that include a mixture of new concepts and reference concepts. An extended co-occurrence matrix is computed that indicates a frequency of co-occurrence (RCCF) of each new and each reference concept in the test documents with all other new and reference concepts. The extended co-occurrence matrix is used for computing a new concept relatedness score (NCRS) for the new concepts. A document similarity score (DSS) is computed for each of the test documents by aggregating, inter alia, the NCRS of each new concept with the RCCF of each reference concept. The DSS represents the semantic relatedness of the test document to the totality of the reference concepts.
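The pipeline described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the formulas for RCCF, NCRS, or DSS, so everything below is an assumption made for illustration only: co-occurrence is counted per document, the NCRS of a new concept is taken as the share of its co-occurrences that involve reference concepts, the RCCF of a reference concept is its total co-occurrence count scaled by the largest such count, and the DSS is the plain average of these per-concept values. The function names and the toy concepts are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(docs):
    """Count, for each ordered concept pair, the documents that contain both."""
    counts = Counter()
    for concepts in docs:
        for a, b in combinations(sorted(set(concepts)), 2):
            counts[(a, b)] += 1
            counts[(b, a)] += 1  # keep the matrix symmetric
    return counts


def row_sums(counts):
    """Total co-occurrence count of each concept with all other concepts."""
    sums = Counter()
    for (a, _), n in counts.items():
        sums[a] += n
    return sums


def new_concept_relatedness(concept, reference, counts):
    """Assumed NCRS: fraction of a new concept's co-occurrences with reference concepts."""
    total = sum(n for (a, _), n in counts.items() if a == concept)
    with_ref = sum(
        n for (a, b), n in counts.items() if a == concept and b in reference
    )
    return with_ref / total if total else 0.0


def document_similarity(doc, reference, counts):
    """Assumed DSS: mean of scaled RCCF (reference concepts) and NCRS (new concepts)."""
    sums = row_sums(counts)
    peak = max(sums.values(), default=0)
    scores = []
    for concept in set(doc):
        if concept in reference:
            scores.append(sums[concept] / peak if peak else 0.0)
        else:
            scores.append(new_concept_relatedness(concept, reference, counts))
    return sum(scores) / len(scores) if scores else 0.0


# Toy data: reference concepts come from the first analysis, test documents
# contain a mixture of reference concepts and new concepts.
reference = {"kinase", "tumor"}
test_docs = [
    ["kinase", "tumor", "inhibitor"],
    ["kinase", "inhibitor"],
    ["weather", "rainfall"],
]

counts = cooccurrence(test_docs)
scores = [document_similarity(doc, reference, counts) for doc in test_docs]
```

In this toy setup, a document whose new concepts co-occur only with reference concepts scores higher than one whose concepts never appear alongside any reference concept. The actual weighting and aggregation used in the patent may differ.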