Unsupervised document clustering using latent semantic density analysis
Abstract
According to one embodiment, a latent semantic mapping (LSM) space is generated from a collection of documents. The LSM space includes a plurality of document vectors, each representing one of the documents in the collection. Each document vector is considered in turn as a centroid document vector, and the document vectors in the LSM space that lie within a predetermined hypersphere diameter of that centroid are identified as a group. As a result, multiple groups of document vectors are formed. The predetermined hypersphere diameter represents a predetermined closeness measure among the document vectors in the LSM space. Thereafter, the group that contains the maximum number of document vectors among the plurality of groups is designated as a cluster of document vectors.
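
The grouping step described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. Several details are assumptions made only for illustration: the LSM space is built with a truncated SVD of a term-document matrix, document vectors are length-normalized, closeness is measured by Euclidean distance, and the "predetermined hypersphere diameter" is treated as a plain distance threshold from the centroid. The function names `lsm_document_vectors` and `densest_group` and the parameters `rank` and `threshold` are hypothetical.

```python
import numpy as np


def lsm_document_vectors(term_doc, rank):
    """Map documents into a latent space (assumed here: truncated SVD).

    term_doc has shape (n_terms, n_docs); each column is one document.
    Returns an array of shape (n_docs, rank) with unit-length rows.
    """
    _, s, vt = np.linalg.svd(term_doc, full_matrices=False)
    vecs = vt[:rank].T * s[:rank]  # one row per document
    norms = np.linalg.norm(vecs, axis=1, keepdims=True)
    # Avoid division by zero for all-zero documents.
    return vecs / np.where(norms == 0, 1.0, norms)


def densest_group(doc_vectors, threshold):
    """Treat every vector as a candidate centroid and return the fullest group.

    A group is every vector whose distance to the candidate centroid is at
    most `threshold`. Returns (centroid_index, member_indices). Ties are
    resolved in favor of the lowest centroid index.
    """
    diffs = doc_vectors[:, None, :] - doc_vectors[None, :, :]
    dists = np.linalg.norm(diffs, axis=2)  # pairwise Euclidean distances
    within = dists <= threshold            # within[i, j]: j is in group i
    counts = within.sum(axis=1)            # size of each candidate group
    best = int(np.argmax(counts))
    return best, np.flatnonzero(within[best])


if __name__ == "__main__":
    # Toy term-document matrix (4 terms, 4 documents), made up for the demo.
    term_doc = np.array([
        [2.0, 1.0, 0.0, 0.0],
        [1.0, 2.0, 0.0, 0.0],
        [0.0, 0.0, 3.0, 1.0],
        [0.0, 0.0, 1.0, 3.0],
    ])
    vectors = lsm_document_vectors(term_doc, rank=2)
    centroid, members = densest_group(vectors, threshold=0.5)
    print("centroid document:", centroid, "cluster members:", members)
```

The pairwise distance matrix uses memory quadratic in the number of documents, which is acceptable for a sketch but would likely need a spatial index or batching for a large collection.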