SYSTEM AND METHOD FOR HIERARCHICALLY ORGANIZING DOCUMENTS BASED ON DOCUMENT PORTIONS
Abstract
Embodiments as disclosed may generate an organizational hierarchy based on embeddings of portions of documents. The embeddings of the document portions can be clustered using a hierarchical clustering mechanism to segment the portion space into a set of hierarchical clusters. Documents can be assigned to these clusters based on the presence of a portion of a document within a cluster. In this manner, the documents themselves may be clustered based on the clusters created from portions across the documents of the corpus. The clusters to which a document is assigned may also be ranked with respect to that document. Similarly, documents assigned to a cluster can be ranked within that cluster. Additionally, in certain embodiments, names or snippets for the clusters of the hierarchy may be derived from the portions comprising each cluster.
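The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. Everything specific in it is an assumption made for illustration: the toy 2-D embeddings (a real system would obtain vectors from an embedding model), the centroid-linkage agglomerative clustering, the ranking of documents by how many of their portions fall in a cluster, and the hypothetical naming rule that picks the portion nearest the cluster centroid.

```python
from collections import Counter
from itertools import combinations
import math

# Toy data: (document id, portion text, embedding). The 2-D vectors are made up
# for illustration; a real system would obtain them from an embedding model.
PORTIONS = [
    ("doc_a", "indemnification clause", (0.0, 0.0)),
    ("doc_a", "payment terms", (5.0, 5.0)),
    ("doc_b", "liability cap", (0.2, 0.0)),
    ("doc_b", "indemnity carve-out", (0.0, 0.5)),
    ("doc_c", "invoice schedule", (5.0, 5.4)),
]

def centroid(members):
    """Mean embedding of the portions whose indices are in `members`."""
    vecs = [PORTIONS[i][2] for i in members]
    return tuple(sum(c) / len(vecs) for c in zip(*vecs))

def build_hierarchy():
    """Agglomerative clustering with centroid linkage (an assumed choice).

    Returns the root node of the hierarchy. Each node is a dict with
    'members' (portion indices) and 'children' (the two merged nodes).
    """
    nodes = [{"members": [i], "children": []} for i in range(len(PORTIONS))]
    while len(nodes) > 1:
        # Merge the two clusters whose centroids are closest.
        a, b = min(
            combinations(nodes, 2),
            key=lambda p: math.dist(centroid(p[0]["members"]),
                                    centroid(p[1]["members"])),
        )
        nodes = [n for n in nodes if n is not a and n is not b]
        nodes.append({"members": a["members"] + b["members"],
                      "children": [a, b]})
    return nodes[0]

def rank_documents(node):
    """Documents assigned to a cluster (any portion present), ranked by the
    number of their portions in it; ties are broken by document id."""
    counts = Counter(PORTIONS[i][0] for i in node["members"])
    return [doc for doc, _ in sorted(counts.items(),
                                     key=lambda kv: (-kv[1], kv[0]))]

def cluster_name(node):
    """Hypothetical naming rule: text of the portion nearest the centroid."""
    c = centroid(node["members"])
    best = min(node["members"], key=lambda i: math.dist(PORTIONS[i][2], c))
    return PORTIONS[best][1]

root = build_hierarchy()
for child in root["children"]:
    print(cluster_name(child), rank_documents(child))
```

In this sketch a document can appear in several clusters at once, because assignment depends only on where its portions land. Here doc_a has one portion in each of the two top-level clusters, which is the behavior the abstract describes when it says documents are clustered through their portions.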