Domain concept discovery and clustering using word embedding in dialogue design

Abstract

A system and method perform automated domain concept discovery and clustering using word embeddings. The method receives a set of documents for natural language processing for a domain; represents a plurality of entries in the set of documents as continuous vectors in a high-dimensional continuous space; applies a clustering algorithm based on a mutual information optimization criterion to form a set of clusters; associates each entry of the plurality of entries with each cluster in the set of clusters by formalizing an evidence-based model of each cluster given each entry; calculates a mutual information metric between each entry and each cluster using the evidence-based model; and identifies a nominal center of each cluster by maximizing the mutual information.
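The steps named in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration, not the patented method: the toy two-dimensional embeddings, the softmax over negative squared distances used as the "evidence-based model" p(cluster | entry), the soft k-means centroid update, the uniform prior over entries, and the hypothetical function names. The abstract does not give formulas, so the mutual information below is the standard I(W;C) = Σ p(w) p(c|w) log(p(c|w) / p(c)), and the "nominal center" of a cluster is taken to be the entry with the largest contribution to that sum.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def posteriors(vec, centroids, beta=4.0):
    """Assumed evidence model p(cluster | entry): softmax of -beta * squared distance."""
    scores = [-beta * sq_dist(vec, c) for c in centroids]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cluster(embeddings, init_centroids, iters=10, beta=4.0):
    """Soft k-means: alternate posterior estimation and weighted centroid updates."""
    words = list(embeddings)
    centroids = [list(c) for c in init_centroids]
    for _ in range(iters):
        post = {w: posteriors(embeddings[w], centroids, beta) for w in words}
        for k in range(len(centroids)):
            weight = sum(post[w][k] for w in words)
            if weight > 0:
                centroids[k] = [
                    sum(post[w][k] * embeddings[w][d] for w in words) / weight
                    for d in range(len(centroids[k]))
                ]
    post = {w: posteriors(embeddings[w], centroids, beta) for w in words}
    return centroids, post

def mutual_information(post):
    """I(W;C) with a uniform prior over entries, plus per-entry, per-cluster terms."""
    words = list(post)
    n = len(words)
    k = len(post[words[0]])
    p_c = [sum(post[w][j] for w in words) / n for j in range(k)]
    contrib = {
        w: [
            (post[w][j] / n) * math.log(post[w][j] / p_c[j]) if post[w][j] > 0 else 0.0
            for j in range(k)
        ]
        for w in words
    }
    return sum(sum(terms) for terms in contrib.values()), contrib

def nominal_centers(contrib):
    """For each cluster, pick the entry contributing most to the mutual information."""
    k = len(next(iter(contrib.values())))
    return [max(contrib, key=lambda w: contrib[w][j]) for j in range(k)]

# Toy embeddings (hypothetical values; real ones would come from a trained model).
embeddings = {
    "flight": (0.9, 0.1), "airline": (1.0, 0.0), "ticket": (0.8, 0.2),
    "hotel": (0.1, 0.9), "room": (0.0, 1.0), "booking": (0.2, 0.8),
}
centroids, post = cluster(embeddings, [embeddings["flight"], embeddings["hotel"]])
mi, contrib = mutual_information(post)
centers = nominal_centers(contrib)
print(f"I(W;C) = {mi:.3f} nats, nominal centers = {centers}")
```

With two well-separated groups, the mutual information approaches its upper bound of log 2 nats, and each nominal center is a member of its own group. A production system would replace the toy vectors with learned word embeddings and would likely optimize the mutual information criterion directly rather than through this soft k-means stand-in.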

Bibliographic data

Publication number: US11048870B2
Publication date: 2021-06-29
Original format: PDF
Applicant/assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Application number: US201715841703
Inventors: RAIMO BAKIS; DAVID NAHAMOO; LAZAROS C. POLYMENAKOS; CHENG WU; JOHN ZAKOS
Filing date: 2017-12-14
Classification: G06F16; G06F40/205; G06F16/93; G06F16/903; G06F16/35; G06F40/10; G06F40/30
Country: US
Date indexed: 2022-08-24 19:38:47
