DOMAIN CONCEPT DISCOVERY AND CLUSTERING USING WORD EMBEDDING IN DIALOGUE DESIGN

Abstract

A system and method perform automated domain concept discovery and clustering using word embeddings. The method receives a set of documents for natural language processing for a domain; represents a plurality of entries in the set of documents as continuous vectors in a high-dimensional continuous space; applies a clustering algorithm based on a mutual information optimization criterion to form a set of clusters; associates each entry of the plurality of entries with each cluster in the set of clusters by formalizing an evidence-based model of each cluster given each entry; calculates a mutual information metric between each entry and each cluster using the evidence-based model; and identifies a nominal center of each cluster by maximizing the mutual information.
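
The steps in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify the evidence-based model, the optimization procedure, or any parameter values, so everything concrete here is an assumption: the evidence model p(cluster | entry) is taken to be a softmax over cosine similarity, the clustering loop is a soft spherical k-means style update used as a stand-in for the mutual information criterion, and the per-entry metric is pointwise mutual information under a uniform distribution over entries. The function names, the `temperature` parameter, and the toy word vectors are hypothetical.

```python
import numpy as np

def unit_rows(V):
    """Scale each row to unit length so dot products are cosine similarities."""
    return V / np.linalg.norm(V, axis=1, keepdims=True)

def init_centers(U, k):
    """Deterministic farthest-point seeding (an assumption, not from the source)."""
    chosen = [0]
    while len(chosen) < k:
        closest = (U @ U[chosen].T).max(axis=1)  # similarity to the nearest seed
        chosen.append(int(closest.argmin()))     # pick the least similar entry
    return U[chosen]

def posterior(U, centers, temperature=0.1):
    """Assumed evidence model p(cluster | entry): softmax over cosine similarity."""
    logits = U @ unit_rows(centers).T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)

def discover_concepts(X, k, iters=20):
    """Return hard labels, one nominal-center index per cluster, and I(entry; cluster)."""
    U = unit_rows(np.asarray(X, dtype=float))
    centers = init_centers(U, k)
    for _ in range(iters):
        # Stand-in update: soft re-estimation of centers, not the patent's criterion.
        centers = posterior(U, centers).T @ U
    p = posterior(U, centers)
    p_c = p.mean(axis=0)                           # cluster prior, uniform p(entry)
    pmi = np.log(p + 1e-12) - np.log(p_c + 1e-12)  # pointwise mutual information
    mi = float((p * pmi).sum(axis=1).mean())       # mutual information in nats
    labels = p.argmax(axis=1)
    nominal = pmi.argmax(axis=0)                   # per cluster: entry with highest PMI
    return labels, nominal, mi

# Toy 3-dimensional "embeddings" (hypothetical values, for illustration only).
words = ["flight", "plane", "airline", "hotel", "room", "suite"]
X = [[1.0, 0.1, 0.0], [0.9, 0.2, 0.1], [0.95, 0.0, 0.1],
     [0.1, 1.0, 0.0], [0.0, 0.9, 0.2], [0.2, 0.95, 0.1]]

labels, nominal, mi = discover_concepts(X, k=2)
for c, idx in enumerate(nominal):
    members = [w for w, l in zip(words, labels) if l == c]
    print(f"cluster {c}: nominal center = {words[idx]}, members = {members}")
print(f"mutual information (nats): {mi:.3f}")
```

In this sketch the nominal center of a cluster is simply the entry whose pointwise mutual information with that cluster is highest, which is one plausible reading of "maximizing the mutual information"; the full patent text may define it differently.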