
Chat extraction system, method, and program for extracting chat part from conversation


Abstract

A system and method for extracting off-topic parts from a conversation. The system includes: a first corpus containing documents from a plurality of fields; a second corpus containing only documents from the field to which the conversation belongs; a determination means that designates as a lower-limit subject word any word whose IDF value in the first corpus and IDF value in the second corpus are each below a first predetermined threshold; a score calculation part that calculates, as a score, a TF-IDF value for each word in the second corpus; a clipping part that sequentially cuts out intervals from the text data making up the conversation; and an extraction part that extracts as an off-topic part any interval in which the average score of the words it contains exceeds a second predetermined threshold.
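
The pipeline in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation, and it rests on several assumptions the abstract leaves open: TF is taken as a word's relative frequency over the whole second (domain) corpus, IDF is a smoothed IDF over that same corpus, lower-limit subject words are given a constant `floor_score`, words absent from the second corpus are ignored when averaging, and intervals are fixed-size sliding windows. All function and parameter names (`idf_table`, `word_scores`, `extract_off_topic`, `idf_threshold`, `score_threshold`, `window`) are hypothetical, and the corpora are toy data.

```python
import math
from collections import Counter


def idf_table(docs):
    """Smoothed IDF for every word in a corpus given as a list of token lists."""
    n_docs = len(docs)
    doc_freq = Counter(word for doc in docs for word in set(doc))
    return {w: math.log((1 + n_docs) / (1 + df)) for w, df in doc_freq.items()}


def word_scores(general_docs, domain_docs, idf_threshold, floor_score=0.0):
    """Score each word of the domain (second) corpus.

    Assumptions, not stated in the abstract: TF is the relative frequency
    over the whole domain corpus, IDF is computed on the domain corpus, and
    lower-limit subject words receive the constant floor_score.
    """
    idf_general = idf_table(general_docs)
    idf_domain = idf_table(domain_docs)
    counts = Counter(word for doc in domain_docs for word in doc)
    total = sum(counts.values())

    scores = {}
    for word, count in counts.items():
        # Lower-limit subject word: IDF below the first threshold in BOTH corpora.
        is_lower_limit = (
            word in idf_general
            and idf_general[word] < idf_threshold
            and idf_domain[word] < idf_threshold
        )
        if is_lower_limit:
            scores[word] = floor_score
        else:
            scores[word] = (count / total) * idf_domain[word]
    return scores


def extract_off_topic(tokens, scores, window, score_threshold):
    """Return (start, end) token spans whose mean word score exceeds the threshold."""
    spans = []
    for start in range(0, max(len(tokens) - window, 0) + 1):
        interval = tokens[start:start + window]
        # Words never seen in the domain corpus have no score and are skipped.
        known = [scores[w] for w in interval if w in scores]
        if known and sum(known) / len(known) > score_threshold:
            spans.append((start, start + len(interval)))
    return spans


# Toy data: a general corpus (many fields) and a domain corpus (billing calls).
general_docs = [
    "the weather is nice".split(),
    "the movie is long".split(),
    "the game is on".split(),
]
domain_docs = [
    "please pay the invoice today".split(),
    "the invoice is due please pay".split(),
    "nice weather today please pay the invoice".split(),
]

scores = word_scores(general_docs, domain_docs, idf_threshold=0.1)
conversation = "please pay the invoice nice weather".split()
spans = extract_off_topic(conversation, scores, window=2, score_threshold=0.03)
print(spans)  # [(4, 6)], i.e. the "nice weather" interval
```

In this toy setup "the" is the only lower-limit subject word, since its IDF is below the first threshold in both corpora. Words that occur in every domain document, such as "invoice", score zero, so only the interval made up of rarely seen words clears the second threshold. Both thresholds and the window size are arbitrary here and would need tuning on real data.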
