
An Intention-Topic Model Based on Verbs Clustering and Short Texts Topic Mining



Abstract

With the development of Web 2.0, microblogs, Twitter, status messages, classified-information websites, and similar services are experiencing explosive growth, and people increasingly use short texts to express their intentions and activities. When people submit a request through a short text, however, they hope for feedback that helps solve their problem rather than merely relevant content; that is, they sometimes need a corresponding intention rather than similar content. Current research does not solve this problem well. In this paper, we propose an intention-topic model, the Verb-Biterm Topic Model (V-BTM), which aims at matching corresponding intentions. Intentions are expressed by verbs, and topics are modeled by BTM. An intention is the action a person wants to express, and the topic is the goal of that intention. The key assumption of the model is that people tend to express intentions with verbs and topics with non-verbs. In this model, we first distinguish intentions by clustering verbs with the help of word2vec, a deep learning tool. Second, we mine topics by applying the Biterm Topic Model (BTM) to the data with verbs removed. We carry out experiments on real-world short-text collections. The results demonstrate that our approach achieves better verb clustering and mines more coherent topics. Furthermore, the new model can serve as the basis of our future research.
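The two preprocessing steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name a clustering algorithm, so the greedy cosine-similarity clustering below is a stand-in; the function names, the Penn Treebank style tags, the 0.8 threshold, and the two-dimensional toy vectors (used in place of real word2vec embeddings) are all assumptions made for illustration. BTM inference itself is not shown; the sketch only produces the biterms (unordered word pairs from one short text) that BTM takes as input.

```python
from itertools import combinations
from math import sqrt


def split_verbs(tagged_tokens):
    """Split (token, pos) pairs into verbs and non-verbs.

    Assumes Penn Treebank style tags, where verb tags start with "VB".
    """
    verbs = [tok for tok, pos in tagged_tokens if pos.startswith("VB")]
    others = [tok for tok, pos in tagged_tokens if not pos.startswith("VB")]
    return verbs, others


def biterms(tokens):
    """Return every unordered word pair co-occurring in one short text."""
    return [tuple(sorted(pair)) for pair in combinations(tokens, 2)]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cluster_verbs(vectors, threshold=0.8):
    """Greedy clustering (an assumed stand-in, not the paper's method).

    Each verb joins the first cluster whose seed vector is at least
    `threshold` similar; otherwise it starts a new cluster.
    """
    clusters = []  # list of (seed_vector, member_verbs)
    for verb, vec in vectors.items():
        for seed, members in clusters:
            if cosine(seed, vec) >= threshold:
                members.append(verb)
                break
        else:
            clusters.append((vec, [verb]))
    return [members for _, members in clusters]


# Hand-tagged toy short text (hypothetical data).
text = [("buy", "VB"), ("cheap", "JJ"), ("bike", "NN"), ("downtown", "NN")]
verbs, others = split_verbs(text)
topic_input = biterms(others)  # biterms built from non-verbs only

# Toy 2-d vectors standing in for word2vec embeddings (hypothetical values).
toy_vectors = {"buy": [1.0, 0.1], "purchase": [0.9, 0.2], "sell": [0.1, 1.0]}
intent_clusters = cluster_verbs(toy_vectors)

print(verbs, topic_input, intent_clusters)
```

In a fuller pipeline, the tags would come from a part-of-speech tagger and the vectors from a trained word2vec model; the biterms collected over the whole corpus would then be passed to a BTM sampler.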
