Journal: Japanese journal of fuzzy theory and systems

A system of automatic keywords extraction and related data collection for Japanese text

Abstract

Rapid progress in information services has been increasing the amount of text information that users receive via the Internet, teletext broadcasting, and so on. In such cases, a system that automatically extracts the important parts helps users find the information they are interested in easily. Some methods in the field of natural language processing can extract important keywords or phrases from text information by using word dictionaries. However, these methods can hardly deal with unexpected words such as new proper nouns.

On the other hand, methods based on word occurrence are useful for dealing with unexpected words, and they are highly applicable to many kinds of text information because they use no dictionary. However, the results are insufficient as a summary in some cases.

In this paper, we describe our newly developed KEIFIS (Keyword Extracting and Information Filtering System). KEIFIS extracts important keywords that represent major topics from a large amount of Japanese text information and collects related data without a dictionary.

KEIFIS has the following features:
(1) It automatically extracts important keywords and combines some of them to present major topics to the users.
(2) It retrieves and collects information according to the topic specified by the users, and informs them of its arrival in real time.

KEIFIS employs fuzzy information processing in computing the similarity between words. The similarity is also used to define a relationship between the specified topic and newly provided information.

We applied KEIFIS to news programs on teletext broadcasting in Japanese and confirmed that KEIFIS was capable of appropriately extracting the important keywords that represent major topics.
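The abstract does not give the extraction or similarity formulas, so the following is only an illustrative sketch of the general idea of occurrence-based, dictionary-free keyword extraction with a fuzzy word similarity. Everything in it is an assumption rather than the KEIFIS method: word candidates are taken to be runs of kanji or katakana, a word's membership degree in a document is its count divided by the largest count in that document, and similarity is a fuzzy Jaccard ratio (sum of minima over sum of maxima). The function names and sample sentences are hypothetical.

```python
import re
from collections import Counter

# Assumption: runs of two or more kanji, or two or more katakana,
# act as word candidates, so no dictionary is needed.
CANDIDATE = re.compile(r"[\u4e00-\u9fff]{2,}|[\u30a1-\u30fa\u30fc]{2,}")


def candidates(text):
    """Return all candidate words found in the text, in order."""
    return CANDIDATE.findall(text)


def top_keywords(docs, n=3):
    """Rank candidates by total occurrence count across all documents."""
    total = Counter()
    for doc in docs:
        total.update(candidates(doc))
    return [word for word, _ in total.most_common(n)]


def memberships(docs):
    """Map each word to its membership degree (0..1) in every document.

    Assumption: membership = count / largest count in that document.
    """
    counts = [Counter(candidates(doc)) for doc in docs]
    vocab = set().union(*counts)
    return {
        w: [c[w] / max(c.values()) if c else 0.0 for c in counts]
        for w in vocab
    }


def fuzzy_similarity(mu_a, mu_b):
    """Fuzzy Jaccard similarity: sum of minima over sum of maxima."""
    inter = sum(min(a, b) for a, b in zip(mu_a, mu_b))
    union = sum(max(a, b) for a, b in zip(mu_a, mu_b))
    return inter / union if union else 0.0


# Hypothetical sample news snippets.
docs = [
    "東京で地震が発生。東京の地震は震度4。",
    "大阪でサッカーの試合。サッカーは人気。",
    "東京で地震の被害を確認。",
]

mu = memberships(docs)
print(top_keywords(docs, 2))
print(fuzzy_similarity(mu["東京"], mu["地震"]))
print(fuzzy_similarity(mu["東京"], mu["サッカー"]))
```

In this sketch, words that tend to appear in the same documents get a similarity near 1 and could be combined into one topic, while the same measure could score a newly arrived document against a user-specified topic.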