
Topic Feature Extraction of Chinese News Title

Abstract

News titles are short by design: they must catch a reader's attention at a glance while summarizing the key information of a web news item. Extracting topic features from titles can therefore help a news processing system improve efficiency and accuracy before it processes the full news text. However, after segmentation and tagging, some words are wrongly split into disconnected characters, and phrases are likewise broken into separate words. This paper proposes a model that extracts topic features from Chinese web news titles at phrase granularity. Titles are first segmented into tagged keywords, and frequent patterns are then used to combine words into phrases, which serve as the topic features. We conduct experiments on a corpus of Chinese news titles published between March 2011 and June 2011. The results show that the approach yields reasonable topic feature phrases.
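
The abstract does not specify the frequent-pattern algorithm, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes titles have already been segmented into word lists, treats a "frequent pattern" as a contiguous word n-gram that appears in at least `min_support` titles, and merges such n-grams back into phrases. The function names, the `min_support` and `max_len` parameters, and the three sample titles are all hypothetical.

```python
from collections import Counter


def frequent_phrases(titles, min_support=2, max_len=3):
    """Return contiguous word n-grams (2 <= n <= max_len) that occur
    in at least min_support titles. (Hypothetical helper.)"""
    support = Counter()
    for tokens in titles:
        seen = set()
        for n in range(2, max_len + 1):
            for i in range(len(tokens) - n + 1):
                seen.add(tuple(tokens[i:i + n]))
        support.update(seen)  # each n-gram counts once per title
    return {gram for gram, count in support.items() if count >= min_support}


def merge_title(tokens, phrases, max_len=3):
    """Greedily replace the longest frequent n-gram at each position
    with a single joined phrase. (Hypothetical helper.)"""
    merged, i = [], 0
    while i < len(tokens):
        for n in range(min(max_len, len(tokens) - i), 1, -1):
            if tuple(tokens[i:i + n]) in phrases:
                merged.append("".join(tokens[i:i + n]))
                i += n
                break
        else:
            # No frequent n-gram starts here; keep the single word.
            merged.append(tokens[i])
            i += 1
    return merged


# Made-up, pre-segmented titles used purely for illustration.
titles = [
    ["日本", "地震", "引发", "海啸"],
    ["日本", "地震", "救援", "进展"],
    ["利比亚", "局势", "升级"],
]
phrases = frequent_phrases(titles)
features = [merge_title(t, phrases) for t in titles]
```

With these sample titles, only the pair `日本` + `地震` reaches the support threshold, so it is merged into the single phrase `日本地震` while the other words stay separate. A real system would likely also filter candidates by their tags and use a larger support threshold.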