Conference: 4th International Moratuwa Engineering Research Conference

Clustering Sinhala News Articles Using Corpus-Based Similarity Measures


Abstract

News aggregators help readers handle large numbers of news items conveniently by collecting them in a single place with meaningful groupings. Such aggregators and clustering tools are available for English and some other popular languages. However, no such tools are available for the Sinhala language. To address this gap, this paper presents a system that collects news articles published across the web and groups related articles using corpus-based similarity measures. Despite the simplicity of the technique and the morphological richness of Sinhala, we achieved very promising results that demonstrate the viability of the presented technique.
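The abstract does not name the specific similarity measure or clustering procedure, so the following is only a minimal sketch of what grouping articles by a corpus-based similarity could look like. It assumes TF-IDF weighting with cosine similarity, a greedy single-pass grouping, whitespace-style tokens with no stemming, and an arbitrary threshold of 0.2. The function names (`tfidf_vectors`, `cosine`, `cluster`) and the English toy tokens are hypothetical and stand in for tokenized Sinhala articles; none of this is taken from the paper.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Turn token lists into sparse {term: weight} TF-IDF vectors.

    IDF is estimated from the given collection itself (the "corpus").
    """
    n = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        # Smoothed IDF; a term present in every document gets weight 0.
        vectors.append(
            {t: c * math.log((1 + n) / (1 + df[t])) for t, c in tf.items()}
        )
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors (0.0 if either is empty)."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster(docs, threshold=0.2):
    """Greedy single-pass grouping (an assumed procedure, not the paper's).

    Each article joins the first cluster whose seed (first member) is at
    least `threshold` similar to it; otherwise it starts a new cluster.
    Returns lists of document indices.
    """
    vectors = tfidf_vectors(docs)
    clusters = []
    for i, vec in enumerate(vectors):
        for members in clusters:
            if cosine(vec, vectors[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters


# Toy data: English placeholder tokens standing in for tokenized articles.
articles = [
    ["cricket", "match", "colombo", "win"],
    ["cricket", "team", "win", "series"],
    ["budget", "tax", "parliament"],
    ["parliament", "budget", "vote"],
]
print(cluster(articles))  # [[0, 1], [2, 3]]
```

Because this sketch matches surface tokens exactly, inflected forms of the same Sinhala word would count as different terms, which is one way the morphological richness mentioned in the abstract can lower similarity scores.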
