Clustering Sinhala News Articles Using Corpus-Based Similarity Measures

Abstract

News aggregators help readers handle large numbers of news items conveniently by collecting them in a single place and arranging them into meaningful groups. Such news aggregators and clustering tools are available for English and some other widely used languages. However, no such tools are available for the Sinhala language. To address this gap, this paper presents a system that collects news articles published across the web and groups related articles using corpus-based similarity measures. Despite the simplicity of the technique and the morphological richness of Sinhala, we achieved very promising results that demonstrate the viability of the presented technique.

Bibliographic details

Source
4th International Moratuwa Engineering Research Conference | 2018 | pp. 437-442 | 6 pages
Conference location: Moratuwa (LK)
Authors
Purnima Nanayakkara; Surangika Ranathunga
Author affiliations

Department of Computer Science and Engineering, University of Moratuwa, Katubedda, 10400;

Department of Computer Science and Engineering, University of Moratuwa, Katubedda, 10400;

Original format: PDF
Language: English
Keywords
Clustering algorithms; Weight measurement; Partitioning algorithms; Standards; Filtering; Heuristic algorithms; Semantics

Similar literature

1. Clustering news articles using efficient similarity measure and N-grams [J]. Desmond Bala Bisandu, Rajesh Prasad, Musa Muhammad Liman. International Journal of Knowledge Engineering and Data Mining. 2018, No. 4
2. Utilizing phrase-similarity measures for detecting and clustering informative RSS news articles [J]. Maria Soledad Pera, Yiu-Kai Ng. Integrated Computer-Aided Engineering. 2008, No. 4
3. Using maximal spanning trees and word similarity to generate hierarchical clusters of non-redundant RSS news articles [J]. Maria Soledad Pera, Yiu-Kai Dennis Ng. Journal of Intelligent Information Systems. 2012, No. 2
4. Clustering Sinhala News Articles Using Corpus-Based Similarity Measures [C]. Purnima Nanayakkara, Surangika Ranathunga. International Moratuwa Engineering Research Conference. 2018
5. A comparison of clustering procedures and similarity measures in creating clusters using warp functions [D]. Elguindi, Anne Charlotte. 2010
6. Stance markers in English medical research articles and newspaper opinion columns: A comparative corpus-based study [O]. Qian Shen, Yating Tao. 2021
7. Utilizing Phrase-Similarity Measures for Detecting and Clustering Informative RSS News Articles [O]. Maria Soledad Pera, Yiu-kai Ng. 2010
8. A New Measure of Biotic Similarity Between Samples and Its Applications with a Cluster Analysis Program [R]. Carlos F. A. Pinkham. 1974
