Never Abandon Minorities: Exhaustive Extraction of Bursty Phrases on Microblogs Using Set Cover Problem



Abstract

We propose a language-independent data-driven method to exhaustively extract bursty phrases of arbitrary forms (e.g., phrases other than simple noun phrases) from microblogs. The burst (i.e., the rapid increase of the occurrence) of a phrase causes the burst of overlapping N-grams including incomplete ones. In other words, bursty incomplete N-grams inevitably overlap bursty phrases. Thus, the proposed method performs the extraction of bursty phrases as the set cover problem in which all bursty N-grams are covered by a minimum set of bursty phrases. Experimental results using Japanese Twitter data showed that the proposed method outperformed word-based, noun phrase-based, and segmentation-based methods both in terms of accuracy and coverage.
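The set cover formulation described in the abstract can be illustrated with a small sketch. The abstract does not say how the cover is computed, so the code below swaps in the standard greedy approximation for set cover; it is not the authors' implementation. The names `ngrams_of` and `greedy_cover`, the toy data, and the rule that a phrase covers every bursty N-gram it contains as a contiguous token span are all assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple

Tokens = Tuple[str, ...]


def ngrams_of(phrase: Tokens) -> Set[Tokens]:
    """Return every contiguous N-gram (N >= 1) contained in a phrase."""
    n = len(phrase)
    return {phrase[i:j] for i in range(n) for j in range(i + 1, n + 1)}


def greedy_cover(bursty_ngrams: Set[Tokens], candidates: List[Tokens]) -> List[Tokens]:
    """Greedy set cover (an approximation, assumed here for illustration).

    Repeatedly pick the candidate phrase that covers the most bursty
    N-grams that are still uncovered, until nothing more can be covered.
    """
    # Assumption: a phrase covers the bursty N-grams it contains.
    cover_sets: Dict[Tokens, Set[Tokens]] = {
        c: ngrams_of(c) & bursty_ngrams for c in candidates
    }
    uncovered = set(bursty_ngrams)
    chosen: List[Tokens] = []
    while uncovered:
        best = max(candidates, key=lambda c: len(cover_sets[c] & uncovered))
        gain = cover_sets[best] & uncovered
        if not gain:
            break  # remaining N-grams are not covered by any candidate
        chosen.append(best)
        uncovered -= gain
    return chosen


# Hypothetical toy data: bursty N-grams, including incomplete fragments.
bursty = {
    ("new",), ("york",), ("new", "york"),
    ("happy",), ("birthday",), ("happy", "birthday"),
}
# Hypothetical candidate phrases, some of them incomplete.
candidates = [("new", "york"), ("happy", "birthday"), ("new",), ("birthday",)]

phrases = greedy_cover(bursty, candidates)
print(phrases)
```

In this toy case the two complete phrases cover all six bursty N-grams, so the fragments `("new",)` and `("birthday",)` are never selected, which mirrors the abstract's observation that bursty incomplete N-grams overlap bursty phrases.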


Bibliographic details

Source
Conference on Empirical Methods in Natural Language Processing | 2017 | pp. 2348-2357 | 10 pages
Authors
Masumi Shirakawa; Takahiro Hara; Takuya Maekawa
Original format: PDF

Similar literature

1. Semi-automated set-up for exhaustive micro-electromembrane extractions of basic drugs from biological fluids [J]. Dvorak Milos, Seip Knut Fredrik, Pedersen-Bjergaard Stig. Analytica chimica acta. 2018
2. Bursty event detection from microblog: a distributed and incremental approach [J]. Li Jianxin, Wen Jianfeng, Tai Zhenying. Concurrency and computation: practice and experience. 2016, no. 11
3. Topic Summarization of Microblog Document in Bahasa Indonesia using the Phrase Reinforcement Algorithm [J]. Meganingrum Arista Jiwanggi, Mirna Adriani. Procedia Computer Science. 2016, no. 1
4. Never Abandon Minorities: Exhaustive Extraction of Bursty Phrases on Microblogs Using Set Cover Problem [C]. Masumi Shirakawa, Takahiro Hara, Takuya Maekawa. Conference on empirical methods in natural language processing. 2017
5. Patterns of land use and land cover change and its consequences for wildlife: Agricultural abandonment and brown bears (Ursus arctos) in eastern Europe [D]. Alcantara Concepcion, Pedro Camilo. 2010
6. PubMed Phrases, an open set of coherent phrases for searching biomedical literature [O]. Sun Kim, Lana Yeganova, Donald C. Comeau. 2018
7. PHCM: A Particle Horizontal Cast Movement Based Model for Bursty Events Detection of Chinese Microblog [O]. Le Zhang, Xueqiang Lv, Leihan Zhang. 2015
8. PKUICST at TREC 2014 Microblog Track: Feature Extraction for Effective Microblog Search and Adaptive Clustering Algorithms for TTG [R]. Lv, C., Fan, F., Qiang, R. 2014
