Parallel Community Detection for Cross-Document Coreference

Abstract

This paper presents a highly parallel solution for cross-document coreference resolution that can handle the billions of documents on today's web. At the core of our solution lies a novel algorithm for community detection in large-scale graphs. We construct these graphs by representing documents' keywords as nodes and the colocation of those keywords in a document as edges. We then exploit a particular property of such graphs: coreferent words are topologically clustered, so our community detection algorithm can discover them efficiently. The accuracy of our technique is considerably higher than that of the state of the art, while its convergence time is far shorter. In particular, we increase the accuracy on a baseline dataset by more than 15% compared to the best result reported so far. Moreover, we outperform the best reported result on a dataset provided for the Word Sense Induction task in SemEval 2010.
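The graph construction described in the abstract (keywords as nodes, colocation within a document as edges) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `build_keyword_graph`, the use of integer edge weights to count shared documents, and the toy documents are all assumptions made for illustration, and the paper's community detection algorithm itself is not reproduced here.

```python
from collections import defaultdict
from itertools import combinations


def build_keyword_graph(documents):
    """Return {(u, v): weight} for keyword pairs that share a document.

    Each document is a list of keywords. Weighting an edge by the number
    of shared documents is an assumption of this sketch, not a detail
    taken from the abstract.
    """
    edge_weights = defaultdict(int)
    for keywords in documents:
        # Deduplicate and sort so every pair has one canonical (u, v) order.
        for u, v in combinations(sorted(set(keywords)), 2):
            edge_weights[(u, v)] += 1
    return dict(edge_weights)


# Hypothetical toy corpus: three documents that mention the same name.
docs = [
    ["john smith", "senator", "ohio"],
    ["john smith", "senator", "budget"],
    ["john smith", "guitarist", "album"],
]
graph = build_keyword_graph(docs)

for (u, v), weight in sorted(graph.items()):
    print(f"{u} -- {v} (weight {weight})")
```

In this toy corpus the political keywords and the music keywords share no edges with each other, which is the kind of topological clustering a community detection step could pick up to separate the two people named "john smith".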

Bibliographic details

Source: IEEE/WIC/ACM International Joint Conferences on Web Intelligence and Intelligent Agent Technologies, 2014, 8 pages
Authors: Rahimian F.; Girdzijauskas S.; Haridi S.
Original format: PDF
Classification (Chinese Library Classification): Computer networks
Keywords: document handling; graph theory; natural language processing; SemEval 2010; cross-document coreference resolution; large scale graph; parallel community detection; word sense induction task; Accuracy; Clustering algorithms; Color; Communities; Context; Force; Measurement; community detection; coreference resolution; cross-document coreference; distributed algorithm
