Keyphrase generation for Vietnamese administrative documents: a collaborative approach

Abstract

The keyphrases of a document can be considered its condensed summary. Unsupervised models extract keyphrases using only the information contained in that document, without interacting with other documents. A well-performing supervised model for keyphrase generation, by contrast, requires massive effort to build training data and does not generalize to new domains. Moreover, in terms of human perception, a user comprehends the topic of a document better after reading other documents on the same topic. Based on this idea, we propose a collaborative keyphrase generation system (CollabKG): a novel semi-supervised method that leverages limited labeled data. The amount of labeled data is enriched over time by the user. We conduct our research on a large-scale dataset of 500,000 Vietnamese administrative documents. In CollabKG, each document is represented as a feature vector, and a cluster pruning algorithm is employed to speed up the search for the most similar documents. The generated keyphrases were manually evaluated for relevance and accuracy. The results show a high approval (ratification) rate. We therefore conclude that CollabKG performs well and is suitable for a real-time system.
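
The abstract names cluster pruning but gives no detail about the features, the similarity measure, or how leaders are chosen. The following is a minimal sketch of the textbook form of the technique, not the authors' implementation. It assumes cosine similarity, roughly sqrt(N) randomly chosen leaders, and toy three-dimensional vectors. The function names `cosine`, `build_index`, and `most_similar` are hypothetical.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def build_index(vectors, seed=0):
    """Pick about sqrt(N) random leaders and attach each document to its closest leader."""
    rng = random.Random(seed)
    n_leaders = max(1, int(math.sqrt(len(vectors))))
    leaders = rng.sample(range(len(vectors)), n_leaders)
    followers = {leader: [] for leader in leaders}
    for doc_id, vec in enumerate(vectors):
        best = max(leaders, key=lambda l: cosine(vec, vectors[l]))
        followers[best].append(doc_id)
    return followers


def most_similar(query, vectors, followers, k=3):
    """Compare the query with the leaders only, then rank the best leader's followers."""
    best = max(followers, key=lambda l: cosine(query, vectors[l]))
    ranked = sorted(followers[best],
                    key=lambda d: cosine(query, vectors[d]),
                    reverse=True)
    return ranked[:k]


# Toy data: four illustrative document vectors, not taken from the paper.
docs = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.9, 0.1],
]
index = build_index(docs)
print(most_similar([1.0, 0.05, 0.0], docs, index, k=2))
```

The speed-up comes from comparing a query against the leaders and one cluster instead of all N documents. The trade-off is that the search is approximate: a true nearest neighbor attached to a different leader is missed.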

Bibliographic record

Source
International Conference on Knowledge and Systems Engineering | 2020 | pp. 43-48 | 6 pages
Authors
Thi-Thu-Trang Nguyen; Thi-Hai-Yen Vuong; Van-Lien Tran; Le-Minh Nguyen; Xuan-Hieu Phan
Original format: PDF
Keywords
Quality of service

Similar literature

1. Distributed collaborative Web document clustering using cluster keyphrase summaries [J]. Khaled Hammouda, Mohamed Kamel. Information Fusion. 2008, No. 4
2. An Efficient Approach for Keyphrase Extraction from English Document [J]. Imtiaz Hossain Emu, Asraf Uddin Ahmed, Manowarul Islam. International Journal of Intelligent Systems and Applications. 2017, No. 12
3. A Keyphrase-Based Approach to Text Summarization for English and Bengali Documents [J]. Kamal Sarkar. International Journal of Technology Diffusion. 2014, No. 2
4. CollabRank: Towards a Collaborative Approach to Single-Document Keyphrase Extraction [C]. Xiaojun Wan, Jianguo Xiao. 22nd International Conference on Computational Linguistics. 2008
5. A Cognition-Driven Approach To Modeling Document Generation and Learning Underlying Contexts From Documents [D]. Falahi, Misagh. 2017
6. PITChing (professional organisations innovative trial designs and collaborative approach) for evidence generation for proton therapy [O]. Srinivas Chilukuri, Pankaj Kumar Panda, Rakesh Jalali. 2020
7. PositionRank: An Unsupervised Approach to Keyphrase Extraction from Scholarly Documents [O]. Corina Florescu, Cornelia Caragea. 2017
8. Web-Based Collaborative Learning: An Assessment of a Question-Generation Approach [R]. Belanich, J., Wisher, R. A., Orvis, K. L. 2003
