Conference: International Conference on Artificial Intelligence and Data Processing

Generation of Original Text with Text Mining and Deep Learning Methods for Turkish and Other Languages


Abstract

The amount of content on the web has grown dramatically since the Internet began letting users produce content themselves. Early work on original text generation aimed to publish given data by fitting it into a fixed template; the most obvious example is the analysis report for a sporting event. Generating an original text that compiles general information about a subject has since attracted researchers' interest as well. Although neural networks and Markov models have been used for original text generation before, the generation process and a comparison of success rates had not been carried out for the Turkish language or on a dataset drawn from a repository of academic publications. This study attempts to create summary information and original content about a specific subject using Turkish Wikipedia (Wikipedia TR) and a data pool built from hundreds of thousands of academic publications. Texts were generated with two previously proposed methods, a Markov model and an LSTM, and the results are compared in detail. The evaluation examined the performance of the proposed method and assessed the techniques in terms of syntactic accuracy and semantic preservation. To test the method, a mixture of original and machine-generated texts was presented to real users. The success rate is calculated with accuracy, recall, and F-measure. The results are promising: the method was observed to produce accurate, good-quality representations.
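The abstract names the techniques but gives no implementation details, so the following is only a minimal sketch of two of the ideas it mentions: word-level Markov chain text generation, and scoring human judgments of "original vs. machine-generated" texts. The function names, the chain order of 2, the label strings, and the toy corpus are all assumptions made for illustration; they are not taken from the paper. The LSTM model is not sketched here.

```python
import random
from collections import defaultdict


def build_chain(tokens, order=2):
    """Map each `order`-word state to the words observed right after it."""
    chain = defaultdict(list)
    for i in range(len(tokens) - order):
        state = tuple(tokens[i:i + order])
        chain[state].append(tokens[i + order])
    return dict(chain)


def generate(chain, length, seed=None):
    """Walk the chain from a random start state, emitting up to `length` words."""
    rng = random.Random(seed)
    state = rng.choice(sorted(chain))  # sorted for reproducibility with a seed
    output = list(state)
    while len(output) < length:
        followers = chain.get(state)
        if not followers:  # dead end: this state was never followed by a word
            break
        output.append(rng.choice(followers))
        state = tuple(output[-len(state):])
    return output[:length]


def evaluate(labels, predictions, positive="machine"):
    """Return (accuracy, precision, recall, F-measure) for the positive class.

    Assumption: `labels` are the true sources of the texts and `predictions`
    are what the human judges guessed. The paper may define these differently.
    """
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == positive and p == positive)
    fp = sum(1 for y, p in pairs if y != positive and p == positive)
    fn = sum(1 for y, p in pairs if y == positive and p != positive)
    correct = sum(1 for y, p in pairs if y == p)
    accuracy = correct / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return accuracy, precision, recall, f_measure


if __name__ == "__main__":
    # Toy corpus, purely illustrative; a real run would use a large corpus.
    corpus = "the model reads the text and the model writes the text again".split()
    chain = build_chain(corpus, order=2)
    print(" ".join(generate(chain, length=12, seed=0)))
    print(evaluate(["machine", "human", "machine"], ["machine", "machine", "human"]))
```

A higher chain order makes the output more fluent but closer to verbatim copies of the training text, which matters when the goal is original content.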
