Source: Procedia Computer Science

A Generative Adversarial Network for Data Augmentation: The Case of Arabic Regional Dialects

Abstract

Text generation with Generative Adversarial Networks (GANs) has been successful in domains such as sentiment analysis, using the Sentimental GAN (SentiGAN) model. We adopt a similar model to generate sentences for five regional Arabic dialects (Egypt, Gulf, Maghreb, Levant, and Iraq). The objective is to overcome the scarcity of richly annotated Dialectal Arabic (DA) datasets by generating such corpora automatically. For a given dialect, the DA generation process relies on a generator to create new text and a discriminator to evaluate that text, with a dynamic update that allows the process to run automatically without supervision. Novelty and diversity are the two metrics used to verify the consistency and quality of the generated DA text before it is added to the target datasets. Experimental results confirm the reliability and value of the generated datasets when tested with different classifiers.
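
The abstract names novelty and diversity as its quality metrics but does not define them. The sketch below assumes the Jaccard-based formulation commonly used with SentiGAN-style models: novelty is one minus a generated sentence's highest similarity to any training sentence, and diversity is one minus its highest similarity to any other generated sentence. The function names and the toy token lists are hypothetical placeholders, not the authors' code or data.

```python
def jaccard(a, b):
    """Jaccard similarity between the token sets of two sentences."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # two empty sentences are treated as identical
    return len(a & b) / len(a | b)


def novelty(sentence, training_corpus):
    """1 - highest similarity to any training sentence (assumed definition)."""
    best = max((jaccard(sentence, ref) for ref in training_corpus), default=0.0)
    return 1.0 - best


def diversity(index, generated):
    """1 - highest similarity to any *other* generated sentence (assumed definition)."""
    others = (s for j, s in enumerate(generated) if j != index)
    best = max((jaccard(generated[index], s) for s in others), default=0.0)
    return 1.0 - best


# Toy, pre-tokenized sentences (placeholders, not real dialect data).
training = [["a", "b", "c"], ["d", "e"]]
generated = [["a", "b", "c"], ["a", "b", "x"], ["y", "z"]]

for i, sent in enumerate(generated):
    print(sent, novelty(sent, training), diversity(i, generated))
```

Under these assumed definitions, a sentence copied verbatim from the training set scores zero novelty, while a sentence sharing no tokens with any other generated sentence scores a diversity of one.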
