TDDC: Timely Disclosure Documents Corpus

Abstract

In this paper, we describe the Timely Disclosure Documents Corpus (TDDC). TDDC was built by manually aligning sentences from past Japanese and English timely disclosure documents, published in PDF format by companies listed on the Tokyo Stock Exchange. It consists of approximately 1.4 million parallel sentences in Japanese and English. TDDC was used as the official dataset for the 6th Workshop on Asian Translation to encourage the advancement of machine translation.