Use of small unit language model for training large unit language models
Abstract
A computer-implemented method, computer program product, and apparatus are provided. The method includes generating a plurality of sequences of small unit tokens from a first language model that is trained with a small unit corpus including the small unit tokens, the small unit corpus having been derived by tokenization with a small unit. The method further includes tokenizing the plurality of sequences of small unit tokens by a large unit that is larger than the small unit, to create a derived large unit corpus including derived large unit tokens.
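The two steps described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: it assumes the small unit is a single character and the large unit is a whitespace-delimited word, and it substitutes a toy character bigram model for the first language model. All function names and the sample sentences are hypothetical.

```python
import random
from collections import Counter, defaultdict

BOS, EOS = "<s>", "</s>"  # hypothetical sentence boundary markers


def tokenize_small(text):
    """Small-unit tokenization; assumed here to be one token per character."""
    return list(text)


def train_bigram(small_corpus):
    """Toy stand-in for the first language model: character bigram counts."""
    counts = defaultdict(Counter)
    for tokens in small_corpus:
        prev = BOS
        for tok in tokens + [EOS]:
            counts[prev][tok] += 1
            prev = tok
    return counts


def generate(counts, rng, max_len=40):
    """Sample one sequence of small-unit tokens from the bigram counts."""
    out, prev = [], BOS
    for _ in range(max_len):
        successors = counts[prev]
        tok = rng.choices(list(successors), weights=list(successors.values()))[0]
        if tok == EOS:
            break
        out.append(tok)
        prev = tok
    return out


def tokenize_large(small_tokens):
    """Large-unit tokenization; assumed here to be whitespace-delimited words."""
    return "".join(small_tokens).split()


# Illustrative data only; not taken from the patent.
sentences = ["the cat sat", "the dog sat", "a cat ran"]
small_corpus = [tokenize_small(s) for s in sentences]
model = train_bigram(small_corpus)

rng = random.Random(0)  # fixed seed so the sketch is repeatable
generated = [generate(model, rng) for _ in range(5)]

# Re-tokenize the generated small-unit sequences by the larger unit.
derived_large_corpus = [tokenize_large(seq) for seq in generated]
```

In this sketch `derived_large_corpus` plays the role of the derived large unit corpus: because the sequences are sampled character by character, it can contain word tokens that never appeared in the original sentences.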