This paper studies a fast method for building a Uyghur text corpus: Uyghur words are extracted from books and magazines typeset under MS-DOS and converted to RTF format in the Windows environment. It then presents a table of the RTF codes corresponding to the Unicode code points of Uyghur characters, together with a simple method for dynamically generating Uyghur RTF files. Practice shows that this method helps improve the efficiency and quality of collecting large numbers of words during corpus construction.
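The abstract does not reproduce the paper's code table or its generation routine, so the following is only a minimal sketch of how a Unicode-to-RTF mapping and dynamic RTF generation might look. It assumes the standard RTF `\uN` escape (a signed 16-bit decimal followed by one fallback character, here `?`). The function names `rtf_escape` and `make_rtf`, the font entry `Arial`, and the use of `\rtlpar` for right-to-left paragraphs are illustrative assumptions, not the authors' implementation. Decoding the original MS-DOS typesetting data is not covered.

```python
def rtf_escape(text: str) -> str:
    """Encode text for an RTF body, using \\uN? escapes for non-ASCII characters."""
    out = []
    for ch in text:
        if ch in "\\{}":
            # RTF control characters must be backslash-escaped.
            out.append("\\" + ch)
        elif ch == "\n":
            out.append("\\par\n")
        elif ord(ch) < 128:
            out.append(ch)
        else:
            # \u takes a signed 16-bit decimal, so work on UTF-16 code units.
            units = ch.encode("utf-16-le")
            for i in range(0, len(units), 2):
                unit = int.from_bytes(units[i:i + 2], "little")
                if unit > 32767:
                    unit -= 65536
                # "?" is the single fallback character announced by \uc1.
                out.append(f"\\u{unit}?")
    return "".join(out)


def make_rtf(text: str) -> str:
    """Wrap escaped text in a minimal RTF document (header values are assumptions)."""
    header = "{\\rtf1\\ansi\\uc1\\deff0{\\fonttbl{\\f0\\fnil Arial;}}"
    return header + "\\rtlpar\\f0 " + rtf_escape(text) + "\\par}"


if __name__ == "__main__":
    # U+0626 and U+06D5 are letters used in Uyghur Arabic script.
    print(make_rtf("\u0626\u06d5"))
```

Each entry of a code table built this way is just the decimal value of the character's Unicode code point wrapped in `\u…?`, which is why the table can be generated on the fly instead of being stored.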