REAL-TIME LOSSLESS COMPRESSION METHOD OF BINARY DATA ENCODED IN GENERAL UTF-8 TYPE
Abstract
In the present invention, a universal compression method for UTF-8 encoded text is provided. UTF-8, devised by Ken Thompson and Rob Pike, is one of the variable-length character encoding schemes for Unicode. The name is an abbreviation of Universal Coded Character Set + Transformation Format 8-bit, and the scheme was originally proposed under the name File System Safe UCS/Unicode Transformation Format (FSS-UTF). UTF-8 uses 1 to 4 bytes to represent one Unicode character. It is defined in different ways in various standards documents, but the general structure is the same: the bits of a Unicode code point are divided into several parts and placed in the lower bits of the bytes of the UTF-8 sequence. Characters up to U+007F are represented in the same manner as 7-bit ASCII characters, and subsequent characters are represented by multi-byte bit patterns of up to 4 bytes. In these multi-byte sequences, the most significant bit of every byte is 1, so that they are not confused with 7-bit ASCII characters. As a result, the method exhibits high compression efficiency for non-English-speaking countries such as Korea, Japan, and China, where the native language dominates communications in a multilingual system, while for English-speaking countries no compression is performed, so the data does not increase in size.; COPYRIGHT KIPO 2018
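The byte layout described in the abstract can be illustrated with a minimal Python sketch. It shows only standard UTF-8 encoding of a single code point, not the compression method of the invention, which the abstract does not detail. The function name utf8_encode is a hypothetical helper introduced here for illustration.

```python
def utf8_encode(cp: int) -> bytes:
    """Encode one Unicode code point as UTF-8 (1 to 4 bytes).

    Hypothetical helper for illustration; it mirrors the standard
    UTF-8 bit layout and is not part of the patented method.
    """
    if cp < 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError("not a Unicode scalar value")
    if cp <= 0x7F:
        # 0xxxxxxx: identical to 7-bit ASCII
        return bytes([cp])
    if cp <= 0x7FF:
        # 110xxxxx 10xxxxxx
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp <= 0xFFFF:
        # 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([
            0xE0 | (cp >> 12),
            0x80 | ((cp >> 6) & 0x3F),
            0x80 | (cp & 0x3F),
        ])
    # 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([
        0xF0 | (cp >> 18),
        0x80 | ((cp >> 12) & 0x3F),
        0x80 | ((cp >> 6) & 0x3F),
        0x80 | (cp & 0x3F),
    ])


if __name__ == "__main__":
    # Compare against Python's built-in encoder for 1- to 4-byte characters.
    for ch in ["A", "é", "한", "😀"]:
        encoded = utf8_encode(ord(ch))
        assert encoded == ch.encode("utf-8")
        print(f"U+{ord(ch):04X} -> {encoded.hex(' ')} ({len(encoded)} byte(s))")
```

A Korean, Japanese, or Chinese character in the Basic Multilingual Plane typically occupies 3 bytes in this layout, while an ASCII character occupies 1, which is consistent with the abstract's claim that native-language text in those countries leaves the most room for compression.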