Hybrid comparison for unicode text strings consisting primarily of ASCII characters
Abstract
A method compares text strings having Unicode encoding. The method receives a first string S = s₁s₂…sₙ and a second string T = t₁t₂…tₘ, where s₁, s₂, …, sₙ and t₁, t₂, …, tₘ are Unicode characters. The method computes a first string weight for the first string S according to a weight function ƒ. When S consists of ASCII characters, ƒ(S) = S. When S consists of ASCII characters and some accented ASCII characters that are replaceable by ASCII characters, ƒ(S) = g(s₁)g(s₂)…g(sₙ), where g(sᵢ) = sᵢ when sᵢ is an ASCII character and g(sᵢ) = s′ᵢ when sᵢ is an accented ASCII character that is replaceable by the corresponding ASCII character s′ᵢ. The method also computes a second string weight for the second text string T. Equality of the strings is tested using the string weights.
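The weight function described in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation. The names `g`, `weight`, and `strings_equal` are hypothetical, and two choices are assumptions that the abstract does not state: an accented character counts as "replaceable" when its Unicode canonical decomposition is an ASCII base character followed only by combining marks, and any string outside the two described cases falls back to comparing NFC-normalized forms.

```python
import unicodedata
from typing import Optional


def g(ch: str) -> Optional[str]:
    """Map one character to its ASCII replacement, or None if it has none.

    Assumption: "replaceable" means the canonical decomposition (NFD) is an
    ASCII base character followed only by combining marks, e.g. "é" -> "e".
    """
    if ch.isascii():
        return ch
    decomposed = unicodedata.normalize("NFD", ch)
    base, marks = decomposed[0], decomposed[1:]
    if base.isascii() and marks and all(unicodedata.combining(m) for m in marks):
        return base
    return None


def weight(s: str) -> Optional[str]:
    """Compute f(S) for the two cases in the abstract, else None."""
    if s.isascii():
        return s  # pure ASCII: f(S) = S
    mapped = [g(ch) for ch in s]
    if any(m is None for m in mapped):
        return None  # outside the cases the abstract describes
    return "".join(mapped)  # f(S) = g(s1) g(s2) ... g(sn)


def strings_equal(s: str, t: str) -> bool:
    """Test equality using the string weights."""
    ws, wt = weight(s), weight(t)
    if ws is None or wt is None:
        # Assumed fallback (not from the abstract): compare NFC forms.
        return unicodedata.normalize("NFC", s) == unicodedata.normalize("NFC", t)
    return ws == wt
```

Under these assumptions, `strings_equal("café", "cafe")` evaluates to `True` because both strings have the weight `"cafe"`, and a pure-ASCII string never needs more than a single `isascii()` scan before its weight is known.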