The present invention discloses a method for analyzing the constituents of Tibetan characters, a Tibetan sorting method, and corresponding devices, and relates to the field of natural language processing. The invention addresses the problem that existing Tibetan sorting methods lack generality and compatibility, which makes automatic computer sorting of Tibetan inconvenient. The technical solution provided by the present invention includes: S10, acquiring a Tibetan text to be analyzed; S20, feeding the Tibetan characters in the Tibetan text as input to a preset group of finite state automata; and S30, when a target finite state automaton in the group determines that a Tibetan character in the Tibetan text is correctly spelled, acquiring the constituents of that Tibetan character according to the target finite state automaton.
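Steps S10 to S30 can be illustrated with a minimal sketch. It is not the automaton group of the invention, whose states and transitions the abstract does not specify. The sketch assumes that a "Tibetan character" is a syllable delimited by the tsheg mark (U+0F0B), and it uses a toy group of only two automata that ignore stacked letters (superscripts and subscripts) and the real co-occurrence rules between prefixes, roots and suffixes. All names (`AUTOMATON_GROUP`, `run_automaton`, `analyze`, `analyze_text`) are hypothetical.

```python
# Toy sketch only: two automata, no stacked letters, no co-occurrence rules.
ROOTS = set("ཀཁགངཅཆཇཉཏཐདནཔཕབམཙཚཛཝཞཟའཡརལཤསཧཨ")  # 30 basic consonants
PREFIXES = set("གདབམའ")
SUFFIXES = set("གངདནབམའརལས")
SECOND_SUFFIXES = set("དས")
VOWELS = set("\u0f72\u0f74\u0f7a\u0f7c")  # vowel signs i, u, e, o

CLASSES = {
    "prefix": PREFIXES,
    "root": ROOTS,
    "vowel": VOWELS,
    "suffix": SUFFIXES,
    "second_suffix": SECOND_SUFFIXES,
}

# Transitions shared by both automata once the root letter has been read.
# Each state maps to a list of (constituent role, next state) edges.
TAIL = {
    "R": [("vowel", "V"), ("suffix", "S")],
    "V": [("suffix", "S")],
    "S": [("second_suffix", "S2")],
    "S2": [],
}
ACCEPTING = {"R", "V", "S", "S2"}

# The "group": automata are tried in order; the first that accepts is the target.
AUTOMATON_GROUP = [
    ("root-initial", {"start": [("root", "R")], **TAIL}),
    ("prefix-initial", {"start": [("prefix", "P")], "P": [("root", "R")], **TAIL}),
]


def run_automaton(transitions, syllable):
    """Return [(role, char), ...] if the automaton accepts, else None."""
    state, parts = "start", []
    for ch in syllable:
        for role, next_state in transitions[state]:
            if ch in CLASSES[role]:
                parts.append((role, ch))
                state = next_state
                break
        else:
            return None  # no edge for this character: spelling rejected
    return parts if state in ACCEPTING else None


def analyze(syllable):
    """S30: constituents come from the first automaton that accepts."""
    for name, transitions in AUTOMATON_GROUP:
        parts = run_automaton(transitions, syllable)
        if parts is not None:
            return name, parts
    return None  # no automaton accepts: treated as misspelled


def analyze_text(text):
    """S10/S20: split the text on tsheg and feed each syllable to the group."""
    return [(s, analyze(s)) for s in text.split("\u0f0b") if s]


if __name__ == "__main__":
    for syllable, result in analyze_text("བོད་བསམ་"):
        print(syllable, result)
```

In this sketch the syllable བོད is accepted by the root-initial automaton as root, vowel and suffix, while བསམ is rejected by it and accepted by the prefix-initial automaton as prefix, root and suffix. Trying the automata in a fixed order is one simple way to pick a target automaton; because the toy rules are so loose, it can misanalyze syllables that a complete rule set would resolve differently.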