This article analyzes how various combinations of mixed-level stylometric characteristics affect the quality of authorship verification for Russian, English, and French prose texts. The study covers both low-level stylometric characteristics, based on words and characters, and higher-level structural characteristics. All stylometric characteristics are calculated automatically with the ProseRhythmDetector program, which makes it possible to analyze large volumes of text by many writers at the same time. Each text is associated with a character-level, a word-level, and a structure-level stylometric vector. In the experiments, the parameter sets of these three levels were combined with each other in all possible ways. The resulting vectors of stylometric characteristics were fed to various classifiers to perform verification and to identify the classifier best suited to the problem. The best results were obtained with the AdaBoost classifier: the average F-measure across all languages exceeded 92%. Detailed verification quality assessments are given and analyzed for each author. The use of high-level stylometric characteristics, in particular the frequency of N-grams of POS tags, opens the prospect of a more detailed analysis of authors' styles. The experiments show that the most accurate authorship verification results for literary texts in Russian, English, and French are obtained when structure-level characteristics are combined with word-level and/or character-level characteristics. The authors also conclude that stylometric characteristics influence the quality of authorship verification to different degrees in different languages.
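The combination step described above can be illustrated with a minimal sketch. It assumes that each text already has one numeric vector per level; the names `LEVELS`, `level_combinations`, `combine_features`, and the feature values are hypothetical and are not taken from ProseRhythmDetector or from the study's data.

```python
from itertools import combinations

# The three levels of stylometric characteristics named in the abstract.
LEVELS = ("character", "word", "structure")


def level_combinations(levels=LEVELS):
    """Return every non-empty subset of the feature levels, smallest first."""
    return [
        subset
        for size in range(1, len(levels) + 1)
        for subset in combinations(levels, size)
    ]


def combine_features(vectors_by_level, subset):
    """Concatenate the per-level vectors of one text for the chosen levels."""
    combined = []
    for level in subset:
        combined.extend(vectors_by_level[level])
    return combined


# Hypothetical feature values for a single text (illustrative only).
example_text = {
    "character": [0.12, 0.03],
    "word": [4.7, 0.41, 0.09],
    "structure": [0.21, 0.05],
}

subsets = level_combinations()
feature_sets = {
    subset: combine_features(example_text, subset) for subset in subsets
}

for subset, vector in feature_sets.items():
    print("+".join(subset), len(vector))
```

Three levels yield seven non-empty combinations. In a setup like the one described, each combined vector would then be passed to a classifier such as AdaBoost, and the combinations would be compared by F-measure; the classifier stage is omitted here because the abstract does not specify its implementation.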