Nontraditional authorship attribution studies are those studies that make use of the computer, statistics, and stylistics to identify the author(s) of an anonymous text. The literature leaves no doubt that there are major problems in this field, problems that must be addressed if there is to be general acceptance of the results not only by other practitioners of stylometry, but also by the general public. Cherry picking is one of these major problems and is the topic of this paper. One definition of cherry picking is deliberately picking out the data or scientific studies that support your view, while ignoring the data or studies that oppose your view (Norton, p. 1). The first time I heard the term cherry picking in the context of nontraditional authorship attribution studies was at the 1995 Classification Society of North America (CSNA) meeting in Denver. A member of that audience stated that Ward Elliott "cherry picked" the style markers (i.e., quantifiable elements of style such as word length distributions or function word ratios) and statistical tests for his Shakespeare attribution study. I had been aware of the concept of selecting only favorable style markers to guarantee a preconceived statistical result, but I never really thought of it as cherry picking.