We introduce EVALution 2.0, a simplified Mandarin dataset for the evaluation of Vector Space Models. We adopt a psycholinguistics-based methodology built on a verbal association task, in contrast to previous datasets, which rely on corpora and ontologies to construct word relation pairs. Semantic neighbors were created for 100 target words, to which participants produced 1129 word relation pairs. In a separate agreement-rating task, only 62 pairs were rejected. The methodology has proven to be a way to expand existing resources quickly while maintaining a high level of quality.