Extracting acronyms and their expansions from plain text is an important problem in text mining. Previous research has shown that the problem can be addressed with machine learning by casting acronym extraction as binary classification. We investigate this classification problem and find that the classes are highly unbalanced: positive instances are very rare compared with negative ones. We therefore tackle the problem with an uneven margin classifier, SVM with Uneven Margins. Experimental results show that our approach outperforms baseline methods based on heuristic rules and conventional SVM models. The results also show how the uneven margins classifier trades off precision against recall in extraction.
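The idea of an uneven margin can be illustrated with a minimal sketch. It assumes the common formulation in which positive examples must score at least 1 while negative examples only need to score at most -tau, so that a tau below 1 moves the decision boundary away from the rare positive class. The abstract does not give the features, solver, or parameter values, so everything here is hypothetical: the names `uneven_hinge_loss` and `train_uneven_margin`, the parameters `tau`, `C`, `lr`, and `epochs`, and the toy data. The trainer is a plain subgradient-descent linear model, not the SVM implementation used in the experiments.

```python
def uneven_hinge_loss(score, label, tau):
    """Hinge loss with uneven margins (illustrative formulation).

    Positives (label == 1) need score >= 1; negatives (label == -1)
    need score <= -tau. With tau == 1 this is the ordinary hinge loss.
    """
    required = 1.0 if label == 1 else tau
    return max(0.0, required - label * score)


def train_uneven_margin(X, y, tau=0.4, C=1.0, lr=0.01, epochs=200):
    """Fit a linear scorer w.x + b by per-example subgradient descent.

    Each step shrinks w (L2 regularisation) and, if the example violates
    its class-specific margin, also pushes w and b toward satisfying it.
    All hyperparameter defaults are assumptions chosen for the toy example.
    """
    dim = len(X[0])
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            score = sum(wj * xj for wj, xj in zip(w, xi)) + b
            violated = uneven_hinge_loss(score, yi, tau) > 0.0
            for j in range(dim):
                grad = w[j] - (C * yi * xi[j] if violated else 0.0)
                w[j] -= lr * grad
            if violated:
                b += lr * C * yi
    return w, b


def predict(w, b, x):
    """Label a feature vector: +1 if the score is positive, else -1."""
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if score > 0 else -1


# Toy, hypothetical data: one positive candidate among several negatives,
# mimicking the class imbalance described above.
X = [[2.0], [-2.0], [-3.0], [-1.5]]
y = [1, -1, -1, -1]
w, b = train_uneven_margin(X, y, tau=0.4)
```

In this formulation, lowering `tau` relaxes the requirement on negative examples, which tends to favor recall on the positive class at some cost in precision; raising it toward 1 recovers the conventional even-margin behavior.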