In recent years, conditional random fields (CRFs) have shown good performance in named entity recognition tasks. However, applying them directly to biomedical named entity recognition incurs a very high training cost. In this paper, we evaluate two alternatives to training a CRF with the traditional single-phase maximum likelihood method. One uses an online training method; the other, a cascaded method, divides the named entity recognition task into two subtasks. For the cascaded method, we propose including a "margin" in the model, which leads to better recognition results. Both methods improve performance while substantially reducing training time. In particular, the cascaded method outperforms the best system in the JNLPBA shared task.