For sentiment classification, it is widely recognized that embeddings based on the distributional hypothesis are weak at capturing sentiment contrast: contrasting words may have similar local contexts. Drawing on broader context, we propose incorporating Theta Pure Dependence (TPD) into the Paragraph Vector method to reinforce topical and sentiment information. TPD comes with a theoretical guarantee that the word dependency is pure, i.e., the dependence pattern carries an integral meaning whose underlying distribution cannot be conditionally factorized. Our method outperforms state-of-the-art methods on text classification tasks.
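As a minimal illustration of the weakness noted above, the sketch below counts context words around "good" and "bad" in a small made-up corpus. The corpus, the `context_counts` helper, and the window sizes are all assumptions for illustration; this is not the TPD method or the Paragraph Vector model, only a toy view of why narrow windows can make contrasting words look alike.

```python
from collections import Counter

# Hypothetical toy corpus, invented purely for illustration.
corpus = [
    "the movie was good and i liked it",
    "the movie was bad and i hated it",
    "the food was good and i liked it",
    "the food was bad and i hated it",
]

def context_counts(word, sentences, window=2):
    """Count the words that occur within `window` positions of `word`."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.split()
        for i, tok in enumerate(tokens):
            if tok == word:
                lo = max(0, i - window)
                hi = min(len(tokens), i + window + 1)
                # Neighbours on both sides, excluding the word itself.
                counts.update(tokens[lo:i] + tokens[i + 1:hi])
    return counts

# With a narrow window the two contrasting words share the same contexts.
good_ctx = context_counts("good", corpus)
bad_ctx = context_counts("bad", corpus)
print(good_ctx == bad_ctx)

# A wider window reaches "liked" / "hated", so the contexts diverge.
print(context_counts("good", corpus, window=4) == context_counts("bad", corpus, window=4))
```

In this toy setting the narrow window cannot separate the two words, while the wider window picks up sentiment-bearing neighbours, which is the intuition behind drawing on broader context.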