Many individuals, organizations, and companies have to answer large volumes of email. Often, most of these emails are variations on a relatively small number of frequently asked questions. We address the problem of predicting which of several frequently used answers a user will choose when responding to an email. Our approach makes effective use of the data that is typically available in this setting: inbound and outbound emails stored on a server. We take into account that the server holds no explicit references between inbound emails and the corresponding outbound emails. We map the problem to a semi-supervised classification problem that can be addressed by the transductive Support Vector Machine. We evaluate our approach using emails sent to a corporate customer service department.
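For orientation, the transductive Support Vector Machine is commonly stated as the optimization problem sketched below. This is the standard textbook form for a binary decision, not necessarily the exact formulation used in this work. The mapping of symbols to this setting is an assumption made for illustration: $\mathbf{x}_i$ are inbound emails whose answer is known, with $y_i = +1$ if a given frequently used answer was chosen; $\mathbf{x}_j^*$ are inbound emails whose answer is unknown; and $C$, $C^*$ are trade-off parameters. Choosing among several answers would additionally require a multi-class scheme such as one classifier per answer, which is likewise an assumption.

```latex
% Standard transductive SVM objective (binary case); a sketch, not the paper's stated formulation.
% Assumed notation: n labeled emails (x_i, y_i), k unlabeled emails x_j^* with unknown labels y_j^*.
\begin{aligned}
\min_{y_1^*,\dots,y_k^*,\;\mathbf{w},\;b,\;\boldsymbol{\xi},\;\boldsymbol{\xi}^*}\quad
  & \tfrac{1}{2}\lVert \mathbf{w} \rVert^2
    + C \sum_{i=1}^{n} \xi_i
    + C^* \sum_{j=1}^{k} \xi_j^* \\
\text{s.t.}\quad
  & y_i \,(\mathbf{w} \cdot \mathbf{x}_i + b) \ge 1 - \xi_i,
    \quad \xi_i \ge 0,
    \quad i = 1,\dots,n, \\
  & y_j^* \,(\mathbf{w} \cdot \mathbf{x}_j^* + b) \ge 1 - \xi_j^*,
    \quad \xi_j^* \ge 0,
    \quad y_j^* \in \{-1,+1\},
    \quad j = 1,\dots,k.
\end{aligned}
```

Unlike a purely supervised SVM, the labels $y_j^*$ of the unlabeled emails are themselves optimization variables, so the separating hyperplane is pushed toward low-density regions of all available emails rather than only the labeled ones.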