In this paper, we describe a novel co-training-based algorithm for identifying database user sessions from database traces. The algorithm incrementally learns to identify positive data (session boundaries) and negative data (non-session boundaries) by applying two methods interactively over several iterations. In each iteration, previously identified positive and negative data are used to build better models, which in turn label new data and improve the performance of later iterations. We also present experimental results.
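
The iterative labeling loop described above can be illustrated with a minimal sketch. The abstract does not name the two methods, so the two views used here (a statement-transition count and a time-gap threshold), the seed thresholds, and all function and parameter names are assumptions chosen only for illustration, not the authors' implementation.

```python
from collections import Counter


def co_train(candidates, seed_hi=600.0, seed_lo=1.0, max_iters=10):
    """Label candidate boundaries as session boundaries (True) or not (False).

    candidates: list of (gap_seconds, (prev_stmt, next_stmt)) tuples, one per
    position between two consecutive trace entries (hypothetical format).
    Returns a dict mapping candidate index -> bool; candidates that neither
    view can label confidently are left out.
    """
    # Seed labels: only extreme gaps are trusted initially (assumed heuristic).
    labels = {}
    for i, (gap, _) in enumerate(candidates):
        if gap >= seed_hi:
            labels[i] = True
        elif gap <= seed_lo:
            labels[i] = False

    for _ in range(max_iters):
        new = {}

        # View 1: statement transitions seen so far with only one label.
        pos = Counter(candidates[i][1] for i, y in labels.items() if y)
        neg = Counter(candidates[i][1] for i, y in labels.items() if not y)
        for i, (_, pair) in enumerate(candidates):
            if i in labels:
                continue
            if pos[pair] > 0 and neg[pair] == 0:
                new[i] = True
            elif neg[pair] > 0 and pos[pair] == 0:
                new[i] = False

        # View 2: gap thresholds re-estimated from everything labeled so far,
        # including labels contributed by view 1 in earlier iterations.
        pos_gaps = [candidates[i][0] for i, y in labels.items() if y]
        neg_gaps = [candidates[i][0] for i, y in labels.items() if not y]
        if pos_gaps and neg_gaps:
            hi, lo = min(pos_gaps), max(neg_gaps)
            for i, (gap, _) in enumerate(candidates):
                if i in labels or i in new:
                    continue
                if gap >= hi:
                    new[i] = True
                elif gap <= lo:
                    new[i] = False

        if not new:  # neither view can label anything more
            break
        labels.update(new)

    return labels
```

In this sketch each view only labels data it is confident about, and the labels it contributes tighten the other view's model in the next iteration, which mirrors the interplay of positive and negative data described in the abstract.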