In the age of Web 2.0, a substantial amount of unstructured content is distributed through multiple text streams in an asynchronous fashion, which makes it increasingly difficult to glean and distill useful information. An effective way to explore the information in text streams is topic modelling, which can further facilitate other applications such as search, information browsing, and pattern mining. In this paper, we propose a semantic-graph-based topic modelling approach for structuring asynchronous text streams. Our model integrates topic mining and time synchronization, two core modules for addressing the problem, into a unified model. Specifically, to handle the lexical gap issue, we use global semantic graphs for each timestamp to capture the hidden interactions among entities from all the text streams. To deal with the source asynchronism problem, local semantic graphs are employed to discover similar topics of different entities that may be separated by time gaps. Our experiments on two real-world datasets show that the proposed model significantly outperforms existing ones.