As a powerful representation paradigm for networked and multi-typed data, the heterogeneous information network (HIN) is ubiquitous. Meanwhile, defining proper relevance measures has always been a fundamental problem and of great pragmatic importance for network mining tasks. Inspired by our probabilistic interpretation of existing path-based relevance measures, we propose to study HIN relevance from a probabilistic perspective. We also identify, from real-world data, and propose to model cross-meta-path synergy, which is a characteristic important for defining path-based HIN relevance and has not been modeled by existing methods. A generative model is established to derive a novel path-based relevance measure, which is data-driven and tailored for each HIN. We develop an inference algorithm to find the maximum a posteriori (MAP) estimate of the model parameters, which entails non-trivial tricks. Experiments on two real-world datasets demonstrate the effectiveness of the proposed model and relevance measure.