The Internet of Things (IoT) contains a very large number of services with heterogeneous descriptions, hosted on devices that are mobile and highly resource-constrained, so finding suitable services quickly and efficiently is a key issue. This paper proposes an IoT service discovery approach based on a probabilistic topic model. Its key features are: 1) it uses the English Wikipedia to train a high-quality topic model and to semantically enrich service text descriptions, which resemble short texts, so that the topic model can estimate their latent topics more effectively; 2) it employs a non-parametric topic model to learn the latent topics of service text, which reduces model training time; 3) it uses the latent topics to classify services automatically and to compute the text similarity between a service request and a service, which rapidly reduces the number of services passed to logic signature matchmaking and accelerates the similarity computation; 4) it provides a logic signature matchmaking algorithm that supports both WSDL-based and RESTful IoT services. Experimental results show that, compared with existing IoT service discovery methods, the proposed approach achieves substantially higher precision and normalized discounted cumulative gain (NDCG).
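Feature 3 can be illustrated with a minimal sketch. It assumes each service and each request has already been mapped to a topic distribution by some topic model. The function names, the dominant-topic filter, the Hellinger-based similarity, and the toy distributions are all hypothetical choices made for illustration; the abstract does not specify which classifier or similarity measure the paper uses.

```python
import math


def dominant_topic(theta):
    """Return the index of the highest-probability topic."""
    return max(range(len(theta)), key=lambda k: theta[k])


def hellinger_similarity(p, q):
    """Return 1 minus the Hellinger distance between two distributions."""
    # Bhattacharyya coefficient; equals 1 when p and q are identical.
    bc = sum(math.sqrt(a * b) for a, b in zip(p, q))
    return 1.0 - math.sqrt(max(0.0, 1.0 - bc))


def shortlist(request_theta, services, top_n=2):
    """Filter services by dominant topic, then rank them by similarity.

    Only the shortlisted services would be passed on to the more
    expensive signature matching step (assumed, not shown here).
    """
    topic = dominant_topic(request_theta)
    candidates = [
        (name, theta)
        for name, theta in services.items()
        if dominant_topic(theta) == topic
    ]
    ranked = sorted(
        candidates,
        key=lambda item: hellinger_similarity(request_theta, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:top_n]]


# Toy topic distributions over three topics (illustrative values only).
services = {
    "temperature_sensor": [0.80, 0.15, 0.05],
    "humidity_sensor": [0.60, 0.30, 0.10],
    "door_lock": [0.10, 0.10, 0.80],
}
request = [0.75, 0.20, 0.05]
result = shortlist(request, services)
print(result)
```

In this toy run the request's dominant topic is topic 0, so `door_lock` is discarded before any similarity is computed, and the two remaining services are ranked by how close their topic distributions are to the request.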