Conference: IEEE International Conference on High Voltage Engineering and Application

Application of the maintenance text data of transformers based on SimHash and Hamming distance algorithm

Abstract

Power companies have accumulated large volumes of power-equipment maintenance data in text format, and text mining is needed to extract valuable information from it. Text similarity is an important text-mining technique; however, the feature dimensionality grows with text length. In this paper, the SimHash algorithm maps each original text to a 64-bit binary fingerprint, and the similarity between texts is then determined by the Hamming distance between their fingerprints. A recommendation model for maintenance measures is built on this method. Verification results show that the recommendation model based on SimHash and Hamming distance predicts well and has potential for practical application.
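The fingerprint-and-compare idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the tokenizer, the token weighting, or the per-token hash, so term-frequency weights and a truncated MD5 hash are assumptions here, and the function names (`simhash`, `hamming_distance`, `recommend`) and the sample records are hypothetical.

```python
import hashlib
from collections import Counter


def simhash(tokens, bits=64):
    """Return a `bits`-bit SimHash fingerprint for a list of tokens."""
    # Assumption: term frequency is used as the token weight.
    weights = Counter(tokens)
    totals = [0] * bits
    for token, weight in weights.items():
        # Assumption: MD5 truncated to `bits` bits serves as the token hash.
        digest = hashlib.md5(token.encode("utf-8")).digest()
        h = int.from_bytes(digest[: bits // 8], "big")
        for i in range(bits):
            # Add the weight where the hash bit is 1, subtract it where it is 0.
            if (h >> i) & 1:
                totals[i] += weight
            else:
                totals[i] -= weight
    # A fingerprint bit is 1 wherever the weighted sum is positive.
    fingerprint = 0
    for i, total in enumerate(totals):
        if total > 0:
            fingerprint |= 1 << i
    return fingerprint


def hamming_distance(a, b):
    """Count the bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


def recommend(query_tokens, records):
    """Return the measure of the record whose text is closest to the query.

    `records` is a list of (tokens, measure) pairs (hypothetical layout).
    """
    q = simhash(query_tokens)
    best = min(records, key=lambda r: hamming_distance(q, simhash(r[0])))
    return best[1]


if __name__ == "__main__":
    # Hypothetical maintenance records, already tokenized.
    records = [
        (["winding", "overheat", "alarm"], "inspect cooling system"),
        (["oil", "leak", "gasket"], "replace gasket"),
    ]
    print(recommend(["winding", "overheat", "alarm"], records))
```

Because every text is reduced to a fixed 64-bit integer, comparing two records costs one XOR and a bit count regardless of text length, which is what sidesteps the growth in feature dimensionality for long texts.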
