Journal: 《计算机技术与发展》 (Computer Technology and Development)

Research on a Method for Converting and Partitioning Massive Chinese Address Data

Abstract

Under the traditional single-node computing model, massive Chinese address data cannot be used directly for complex spatial mathematical computation, and processing is constrained by the node's hardware, which leads to memory overflow and slow computation. To address these problems, this paper proposes a method that converts Chinese address information into the corresponding latitude and longitude coordinates through a third-party interface, and then applies the improved PRBP-DI partitioning algorithm to split the massive data into several sub-partitions that are computed separately. The improvement reduces the repeated scanning and cumulative-sum calculations that the PRBP algorithm performs over the columns or rows of partitioned data blocks. Experimental results on a real dataset show that the method converts massive Chinese address data and splits it into several evenly distributed sub-partitions, and that the algorithm's running time does not always increase as the number of data points grows. The method thereby improves the capability and accuracy of parallel computation over massive Chinese address data. Based on how the running time of each of the two partitioning algorithms changes, the paper also analyzes why the running time decreases, rather than increases, when the data volume grows to 300 000 data points.
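
The abstract does not give the details of PRBP or PRBP-DI, so the following is only a minimal sketch of the general idea it describes: count points per grid cell once, build a cumulative-sum table once, and then choose balanced cuts without rescanning rows or columns. It assumes the addresses have already been geocoded to (longitude, latitude) pairs; the geocoding interface is not shown. All function names, the grid resolution, and the recursive-bisection strategy are illustrative assumptions, not the authors' algorithm.

```python
from typing import List, Tuple

Point = Tuple[float, float]        # (longitude, latitude), assumed already geocoded
Rect = Tuple[int, int, int, int]   # (row0, col0, row1, col1), half-open, in grid cells


def grid_counts(points: List[Point], bounds: Tuple[float, float, float, float],
                rows: int, cols: int) -> List[List[int]]:
    """Count points per grid cell; bounds = (min_lon, min_lat, max_lon, max_lat)."""
    min_lon, min_lat, max_lon, max_lat = bounds
    counts = [[0] * cols for _ in range(rows)]
    for lon, lat in points:
        c = min(cols - 1, int((lon - min_lon) / (max_lon - min_lon) * cols))
        r = min(rows - 1, int((lat - min_lat) / (max_lat - min_lat) * rows))
        counts[r][c] += 1
    return counts


def prefix_sums(counts: List[List[int]]) -> List[List[int]]:
    """2D cumulative sums, computed once so later rectangle counts are O(1)."""
    rows, cols = len(counts), len(counts[0])
    ps = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        for c in range(cols):
            ps[r + 1][c + 1] = counts[r][c] + ps[r][c + 1] + ps[r + 1][c] - ps[r][c]
    return ps


def rect_count(ps: List[List[int]], rect: Rect) -> int:
    """Number of points inside rect, read from the cumulative-sum table."""
    r0, c0, r1, c1 = rect
    return ps[r1][c1] - ps[r0][c1] - ps[r1][c0] + ps[r0][c0]


def bisect(ps: List[List[int]], rect: Rect) -> Tuple[Rect, Rect]:
    """Cut rect along its longer side where the two halves are most balanced."""
    r0, c0, r1, c1 = rect
    total = rect_count(ps, rect)
    if (r1 - r0) >= (c1 - c0):
        cuts = [((r0, c0, k, c1), (k, c0, r1, c1)) for k in range(r0 + 1, r1)]
    else:
        cuts = [((r0, c0, r1, k), (r0, k, r1, c1)) for k in range(c0 + 1, c1)]
    return min(cuts, key=lambda pair: abs(2 * rect_count(ps, pair[0]) - total))


def partition(ps: List[List[int]], rect: Rect, parts: int) -> List[Rect]:
    """Recursively bisect rect; parts is assumed to be a power of two."""
    r0, c0, r1, c1 = rect
    if parts <= 1 or (r1 - r0 <= 1 and c1 - c0 <= 1):
        return [rect]
    a, b = bisect(ps, rect)
    return partition(ps, a, parts // 2) + partition(ps, b, parts // 2)
```

Because every candidate cut is evaluated from the same cumulative-sum table, no row or column of the grid is scanned more than once, which is the kind of saving the abstract attributes to the improved algorithm. Each resulting rectangle could then be handed to a separate worker for parallel computation.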
