Acta Geodaetica et Cartographica Sinica ›› 2016, Vol. 45 ›› Issue (11): 1342-1351. doi: 10.11947/j.AGCS.2016.20150408

• Cartography and Geographic Information •

M-Quadtree Index: A Spatial Index Method for Cloud Storage Environment Based on Modified Quadtree Coding Approach

FU Zhongliang1, HU Yulong1, WENG Baofeng1, PENG Rui2

  1. School of Remote Sensing and Information Engineering, Wuhan University, Wuhan 430079, China;
  2. Geomatics Center of Zhejiang, Hangzhou 310012, China
  • Received: 2015-07-21 Revised: 2016-09-10 Online: 2016-11-20 Published: 2016-12-03
  • Corresponding author: YANG Yuanwei E-mail: yyw_08@whu.edu.com
  • About the author: FU Zhongliang (1965-), male, PhD, professor, doctoral supervisor. His main research interests are GIS and vector data matching. E-mail: fuzhl@263.net

Abstract: Cloud storage platforms based on the key-value model support only simple keyword queries and cannot support multidimensional spatial queries. To address this problem, this paper proposes a new distributed spatial index method, the M-Quadtree index. For index construction, a spatial data partitioning method based on an improved quadtree is designed. The method specifies a minimum amount of data for each leaf-node region and re-merges quadtree leaf nodes, which resolves the storage imbalance among the sub-regions produced by partitioning and satisfies the parallelization requirements of MapReduce. Algorithms for fast construction, querying, and updating of the M-Quadtree index under the MapReduce framework are presented. Experiments on a Hadoop platform examine the effect of key parameters on index efficiency and test index construction, querying, and updating at different data scales. Comparative experiments against existing distributed spatial indexes show that the M-Quadtree index performs better in storage load balancing, algorithm parallelization, and spatial query efficiency.

Key words: cloud storage, MapReduce, spatial data management, spatial index, spatial data partition
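
The abstract describes the partitioning idea only at a high level: split space with a quadtree, require a minimum amount of data per leaf region, and re-merge leaves to balance storage. The following Python sketch illustrates that general idea under stated assumptions and is not the authors' algorithm. The function names `split_quadtree` and `merge_leaves`, the parameters `max_count`, `min_count`, and `max_depth`, the quadrant digit coding, and the greedy merge of consecutive leaves in code order are all hypothetical choices made for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Bounds = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)
Leaf = Tuple[str, List[Point]]              # (quadrant code, points in the leaf)


def split_quadtree(points: List[Point], bounds: Bounds, max_count: int,
                   code: str = "", max_depth: int = 16) -> List[Leaf]:
    """Recursively split a region until each leaf holds at most max_count points.

    Leaves are returned in depth-first order of their quadrant codes
    (digits 0-3), which is an assumed coding scheme, not the paper's.
    """
    if len(points) <= max_count or len(code) >= max_depth:
        return [(code, points)]
    x0, y0, x1, y1 = bounds
    xm, ym = (x0 + x1) / 2, (y0 + y1) / 2
    child_bounds = [(x0, y0, xm, ym), (xm, y0, x1, ym),
                    (x0, ym, xm, y1), (xm, ym, x1, y1)]
    child_points: List[List[Point]] = [[], [], [], []]
    for x, y in points:
        # Quadrant digit: bit 0 is set for the right half, bit 1 for the upper half.
        child_points[(1 if x >= xm else 0) + (2 if y >= ym else 0)].append((x, y))
    leaves: List[Leaf] = []
    for digit in range(4):
        leaves += split_quadtree(child_points[digit], child_bounds[digit],
                                 max_count, code + str(digit), max_depth)
    return leaves


def merge_leaves(leaves: List[Leaf],
                 min_count: int) -> List[Tuple[List[str], List[Point]]]:
    """Greedily merge consecutive leaves until each group has >= min_count points.

    Assumption: leaves adjacent in code order are acceptable merge partners.
    A trailing under-filled remainder is folded into the last complete group.
    """
    groups: List[Tuple[List[str], List[Point]]] = []
    codes: List[str] = []
    pts: List[Point] = []
    for code, leaf_pts in leaves:
        codes.append(code)
        pts.extend(leaf_pts)
        if len(pts) >= min_count:
            groups.append((codes, pts))
            codes, pts = [], []
    if codes:
        if groups:
            groups[-1][0].extend(codes)
            groups[-1][1].extend(pts)
        else:
            groups.append((codes, pts))
    return groups
```

In this sketch each merged group would correspond to one storage partition. Because every group is built independently from a run of leaf codes, groups could in principle be processed by separate map tasks, although how the paper actually maps partitions onto MapReduce is not stated in the abstract.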
