Acta Geodaetica et Cartographica Sinica (测绘学报)

• Research Article

顾及通名语义的汉语地名相似度匹配算法

Cheng Gang (程钢)1, Lu Xiaoping (卢小平)2

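  1. Key Laboratory of Mine Spatial Information Technologies of the National Administration of Surveying, Mapping and Geoinformation, Henan Polytechnic University
  2. Key Laboratory of Mine Spatial Information Technologies of the National Administration of Surveying, Mapping and Geoinformation, Henan Polytechnic University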
  • Received: 2014-01-21  Revised: 2013-06-12  Online: 2014-04-20  Published: 2014-02-18
  • Corresponding author: Cheng Gang (程钢)

Matching Algorithm for Chinese Place Names by Similarity with Semantics of General Names for Places into Consideration

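Abstract (translated from the Chinese):

Place-name matching is a key technical problem in geographic information retrieval and in the integration and updating of multi-source geospatial data. Based on the word-formation characteristics of standardized Chinese place names and on the relationship between the general name of a place (通名, the generic part of the name) and the place-name type, this paper builds a semantic knowledge base of standardized general names and uses the place-name semantics it provides as an important indicator for similarity matching. To address the shortcomings of matching methods based on literal strings or on spatial data, a composite similarity matching model is proposed for standardized place names. It combines two factors: the literal similarity of the special names (专名, the specific part of the name) and the semantic similarity of the general names. The model mimics human cognitive habits: according to the degree of semantic similarity between the general names, it sets the weights of the special-name and general-name similarities dynamically through a monotonic function, and it obtains the composite place-name similarity by dynamic weighting. On the basis of this model, a matching strategy and workflow for Chinese place names are proposed. Exploiting the semantics carried by general names strengthens the theoretical basis and completeness of Chinese place-name matching and improves matching accuracy. Experimental results show that the model is consistent with cognitive habits and verify the rationality and effectiveness of the method.

Keywords: general name, semantics, ontology, composite similarity, place-name matching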

Abstract:

Place-name matching is a key issue in geographic information retrieval and in the integration and updating of multi-source geospatial data. Based on the compositional characteristics of Chinese place names and the relationship between general names for places and place types, an ontology knowledge base of general names is established, and the semantics it provides are used as an important indicator for similarity-based place-name matching. To overcome the shortcomings of matching by literal strings or by geospatial data, we propose a new matching algorithm and query strategy for Chinese place names that takes into account both the similarity of special names and the similarity of general names. The method simulates human cognitive habits: the weights of the special-name and general-name similarities are set dynamically, through a monotonic function, according to the degree of semantic similarity. The final composite similarity index for a pair of place names is the weighted average of the special-name and general-name similarities. Based on this model, a matching strategy and workflow are put forward. Using the semantic knowledge carried by general names strengthens the theoretical basis and completeness of the matching algorithm for Chinese place names and thereby improves its accuracy. The experimental results show that the matching model is consistent with human cognitive habits and demonstrate the rationality and effectiveness of the method.

Key words: general names for places, semantics, ontology, composite index, matching of place names
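
The composite similarity described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the knowledge base, the similarity measures, or the weighting function, so everything concrete here is an assumption made for illustration: the tiny `GENERAL_NAME_TYPES` table, the suffix-based split, the use of `difflib.SequenceMatcher` for literal similarity, the shared-path measure for semantic similarity, and the linear weight function with bounds `w_min` and `w_max`. The direction of the weighting (the special name counts for more when the general names are semantically close) is also an assumption; the abstract only states that the relationship is monotonic.

```python
from difflib import SequenceMatcher

# Hypothetical miniature knowledge base: general name -> type path (root first).
GENERAL_NAME_TYPES = {
    "河": ("natural", "hydrography", "river"),
    "江": ("natural", "hydrography", "river"),
    "湖": ("natural", "hydrography", "lake"),
    "山": ("natural", "landform", "mountain"),
    "市": ("administrative", "city"),
    "县": ("administrative", "county"),
}


def split_place_name(name):
    """Split a name into (special name, general name) by longest known suffix."""
    for suffix in sorted(GENERAL_NAME_TYPES, key=len, reverse=True):
        if name.endswith(suffix) and len(name) > len(suffix):
            return name[: -len(suffix)], suffix
    return name, ""  # no recognised general name


def general_similarity(g1, g2):
    """Assumed semantic measure: shared leading path length over the longer path."""
    if not g1 or not g2:
        return 0.0  # no semantic evidence without a recognised general name
    p1 = GENERAL_NAME_TYPES[g1] + (g1,)  # the term itself is the leaf
    p2 = GENERAL_NAME_TYPES[g2] + (g2,)
    shared = 0
    for a, b in zip(p1, p2):
        if a != b:
            break
        shared += 1
    return shared / max(len(p1), len(p2))


def special_similarity(s1, s2):
    """Assumed literal measure: difflib's matching-block ratio in [0, 1]."""
    return SequenceMatcher(None, s1, s2).ratio()


def special_weight(sem, w_min=0.2, w_max=0.8):
    """Assumed monotonic (linear, increasing) weight of the special name."""
    return w_min + (w_max - w_min) * sem


def composite_similarity(name1, name2):
    """Dynamically weighted average of special-name and general-name similarity."""
    s1, g1 = split_place_name(name1)
    s2, g2 = split_place_name(name2)
    sem = general_similarity(g1, g2)
    lit = special_similarity(s1, s2)
    w_special = special_weight(sem)
    return w_special * lit + (1.0 - w_special) * sem


if __name__ == "__main__":
    for pair in [("黄河", "黄河"), ("长江", "长河"), ("黄河", "黄山")]:
        print(pair, round(composite_similarity(*pair), 4))
```

Under these assumptions, two names with the same special name but unrelated general names (a river and a mountain) score well below two names whose general names denote the same feature type, which is the behaviour the abstract attributes to human judgment. A real implementation would replace the lookup table with the ontology knowledge base and calibrate the weight function against labelled matches.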