Hierarchical Approximate Matching for Retrieval of Chinese Historical Calligraphy Character

Source: Journal of Computer Science and Technology (English edition)
As historical Chinese calligraphy works are being digitized, their retrieval becomes a new challenge. Currently, however, no OCR technique can convert calligraphy character images into text, nor do existing handwriting character recognition approaches work for them. This paper proposes a novel approach to efficiently retrieving Chinese calligraphy characters on the basis of similarity: each calligraphy character image is represented by a collection of discriminative features, and high retrieval speed with reasonable effectiveness is achieved. First, calligraphy characters that cannot be similar to the query are filtered out step by step by comparing character complexity, stroke density, and stroke protrusion. Then, the remaining similar characters are retrieved and ranked according to the matching cost produced by approximate shape matching. To speed up retrieval, we employ a high-dimensional data structure, the PK-tree. Finally, the efficiency of the algorithm is demonstrated by a preliminary experiment with 3012 calligraphy character images.
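The filter-then-rank idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not define its features, so `complexity` (fraction of ink pixels) and `stroke_density` (mean ink runs per row) are hypothetical stand-ins, stroke protrusion is omitted, `matching_cost` is a plain pixel-difference count rather than the approximate shape matching used in the paper, and a linear scan replaces the PK-tree index. The function names and the tolerance parameter `tol` are likewise assumptions made for illustration.

```python
from typing import List, Tuple

Image = List[List[int]]  # binary image: 1 = ink, 0 = background


def complexity(img: Image) -> float:
    """Hypothetical complexity measure: fraction of ink pixels."""
    total = sum(len(row) for row in img)
    return sum(map(sum, img)) / total


def stroke_density(img: Image) -> float:
    """Hypothetical stroke density: mean number of ink runs per row."""
    runs = 0
    for row in img:
        prev = 0
        for px in row:
            if px and not prev:  # a new run starts at a 0 -> 1 transition
                runs += 1
            prev = px
    return runs / len(img)


def matching_cost(a: Image, b: Image) -> int:
    """Stand-in cost: number of differing pixels (equal-size images assumed)."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def retrieve(query: Image,
             database: List[Tuple[str, Image]],
             tol: float = 0.3) -> List[Tuple[str, int]]:
    """Filter candidates feature by feature, then rank survivors by cost."""
    candidates = list(database)
    for feature in (complexity, stroke_density):  # cheap filters first
        q = feature(query)
        limit = tol * max(q, 1e-9)  # relative tolerance around the query value
        candidates = [(name, img) for name, img in candidates
                      if abs(feature(img) - q) <= limit]
    ranked = sorted((matching_cost(query, img), name) for name, img in candidates)
    return [(name, cost) for cost, name in ranked]
```

The cheap scalar features are evaluated first so that the more expensive pairwise cost is computed only for the candidates that survive every filter, which is the point of the hierarchical design.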