Construction of an English-Uyghur WordNet Dataset

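Source: The 18th China National Conference on Computational Linguistics and the 2019 Annual Academic Conference of the Chinese Information Processing Society of China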

【Authors】: Kahaerjiang Abiderexiti, Zhiyuan Liu, Maosong Sun

【Affiliation】: Department of Computer Science and Technology, Institute for Artificial Intelligence, State Key Lab on

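【Source】: The 18th China National Conference on Computational Linguistics and the 2019 Annual Academic Conference of the Chinese Information Processing Society of China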

【Publication date】: 2019, Issue 8

【Keywords】: Uyghur; WordNet; Dataset; Synset mapping


Excerpt from the paper

Automatically building semantic resources is essential for low-resource languages like Uyghur. However, Uyghur lacks a publicly available evaluation dataset for automatically building semantic resources such as WordNet. To address this problem, we first build the largest Uyghur-English and English-Uyghur dictionaries by exploiting many possible online and offline resources. Then, using Princeton WordNet (PWN) 3.0 and the Contemporary Uyghur Detailed Dictionary (CUDD), we construct an English-Uyghur WordNet evaluation dataset, which is publicly available (https://github.com/kaharjan/uywordnet). In this dataset, more than 73,000 English synsets are mapped to Uyghur automatically, of which over 20,000 are annotated manually. The corresponding Uyghur words include definitions and examples in Uyghur language context. We also propose a Synset Mapping based on Word Embeddings (SMWE) method. The experimental results on the dataset are promising.
