Taking a corpus of Uyghur-language primary school language textbooks as the validation object, this paper uses the Modern Uyghur Stem Part-of-Speech Tag Set (《现代维吾尔语词干词类标注标记集》), which was designed from a combined grammatical and semantic perspective, to tag the parts of speech of the word stems in these textbooks, and thereby verifies the feasibility, adaptability, and reliability of the tag set specification. First, the electronic corpus of primary school language textbooks is introduced. Second, the paper discusses the basic features of the Modern Uyghur Stem Part-of-Speech Tag Set for Information Processing (《信息处理用现代维吾尔语词干词类标注标记集》), together with the model design and algorithm of a multi-strategy modern Uyghur stem tagging system. Finally, it analyzes the results of validating the modern Uyghur part-of-speech tag set, confirms the scientific soundness of the Modern Uyghur Stem Part-of-Speech Tag Set for Information Processing, supplements and corrects the semantic classification and tag codes of some word classes, and proposes extensions to the specification.