Language Model for Mongolian Polyphone Proofreading

Source: 16th China National Conference on Computational Linguistics and 5th International Symposium on Natural Language Processing Based on Naturally Annotated Big Data

Abstract: Mongolian text proofreading is a particularly difficult task because of the language's unique polyphonic alphabet, morphological ambiguity, and agglutinative nature, and coding errors are currently pervasive in

Authors: Min Lu, Feilong Bao, Guanglai Gao

Affiliation: College of Computer Science, Inner Mongolia University, Hohhot 010021, China

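Venue: 16th China National Conference on Computational Linguistics and 5th International Symposium on Natural Language Processing Based on Naturally Annotated Big Data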

Publication date: 2017, Issue 7

Keywords: Mongolian; Polyphone; Automatic Proofreading System; Morphological Ambiguity


Excerpt from the paper

Mongolian text proofreading is a particularly difficult task because of the language's unique polyphonic alphabet, morphological ambiguity, and agglutinative nature. Coding errors are currently pervasive in electronic Mongolian corpora, which makes statistical and retrieval research on Mongolian very difficult to carry out. Some conventional approaches have been proposed to solve this problem, but they are limited because they do not consider the proofreading of polyphones. In this paper, we address the problem by constructing a large-scale resource and applying an n-gram language model based approach. For ease of understanding, the entire proofreading system architecture is also introduced, since polyphone proofreading is an important component of it. Experimental results show that our method performs well: polyphone correction accuracy improves by a relative 62%, and overall system accuracy improves by a relative 16.1%.
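The n-gram idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the excerpt does not state the n-gram order, the smoothing method, or how candidates are generated, so the bigram model, add-one smoothing, greedy decoding, the `candidates` table, and the placeholder Latin tokens below are all assumptions made purely for illustration.

```python
import math
from collections import Counter


class BigramLM:
    """Bigram language model with add-one smoothing (an assumed choice)."""

    def __init__(self, sentences):
        self.unigrams = Counter()
        self.bigrams = Counter()
        for sent in sentences:
            tokens = ["<s>"] + list(sent) + ["</s>"]
            self.unigrams.update(tokens)
            self.bigrams.update(zip(tokens, tokens[1:]))
        self.vocab_size = len(self.unigrams)

    def log_prob(self, prev, word):
        # Add-one smoothing keeps unseen bigrams from scoring -infinity.
        num = self.bigrams[(prev, word)] + 1
        den = self.unigrams[prev] + self.vocab_size
        return math.log(num / den)


def correct_polyphones(tokens, candidates, lm):
    """Greedy left-to-right selection among same-shape candidates.

    `candidates` is a hypothetical table mapping a token to every encoding
    that shares its written shape. Each ambiguous token is replaced by the
    candidate that best fits its left neighbour (already corrected) and its
    right neighbour (still uncorrected, a simplification of this sketch).
    """
    out = []
    prev = "<s>"
    for i, tok in enumerate(tokens):
        options = candidates.get(tok, [tok])
        nxt = tokens[i + 1] if i + 1 < len(tokens) else "</s>"
        best = max(
            options,
            key=lambda c: lm.log_prob(prev, c) + lm.log_prob(c, nxt),
        )
        out.append(best)
        prev = best
    return out


# Placeholder tokens, not real Mongolian: "xa" and "xo" stand in for two
# encodings that render with the same written shape.
corpus = [["a", "xa", "b"], ["a", "xa", "b"], ["c", "xo", "d"]]
table = {"xa": ["xa", "xo"], "xo": ["xa", "xo"]}
model = BigramLM(corpus)
print(correct_polyphones(["a", "xo", "b"], table, model))
```

With this toy corpus the miscoded "xo" between "a" and "b" is replaced by "xa", because that is the form the context supports. A fuller system would likely replace the greedy pass with a Viterbi search so that adjacent ambiguous words are resolved jointly.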

Other papers

DIM Reader: Dual Interaction Model for Machine Comprehension

Enabling a computer to understand a document so that it can answer comprehension questions is a central, yet unsolved, goal of Natural Language Processing, so reading comprehension of text is an important

Conference

machine comprehension; bi-directional attention; dual interaction model; Cloze-sty

Generating Textual Entailment Using Residual LSTMs

Generating textual entailment (GTE) is a recently proposed task to study how to infer a sentence from a given premise. Current sequence-to-sequence GTE models are prone to produce invalid sentences when

Conference

Generating Textual Entailment; Natural Language Generation; Natural Language Proc

Multi-view LSTM Language Model with Word-synchronized Auxiliary Feature for LVCSR

Recently, the long short-term memory language model (LSTMLM) has received tremendous interest from both the language and speech communities, due to its superiority in modelling long-term dependencies. Moreover, integ

Conference

LSTM language model; speech recognition; multi-view; auxiliary feature; tagging mod

Tibetan Syllable-based Functional Chunk Boundary Identification

Tibetan syntactic functional chunk parsing aims to identify the syntactic constituents of Tibetan sentences. In this paper, based on the Tibetan syntactic functional chunk description system, we propos

Conference

Tibetan Syntactic Functional Chunk; Chunk Boundary Recognition; Syllable; Syntacti

Unsupervised Joint Entity Linking over Question Answering Pair with Global Knowledge

We consider the task of entity linking over question answering pairs (QA-pairs). In conventional approaches to entity linking, all entities, whether in the same sentence or not, are treated the same. We foc

Conference

joint entity linking; question answering pair; global knowledge; integral linear p

Harvest Uyghur-Chinese Aligned-Sentences Bitexts from Multilingual Sites Based on Word Embedding

Obtaining bilingual parallel data from multilingual websites is a long-standing research problem, and such data is very beneficial for resource-scarce languages. In this paper, we present an approach for obtain

Conference

bilingual parallel data; word embedding; resource-scarce languages

Closed-Set Chinese Word Segmentation Based on Convolutional Neural Network Model

This paper proposes a neural model for closed-set Chinese word segmentation. The model follows the character-based approach, which assigns a class label to each character, indicating its relative positi

Conference

Chinese word segmentation; Deep learning; Convolutional neural networks

Improving Event Detection via Information Sharing among Related Event Types

Event detection suffers from data sparseness and label imbalance problems due to the high cost of manually annotating events. To address this problem, we propose a novel approach that allows for

Conference

Hierarchical Gated Recurrent Neural Tensor Network for Answer Triggering

In this paper, we focus on the problem of answer triggering addressed by Yang et al. (2015), which is a critical component of a real-world question answering system. We employ a hierarchical gated recur

Conference

Answer Triggering; Question Answering; Hierarchical gated recurrent neural tensor

Joint Extraction of Multiple Relations and Entities by using a Hybrid Neural Network

This paper proposes a novel end-to-end neural model to jointly extract entities and relations in a sentence. Unlike most existing approaches, the proposed model uses a hybrid neural network to automati

Conference

Information Extraction; Neural Networks
