Based on an analysis of rule-based and statistical language models used in speech recognition systems, this paper proposes a composite language model in which the rules are quantified. The model avoids the main weakness of rule-based methods, namely that they cannot cope with large-scale real-world text, and it also improves the statistical model's ability to handle long-distance constraints and linguistic recursion. In a speaker-independent isolated-word speech recognition system covering 60,000 lexical entries, the composite language model improved accuracy by 4.9% (male voice) and 3.5% (female voice) compared with using the word trigram model alone.