Based on the similarity in pronunciation between Tibetan and Chinese, this paper proposes a Chinese-Tibetan bilingual speech synthesis method based on the hidden Markov model (HMM). Initials and finals serve as the synthesis units. Using corpora from multiple Mandarin speakers and one Tibetan speaker, speaker adaptive training is applied to obtain an average voice model for the mixed Chinese-Tibetan bilingual data. A speaker adaptation transformation then derives a speaker-dependent model for Mandarin or Tibetan from this mixed-language average voice model, and Mandarin or Tibetan speech is synthesized from the resulting model. Experimental results show that, when few Tibetan training sentences are available, the Tibetan speech synthesized by this method is clearly better than Tibetan speech synthesized using only a speaker-dependent model.
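The adaptation step can be illustrated with a toy sketch. The abstract does not say which adaptation transform is used, so everything below is an assumption made for illustration: the function names `estimate_affine_transform` and `adapt` are hypothetical, the state names and numbers are invented, and a simple per-dimension least-squares affine map stands in for whatever transform the paper actually applies. The point it tries to show is why a shared transform helps when target data is scarce: a mapping estimated from the few units seen in the adaptation data can also move the average-voice means of units that were never seen.

```python
from statistics import fmean


def estimate_affine_transform(avg_means, target_means):
    """Fit per-dimension (a, b) so that target ~= a * avg + b by least squares.

    Both arguments map a state name to a list of mean values. Only states
    present in both dictionaries are used for estimation.
    """
    shared = sorted(set(avg_means) & set(target_means))
    if len(shared) < 2:
        raise ValueError("need at least two shared states to fit a and b")
    dim = len(avg_means[shared[0]])
    scales, offsets = [], []
    for d in range(dim):
        xs = [avg_means[s][d] for s in shared]
        ys = [target_means[s][d] for s in shared]
        x_bar, y_bar = fmean(xs), fmean(ys)
        var = sum((x - x_bar) ** 2 for x in xs)
        cov = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
        a = cov / var if var > 0 else 1.0  # fall back to a pure shift
        scales.append(a)
        offsets.append(y_bar - a * x_bar)
    return scales, offsets


def adapt(avg_means, scales, offsets):
    """Apply the shared transform to every average-voice state mean."""
    return {
        state: [a * m + b for m, a, b in zip(mu, scales, offsets)]
        for state, mu in avg_means.items()
    }


# Invented two-dimensional means for three states of an average voice model.
avg = {"a": [1.0, 0.0], "i": [3.0, 2.0], "u": [5.0, 4.0]}
# The target speaker's adaptation data covers only two of the three states.
target = {"a": [3.0, -1.0], "i": [7.0, 0.0]}

scales, offsets = estimate_affine_transform(avg, target)
adapted = adapt(avg, scales, offsets)
print(adapted["u"])  # the unseen state "u" is moved by the shared transform
```

In this sketch the state "u" has no target-speaker data, yet it still receives adapted means because the transform is shared across states. A real system would estimate the transform from acoustic frames and HMM state statistics rather than from pre-computed per-state means.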