A high-quality speech recognition or speech synthesis system needs the support of a high-quality continuous speech database that is scientifically sound, compact, and effective, and whose design is guided by phonetic and linguistic knowledge. At the present stage, a Chinese speech database should be restricted to the segmental aspects of read speech. To describe the sound changes that occur in running speech, the following phonetic units are considered: (1) syllables without tone (401); (2) inter-syllabic diphones (415); (3) inter-syllabic triphones (3035), derived by rule from 37 basic phones using research results on formant transitions between syllables; (4) the final-initial (韵母-声母) structures of all inter-syllabic transition segments, merged by the same method as the triphones (781 in total). To increase the variety of prosodic structures, and with the post-processing stage of speech recognition systems in mind, the corpus also includes the 17 basic sentence types of Chinese. The raw material was taken from the years 1993 and 1994 of People's Daily (《人民日报》) and 《百家报刊精选》 (Selected Newspapers and Periodicals), together with several television scripts and dictionary word lists. From this material, 2185 sentences and 388 phrases were selected as the reading corpus; they cover 99.8% of the toneless syllables, 100% of the diphones, 99.6% of the triphones, and the 17 sentence types.
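The abstract reports coverage figures but does not say how the sentences were chosen. As an illustration only, the sketch below shows how coverage of a unit inventory can be measured, and how a greedy set-cover heuristic (an assumption here, not the authors' stated method) might pick a small set of sentences. The function names, sentence IDs, and unit labels are all hypothetical toy values.

```python
from typing import Dict, List, Set


def coverage(selected: List[str],
             units_of: Dict[str, Set[str]],
             inventory: Set[str]) -> float:
    """Return the fraction of the target inventory covered by the selected sentences."""
    covered: Set[str] = set()
    for sentence_id in selected:
        covered |= units_of[sentence_id] & inventory
    return len(covered) / len(inventory)


def greedy_select(units_of: Dict[str, Set[str]],
                  inventory: Set[str],
                  max_sentences: int) -> List[str]:
    """Greedy set cover (assumed heuristic): repeatedly take the sentence
    that adds the most units not yet covered."""
    remaining = set(inventory)
    candidates = set(units_of)
    selected: List[str] = []
    while remaining and candidates and len(selected) < max_sentences:
        # sorted() makes tie-breaking deterministic: the first ID in sort order wins.
        best = max(sorted(candidates), key=lambda s: len(units_of[s] & remaining))
        gain = units_of[best] & remaining
        if not gain:
            break  # no remaining sentence contributes a new unit
        selected.append(best)
        remaining -= gain
        candidates.remove(best)
    return selected


# Toy data: hypothetical sentence IDs mapped to the units (e.g. diphones) they contain.
toy_units = {
    "s1": {"u1", "u2", "u3"},
    "s2": {"u3", "u4"},
    "s3": {"u1"},
    "s4": {"u5"},
}
toy_inventory = {"u1", "u2", "u3", "u4", "u5", "u6"}

chosen = greedy_select(toy_units, toy_inventory, max_sentences=10)
print(chosen)                                                  # ['s1', 's2', 's4']
print(round(coverage(chosen, toy_units, toy_inventory), 3))    # 0.833
```

In the toy run, unit `u6` occurs in no candidate sentence, so coverage stops short of 100%. Coverage just under 100%, like the 99.8% and 99.6% reported above, can arise in this way, although the abstract does not say why its figures fall short.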