Spectrum warping based on sub-glottal resonances in speaker-independent speech recognition

Source: Chinese Journal of Acoustics
To reduce the degradation in speech recognition caused by the varied characteristics of different speakers, a method of perceptual frequency warping based on subglottal resonances is proposed for speaker normalization. The warping factor is extracted from the second subglottal resonance, using the acoustic coupling between the subglottis and the vocal tract. Because the second subglottal resonance is independent of speech content, it reflects speaker characteristics better than the third formant does. The normalization is applied to perceptual minimum variance distortionless response (PMVDR) coefficients, which are more robust and more resistant to noise than MFCC. The normalized coefficients are used in both speech model training and speech recognition. Experiments show that, compared with MFCC and with spectrum warping by the third formant, the word error rate decreases by 4% and 3% respectively in clean speech recognition, and by 9% and 5% respectively in a noisy environment. The results indicate that the proposed method can improve word recognition accuracy in a speaker-independent recognition system.
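The abstract does not give the warping formula, so the following is only a minimal sketch of the general idea, under stated assumptions: the warping factor is taken to be the ratio of a reference second subglottal resonance to the speaker's, and the warp is a simple linear rescaling of the frequency axis. The function names, the linear warp, and the resonance values are all hypothetical illustrations, not the authors' method, and the PMVDR feature extraction itself is not shown.

```python
import numpy as np


def warping_factor(sg2_speaker_hz, sg2_reference_hz):
    """Assumed definition: ratio of the reference Sg2 to the speaker's Sg2."""
    if sg2_speaker_hz <= 0 or sg2_reference_hz <= 0:
        raise ValueError("resonance frequencies must be positive")
    return sg2_reference_hz / sg2_speaker_hz


def warp_spectrum(spectrum, alpha):
    """Linearly warp the frequency axis of a one-sided spectrum.

    A component at bin k in the input moves to bin alpha * k in the
    output, i.e. output[k] = input[k / alpha], using linear interpolation.
    Bins that map beyond the last input bin take the last input value,
    which is how np.interp handles out-of-range points.
    """
    spectrum = np.asarray(spectrum, dtype=float)
    bins = np.arange(spectrum.shape[0], dtype=float)
    return np.interp(bins / alpha, bins, spectrum)


if __name__ == "__main__":
    # Illustrative values only: a speaker whose Sg2 lies above the reference.
    alpha = warping_factor(sg2_speaker_hz=1500.0, sg2_reference_hz=1400.0)
    spectrum = np.zeros(257)
    spectrum[150] = 1.0  # a single spectral peak
    warped = warp_spectrum(spectrum, alpha)
    print(f"alpha = {alpha:.3f}, peak moved to bin {int(np.argmax(warped))}")
```

With a factor below 1 the spectrum is compressed toward lower frequencies, and with a factor above 1 it is stretched. In a full system the same warp would be applied in both training and recognition, as the abstract describes for the normalized coefficients.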