To exploit the properties of human auditory perception and thereby improve language identification performance, this paper proposes a feature based on an auditory perception model. The auditory perception feature uses a Gammatone filter bank, in place of the commonly used triangular filter bank, to compute the energy of each sub-band of the speech signal; the center frequency and bandwidth of each filter are determined from the equivalent rectangular bandwidth (ERB) model; and an inverted equal-loudness curve is used to model the ear's subjective loudness response to the different frequency components of the signal. On top of the basic auditory perception feature, first- and second-order difference features and shifted delta features are also proposed for language identification. Comparative experiments show that the proposed auditory perception features all outperform the widely used Mel-frequency cepstral coefficient (MFCC) features and their derived features.
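As a rough illustration of two steps named in the abstract, the sketch below places Gammatone center frequencies uniformly on the ERB-rate scale and computes shifted delta features from a frame-by-coefficient matrix. It is a minimal sketch under stated assumptions, not the authors' implementation: the ERB constants are the commonly cited Glasberg and Moore values, while the function names, the frequency range, the number of filters, the shifted delta parameters (d, P, k), and the edge handling are illustrative choices that the abstract does not specify.

```python
import numpy as np


def hz_to_erb_rate(f_hz):
    """Map frequency in Hz to the ERB-rate scale (Glasberg-Moore constants, assumed)."""
    return 21.4 * np.log10(1.0 + 0.00437 * np.asarray(f_hz, dtype=float))


def erb_rate_to_hz(e):
    """Inverse of hz_to_erb_rate."""
    return (10.0 ** (np.asarray(e, dtype=float) / 21.4) - 1.0) / 0.00437


def erb_bandwidth(f_hz):
    """Equivalent rectangular bandwidth in Hz at center frequency f_hz."""
    return 24.7 * (0.00437 * np.asarray(f_hz, dtype=float) + 1.0)


def gammatone_center_freqs(n_filters, f_min, f_max):
    """Center frequencies spaced uniformly on the ERB-rate scale (hypothetical helper)."""
    e = np.linspace(hz_to_erb_rate(f_min), hz_to_erb_rate(f_max), n_filters)
    return erb_rate_to_hz(e)


def shifted_delta(feats, d=1, P=3, k=7):
    """Stack k delta blocks per frame; block i at frame t is c[t+i*P+d] - c[t+i*P-d].

    feats has shape (T, N); the result has shape (T, N*k).
    Indices past either end are clamped to the first/last frame (an assumption).
    """
    feats = np.asarray(feats, dtype=float)
    T = feats.shape[0]
    t = np.arange(T)
    blocks = []
    for i in range(k):
        plus = np.clip(t + i * P + d, 0, T - 1)
        minus = np.clip(t + i * P - d, 0, T - 1)
        blocks.append(feats[plus] - feats[minus])
    return np.hstack(blocks)


# Example: 32 filters between 50 Hz and 4 kHz (illustrative values only).
fc = gammatone_center_freqs(32, 50.0, 4000.0)
bw = erb_bandwidth(fc)
```

The bandwidths grow with center frequency, so low-frequency filters are narrow and high-frequency filters are wide. The shifted delta function can be applied to any per-frame feature matrix, for example cepstral coefficients derived from the Gammatone sub-band energies.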