To improve the accuracy of speech endpoint detection in strong-noise environments, a speech endpoint detection algorithm based on cross-entropy order statistics filtering (OSF) is proposed. The algorithm uses sub-band cross-entropy as the feature that distinguishes speech from non-speech. First, the spectrum of each speech frame is divided into several sub-bands, and the cross-entropy between the energy of each sub-band and the background noise is estimated. Next, the sub-band energy cross-entropies of several consecutive frames are passed through a bank of order statistics filters. Finally, the input speech is classified according to the cross-entropy value of each frame. Experimental results show that the algorithm effectively distinguishes speech from non-speech and, in particular, maintains a high detection rate in strong-noise environments, which indicates its robustness. In the experimental comparison, the algorithm outperforms two recently proposed methods: one based on energy order statistics filtering and one based on cross-entropy discrimination alone.
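The pipeline described in the abstract (sub-band split, per-band cross-entropy against the noise, order statistics filtering across consecutive frames, per-frame decision) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation. The abstract gives no formulas, so the per-band measure used here (a relative-entropy term `p * log(p / q)` on normalized sub-band energies), the choice of the median as the order statistic, the equal-width sub-bands, and every function name, parameter, and threshold value are hypothetical.

```python
import math


def subband_energies(power_spectrum, num_bands):
    """Sum power-spectrum bins into num_bands contiguous, roughly equal sub-bands."""
    n = len(power_spectrum)
    edges = [round(i * n / num_bands) for i in range(num_bands + 1)]
    return [sum(power_spectrum[edges[k]:edges[k + 1]]) for k in range(num_bands)]


def band_divergence(energies, noise_energies, eps=1e-12):
    """Per-band terms p_k * log(p_k / q_k) on normalized sub-band energies.

    ASSUMPTION: stands in for the paper's sub-band cross-entropy, whose exact
    definition is not given in the abstract.
    """
    p_total = sum(energies) + eps
    q_total = sum(noise_energies) + eps
    terms = []
    for e, n in zip(energies, noise_energies):
        p = e / p_total + eps
        q = n / q_total + eps
        terms.append(p * math.log(p / q))
    return terms


def order_statistic_filter(track, half_width, quantile=0.5):
    """Replace each sample by an order statistic of its neighbouring frames.

    quantile=0.5 selects the median (ASSUMPTION: the paper may use another rank).
    Windows are truncated at the start and end of the track.
    """
    out = []
    for t in range(len(track)):
        window = sorted(track[max(0, t - half_width):t + half_width + 1])
        out.append(window[int(quantile * (len(window) - 1))])
    return out


def detect_speech(frames, noise_spectrum, num_bands=4, half_width=1, threshold=0.1):
    """Return one True (speech) / False (non-speech) decision per frame.

    frames: list of per-frame power spectra; noise_spectrum: a noise estimate.
    All default values are hypothetical, chosen only for the demonstration.
    """
    noise = subband_energies(noise_spectrum, num_bands)
    per_frame = [band_divergence(subband_energies(f, num_bands), noise) for f in frames]
    # Filter each sub-band's track over time, then sum the bands for each frame.
    filtered = [
        order_statistic_filter([row[k] for row in per_frame], half_width)
        for k in range(num_bands)
    ]
    scores = [sum(filtered[k][t] for k in range(num_bands)) for t in range(len(frames))]
    return [score > threshold for score in scores]
```

With `half_width=1` and the median, a single speech-like frame surrounded by noise frames is suppressed, while a run of several consecutive speech frames survives the filter. That smoothing across frames is the property the order statistics filtering stage is meant to provide.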