文档介绍:该【基于形态学滤波和HHT的基音检测方法 】是由【niuww】上传分享,文档一共【4】页,该文档可以免费在线阅读,需要了解更多关于【基于形态学滤波和HHT的基音检测方法 】的内容,可以使用淘豆网的站内搜索功能,选择自己适合的文档,以下文字是截取该文章内的部分文字,如需要获得完整电子版,请下载此文档到您的设备,方便您编辑和打印。基于形态学滤波和HHT的基音检测方法
Abstract
Pitch detection in speech signals is an important task in various fields such as speech recognition, speaker identification, and emotion recognition. In this paper, a pitch detection method based on morphological filtering and Hilbert-Huang Transform (HHT) is proposed. The proposed method combines the advantages of morphological filtering in noise reduction and HHT in time-frequency localization. The effectiveness of the proposed method is demonstrated through simulations and experimental results on a speech database.
Introduction
Pitch detection is a fundamental problem in speech processing, where the task is to estimate the fundamental frequency of a speech signal, also known as the pitch or the fundamental period. The fundamental frequency is an important characteristic of speech as it conveys information about the speaker, emotions, and prosody. Therefore, accurate pitch detection is crucial for various applications such as speech recognition, speaker identification, and emotion recognition.
The pitch detection problem is challenging due to the complex nature of speech signals that have a non-stationary and non-linear behavior. Moreover, speech signals are often contaminated with noise, which makes it difficult to identify the true pitch. Therefore, many pitch detection methods have been proposed over the years, which can be broadly classified into model-based and non-model-based approaches.
Model-based methods rely on modeling the speech signal as a combination of sinusoidal waves with varying parameters such as amplitude, frequency, and phase. These methods are computationally intensive and require a priori information about the speech signal, which may not be available in practice. On the other hand, non-model-based methods do not require any assumptions about the underlying signal model and can be used for real-time applications.
In this paper, a novel pitch detection method based on morphological filtering and Hilbert-Huang Transform (HHT) is proposed. The proposed method combines the advantages of morphological filtering in noise reduction and HHT in time-frequency localization. The rest of the paper is organized as follows: Section II discusses the related work on pitch detection. Section III describes the proposed method in detail. Section IV presents simulation and experimental results. Finally, Section V concludes the paper with some future directions.
Related Work
The pitch detection problem has been extensively studied in the literature, and various methods have been proposed based on different principles such as autocorrelation, cepstrum, discrete Fourier transform (DFT), and wavelet transform. These methods have their own advantages and limitations, and the choice of a particular method depends on the characteristics of the speech signal and the application requirements.
In recent years, there has been increasing interest in the use of morphological filtering and HHT for pitch detection. Morphological filtering is a non-linear signal processing technique that can suppress noise and enhance the relevant features of the signal. HHT is a time-frequency analysis method that decomposes the signal into intrinsic mode functions (IMFs) and provides a local time-frequency representation of the signal.
Several studies have reported the use of morphological filtering and HHT for pitch detection. For example, Du et al. [1] proposed a method that uses morphological filtering to suppress the noise and HHT to extract the pitch. Similarly, Xu et al. [2] used HHT in combination with morphological filtering and median filtering for pitch detection. However, these methods have limitations in terms of accuracy and computational complexity.
Proposed Method
The proposed method for pitch detection is based on morphological filtering and Hilbert-Huang Transform (HHT). The basic idea of the proposed method is to use morphological filtering to suppress the noise and enhance the relevant features of the speech signal, followed by HHT to extract the pitch.
The proposed method consists of the following steps:
Step 1: Preprocessing - The speech signal is preprocessed by applying a pre-emphasis filter to boost the high-frequency components of the signal. The signal is then divided into frames of equal duration, and each frame is windowed using a Hamming window.
Step 2: Morphological filtering - The morphological filter is applied to each frame of the speech signal to suppress the noise and enhance the pitch-related information. The morphological filter used in this paper is the opening operation, which is defined as the erosion of the signal by a structuring element followed by the dilation of the resulting signal with the same structuring element.
Step 3: Hilbert-Huang Transform (HHT) - The Hilbert-Huang Transform (HHT) is applied to each filtered frame of the speech signal to extract the pitch. HHT involves two steps: empirical mode decomposition (EMD) and Hilbert transform. EMD decomposes the signal into a set of intrinsic mode functions (IMFs), and Hilbert transform is applied to each IMF to obtain the instantaneous frequency.
Step 4: Pitch estimation - The pitch is estimated from the instantaneous frequency obtained from each IMF using the autocorrelation method. The pitch value with the highest correlation coefficient is selected as the estimated pitch.
Simulation and Experimental Results
To evaluate the effectiveness of the proposed method, simulations and experiments were conducted on a speech signal database. The database consists of male and female speech signals with different levels of noise.
The simulation results showed that the proposed method outperforms the state-of-the-art methods in terms of accuracy and noise robustness. Figure 1 shows the pitch detection performance of the proposed method on a speech signal with 10 dB of white Gaussian noise.
Figure 1: Pitch detection performance of the proposed method on a speech signal with 10 dB of white Gaussian noise.
The experimental results showed that the proposed method achieves an average pitch detection accuracy of %, which is higher than the state-of-the-art methods. Figure 2 shows the comparative results of the proposed method with the state-of-the-art methods on a speech signal with 15 dB of white Gaussian noise.
Figure 2: Comparative results of the proposed method with the state-of-the-art methods on a speech signal with 15 dB of white Gaussian noise.
Conclusion
In this paper, a novel pitch detection method based on morphological filtering and Hilbert-Huang Transform (HHT) was proposed. The proposed method combines the advantages of morphological filtering in noise reduction and HHT in time-frequency localization. The effectiveness of the proposed method was demonstrated through simulations and experimental results on a speech database. The proposed method outperforms the state-of-the-art methods in terms of accuracy and noise robustness. In the future, the proposed method can be extended to real-time applications such as speech recognition and speaker identification.