Denoising and Enhancement of Speech Signals Using Wavelets

Speech enhancement aims to improve the quality and intelligibility of speech using various techniques and algorithms. In practice, a speech signal is almost always accompanied by background noise, so speech and communication processing systems must apply effective noise reduction techniques to extract the desired speech from the corrupted signal. In this project we study wavelets and the wavelet transform, and the possibility of using them to process and analyze the speech signal in order to enhance it and remove noise from it. We present different algorithms based on the wavelet transform and the mechanism for applying them to remove noise from speech, and we compare their results with those of some traditional speech enhancement algorithms. The basic principles of the wavelet transform are presented as an alternative to the Fourier transform and to the short-time (windowed) Fourier transform. The practical results are based on processing a large database of speech signals corrupted by various noises at many SNRs. This article is intended as an extension of practical research on improving speech signals for hearing aid purposes. We also examine the main frequencies of letters and their uses in intelligent systems, such as voice control systems.


1-Introduction
With the advent of wavelet analysis, it became popular to address non-stationary physical quantities in tasks such as speech analysis, voice signature detection, and speech recognition. Wavelets have proven successful in front-end processors for speech recognition, where their time-frequency resolution offers an alternative to the short-time Fourier transform. For speech recognition or speech enhancement, the wavelet shapes are based on the Hanning window [1][2][3]. The performance of recognition depends on the coverage of the frequency domain. The goal for good speech recognition is to increase the bandwidth of the wavelets without significantly affecting time resolution. Denoising can be done by collecting the hard-to-detect white noise in the wavelet coefficients and removing it by conventional methods. Speech components have large coefficient values compared to the noise, which is considered an extraneous signal. The coefficients are calculated using a multi-resolution wavelet filter bank [4][5][6][7][8][9][10][11][12][13]. The filter selection depends on the noise level and other parameters. To obtain a good noise reduction result, a good threshold level must be estimated [2][3][4][5][6][7][8][9][10][11][12]. The wavelet function and the decomposition level also play an important role in the quality of the denoised signal. Recently, various wavelet-based methods have been proposed for reducing noise in speech. The wavelet coefficient thresholding method is a noise reduction procedure that removes noise by shrinking the wavelet coefficients in the wavelet domain. The method relies on a threshold against which each wavelet coefficient of the signal is compared.

2-Speech Signals
Speech is a natural and basic way for humans to convey messages and thoughts. Speech frequencies normally range from about 300 Hz to 4 kHz, depending on the speaker, whereas the audible frequency range of human beings is 20 Hz to 20 kHz. The most common problem in speech processing is the interference of noise with the speech signal. Noise masks the speech signal and reduces its quality, and speech is greatly affected by the presence of background noise [1][2][3][4][5][6][7][8][9][10][11]. The purpose of a noise reduction or speech enhancement algorithm is to improve the performance of communication systems when their input or output signals are corrupted by noise [14]. Non-stationary signal analysis methods aim to model the inherent time-varying characteristics of signals recorded in several areas, namely communications, speech analysis and synthesis, radar, biomedical engineering, and mechanical engineering [15]. The conventional methods based on the Fourier transform are not well suited to the spectral analysis of these signals, and in particular of complex biological signals such as brain signals or speech signals [15][16][17].

3-Speech Acquisition With Cool Edit
The Cool Edit program is used for recording sounds and for the auditory and visual analysis of the sounds and their spectra. The signals are recorded in mono at a sampling frequency of 10 kHz, i.e. a sampling period of 0.1 ms, with a 16-bit converter.
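The acquisition settings above can be turned into a few derived quantities. The following is a minimal sketch, assuming that the "16" refers to a 16-bit converter; the variable names are illustrative and not taken from the original work.

```python
# Acquisition parameters stated in the text.
fs_hz = 10_000   # sampling frequency in Hz
bits = 16        # ASSUMPTION: converter resolution in bits

sampling_period_s = 1.0 / fs_hz   # time between two consecutive samples
nyquist_hz = fs_hz / 2.0          # highest frequency representable without aliasing
levels = 2 ** bits                # number of distinct quantization levels

print(f"sampling period: {sampling_period_s * 1e3:.1f} ms")
print(f"Nyquist frequency: {nyquist_hz:.0f} Hz")
print(f"quantization levels: {levels}")
```

With these settings, only spectral content below 5 kHz can be represented, which covers the speech band discussed in Section 2.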

4-Wavelet Transform
The wavelet transform (WT) uses a wavelet function at varied scales to decompose signals in the time-frequency (T-F) domain, and it can provide adequate temporal and spectral resolution over the entire frequency range. Since its introduction in the 1970s, the WT has been used in varied applications, such as signal detection, image processing, signal de-noising, speech enhancement, audio classification, etc. [2][3][4][5][6][7][8]. Zhu and Kim applied the Morlet wavelet transform to analyze impact noise. Wavelets have two important parameters: first, the scale factor, and second, the translation; the scale roughly corresponds to the scale of measurement. Both dilated and compressed wavelets are used. Dilated wavelets at high scales correspond to slowly changing, low-frequency signals [14][15][16][17], whereas compressed wavelets at low scales correspond to rapidly changing signals that consist of high frequencies. Unlike other transforms used in signal processing (the Fourier transform, etc.), wavelets allow analysis of signals in both the frequency and time domains. There are two types of wavelet transform: the continuous and the discrete wavelet transform. Both transforms are continuous in time (analog), and with their help analog signals can be represented [5][6][16].

4-1-General theory of CWT
In this work we state only some key equations and concepts of the wavelet transform; a more rigorous mathematical treatment of this subject can be found in [3][4][5][6]. The continuous wavelet transform (CWT) of $f(t)$ is defined as:

$$W_f(b, a) = |a|^{-1/2} \int_{-\infty}^{\infty} f(t)\, \Psi^{*}\!\left(\frac{t - b}{a}\right) dt$$

where $a$ and $b$ are the dilation and translation parameters, respectively, and $\Psi^{*}$ denotes the complex conjugate of $\Psi$. The factor $|a|^{-1/2}$ is for energy normalization, so that the transformed signal has the same energy at every scale. The analysis function $\Psi(t)$, the so-called mother wavelet, must have zero net area, $\int_{-\infty}^{\infty} \Psi(t)\, dt = 0$, which suggests that the transformation kernel of the wavelet transform is a compactly supported function [9].

A disadvantage of the CWT is that the signal representation is often redundant, because $a$ and $b$ are continuous over $\mathbb{R}$ (the real numbers). As the original signal can be completely reconstructed from a sampled version of $W_f(b, a)$, we usually sample $W_f(b, a)$ on a dyadic grid, i.e., $a = 2^{-m}$ and $b = n 2^{-m}$, with $m, n \in \mathbb{Z}^{+}$. Substituting these values into the definition above gives

$$W_f(m, n) = \int_{-\infty}^{\infty} f(t)\, \Psi^{*}_{m,n}(t)\, dt$$

where $\Psi_{m,n}(t) = 2^{m/2}\, \Psi(2^{m} t - n)$ is the dilated and translated version of the mother wavelet $\Psi(t)$ [9,14].

SNR is the ratio of the power of the useful signal to the power of the noise. It assumes an additive noise model in which the undistorted, unquantized signal $x[n]$ and an additional quantization error $e[n]$ are superposed to give the quantized signal $x_Q[n] = x[n] + e[n]$. SNR is usually specified on a logarithmic scale in decibels (dB) in order to cover a wide range of possible SNR values [10]:

$$\mathrm{SNR} = 10 \log_{10} \frac{P_x}{P_e} = 20 \log_{10} \frac{A_x}{A_e}$$

where $P_x$ and $P_e$ are the average powers of the corresponding signals, and $A_x$ and $A_e$ are their root-mean-square amplitudes. In the quantization setting, SNR is often called SQNR (signal-to-quantization-noise ratio).
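To make the SNR definition concrete, the following minimal sketch computes the SNR in decibels from the average powers of a clean signal and of the additive disturbance. The function names and the toy test signal are illustrative assumptions and do not come from the original experiments.

```python
import math

def average_power(samples):
    """Mean of the squared sample values."""
    return sum(s * s for s in samples) / len(samples)

def snr_db(clean, noisy):
    """SNR in dB, treating (noisy - clean) as the additive noise."""
    noise = [y - x for x, y in zip(clean, noisy)]
    return 10.0 * math.log10(average_power(clean) / average_power(noise))

# Toy example (hypothetical data): a 200 Hz sine sampled at 10 kHz,
# disturbed by a sinusoid whose amplitude is ten times smaller.
fs = 10_000
clean = [math.sin(2 * math.pi * 200 * n / fs) for n in range(1000)]
noisy = [x + 0.1 * math.cos(2 * math.pi * 1300 * n / fs)
         for n, x in enumerate(clean)]

print(round(snr_db(clean, noisy), 1))
```

Because the disturbance amplitude is one tenth of the signal amplitude, the power ratio is 100 and the result is about 20 dB, in agreement with the amplitude form of the definition.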

4-2-Temporal and Spectral Resolutions in the CWT
Resolutions in the time and frequency domains are critical for evaluating the performance of different wavelets. The temporal resolution in the time domain and the spectral resolution in the frequency domain of the CWT can be defined as in [5][6].
The DWT is a mathematical tool for decomposing data in a top-down fashion. It represents a function in terms of a rough overall form plus a wide range of details. Regardless of the type of function (signals, images, etc.), the DWT offers an effective technique for representing the amount of detail present.
Wavelets offer scale-based analysis of the given data. They have found a wide range of applications, including signal processing, mathematics, and numerical analysis. Because of its better performance in signal and image processing, the DWT is considered an alternative to the fast Fourier transform, as it provides a time-frequency representation. When non-stationary signals need to be processed and analyzed, the DWT can be used. Studies show that the discrete wavelet transform performs well in speech signal processing [18][19].
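The idea of splitting a signal into a rough overall form and its details can be illustrated with one level of the DWT. The sketch below uses the Haar wavelet purely because it is the simplest case; this is an assumption for illustration, since the paper itself works with Daubechies and coiflet wavelets, and the function names are hypothetical.

```python
import math

def haar_dwt(signal):
    """One level of the Haar DWT: returns (approximation, detail) coefficients."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail

def haar_idwt(approx, detail):
    """Inverse of haar_dwt: rebuilds the signal from one level of coefficients."""
    s = 1.0 / math.sqrt(2.0)
    signal = []
    for a, d in zip(approx, detail):
        signal.append((a + d) * s)   # first sample of the pair
        signal.append((a - d) * s)   # second sample of the pair
    return signal

x = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
a1, d1 = haar_dwt(x)          # a1: rough overall form, d1: fine details
rebuilt = haar_idwt(a1, d1)   # perfect reconstruction up to rounding
print(a1, d1, rebuilt)
```

Each level halves the number of coefficients, and applying haar_dwt again to a1 gives the next, coarser level of the decomposition.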

5-Speech enhancement methods
Various speech enhancement methods have been proposed to reduce noise and to improve speech quality and clarity. A single algorithm is not enough for all the types of noise present in the surroundings, so speech enhancement algorithms are designed according to the application. The block diagram of speech enhancement is shown in figure (2). In this method, we rely on processing the audio signal stored in the database, where noise can be removed and the main frequencies of each letter can be identified. The method can be used in real time once the algorithm has been implemented in an embedded system that handles the voice signal and identifies certain commands that control an automatic system.

5-1-Detection of Singularity of the impulse noise signal in CWT
For noise evaluation, the oscillation of the acoustic signals is regarded as an important metric. The CWT is often applied to detect the singularities of a transient signal.
At every stage of the numerical simulations, the wavelet coefficients (for the continuous as well as the discrete wavelet transform) were calculated with standard procedures that are integral parts of the MATLAB software. For these calculations we employed the MATLAB cwt function. We usually focus on three main frequencies for letter and word identification, after removing noise and applying the wavelet transform. In practice, the third frequency can be neglected because it may be similar across several letters, so the first and second frequencies suffice, especially if the noise is removed to a satisfactory degree. The figure opposite shows the letter E after noise cancellation.
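As an illustration of singularity detection with the CWT, the following minimal sketch evaluates the CWT integral numerically at a single small scale and locates the largest coefficient. The Mexican hat mother wavelet, the scale value, and the synthetic test signal are assumptions chosen for simplicity; they are not the settings used in this work.

```python
import math

def mexican_hat(t):
    """Mexican hat mother wavelet (unnormalised); it has zero net area."""
    return (1.0 - t * t) * math.exp(-0.5 * t * t)

def cwt_row(signal, scale, dt=1.0):
    """CWT coefficients W(b, a) at one scale a, for every sample position b."""
    n = len(signal)
    half = int(math.ceil(5 * scale / dt))   # wavelet is negligible beyond 5 scales
    kernel = [mexican_hat(k * dt / scale) for k in range(-half, half + 1)]
    norm = dt / math.sqrt(scale)            # |a|^(-1/2) times the integration step
    row = []
    for b in range(n):
        acc = 0.0
        for k in range(-half, half + 1):
            i = b + k
            if 0 <= i < n:                  # samples outside the signal count as zero
                acc += signal[i] * kernel[k + half]
        row.append(acc * norm)
    return row

# Hypothetical test signal: a slow sine with an impulse added at sample 300.
signal = [math.sin(2 * math.pi * n / 128) for n in range(512)]
signal[300] += 2.0

row = cwt_row(signal, scale=2.0)
peak = max(range(len(row)), key=lambda b: abs(row[b]))
print(peak)
```

At a small scale the smooth sine produces only small coefficients, so the largest magnitude marks the position of the impulse.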

5-2-Enhancement of the vowel "A" by Wavelets
The following figure shows the resulting multi-resolution decomposition of the vowel "A" by wavelets.
Fig.10 Vowel "A", levels of decomposition by wavelets. A: original speech signal, a1: approximation 1, d1: detail level 1, a2: approximation 2, d2: detail level 2; A = a1 + d1 and a1 = a2 + d2, so that A = a2 + d2 + d1.
The method is based on thresholding: each wavelet coefficient of the signal is compared to a given threshold. Using wavelets to remove noise from a signal requires identifying which components contain the noise, and then reconstructing the signal without those components.
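The thresholding procedure described above can be sketched as a decompose, shrink, and reconstruct loop. The code below is a minimal illustration under stated assumptions: it uses the Haar wavelet, soft thresholding, and a universal threshold with a median-based noise estimate, none of which is specified in the text, and it runs on a synthetic signal rather than on the speech database.

```python
import math
import random
import statistics

SQRT2 = math.sqrt(2.0)

def haar_dwt(x):
    """One Haar level: (approximation, detail)."""
    approx = [(x[i] + x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    return approx, detail

def haar_idwt(approx, detail):
    """Inverse of one Haar level."""
    out = []
    for a, d in zip(approx, detail):
        out.extend(((a + d) / SQRT2, (a - d) / SQRT2))
    return out

def soft_threshold(coeffs, t):
    """Shrink every coefficient towards zero by t; small ones become zero."""
    return [math.copysign(max(abs(c) - t, 0.0), c) for c in coeffs]

def denoise(noisy, levels=2):
    """Decompose, threshold the detail coefficients, and reconstruct."""
    if len(noisy) % (2 ** levels):
        raise ValueError("length must be divisible by 2**levels")
    approx, details = list(noisy), []
    for _ in range(levels):
        approx, d = haar_dwt(approx)
        details.append(d)                      # details[0] is the finest level
    # ASSUMPTION: noise level estimated from the finest details (median rule),
    # combined with the universal threshold sigma * sqrt(2 ln N).
    sigma = statistics.median([abs(c) for c in details[0]]) / 0.6745
    t = sigma * math.sqrt(2.0 * math.log(len(noisy)))
    for d in reversed(details):                # rebuild from coarse to fine
        approx = haar_idwt(approx, soft_threshold(d, t))
    return approx

def mse(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v)) / len(u)

# Synthetic example: a 50 Hz sine at 10 kHz plus white Gaussian noise.
rng = random.Random(0)
clean = [math.sin(2 * math.pi * 50 * n / 10_000) for n in range(1024)]
noisy = [x + rng.gauss(0.0, 0.2) for x in clean]
denoised = denoise(noisy, levels=2)
print(mse(noisy, clean), mse(denoised, clean))
```

Only the detail coefficients are thresholded; the approximation is left untouched, so the noise that falls into the lowest band remains. More decomposition levels or a smoother wavelet would change that trade-off.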
Unlike the STFT, which has constant resolution at all times and at all frequencies, the WT has good temporal resolution and low frequency resolution at high frequencies, and good frequency resolution and low temporal resolution at low frequencies.
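This trade-off can be made explicit for a dyadic decomposition. The sketch below lists the nominal frequency band and the spacing between coefficients for each sub-band, assuming ideal half-band filters and the 10 kHz sampling frequency mentioned in Section 3; the function name is hypothetical, and real wavelet filters overlap between bands.

```python
def dwt_bands(fs_hz, levels):
    """Nominal frequency band and coefficient spacing of each DWT sub-band."""
    bands = []
    for j in range(1, levels + 1):
        low = fs_hz / 2 ** (j + 1)      # lower edge of detail band d_j
        high = fs_hz / 2 ** j           # upper edge of detail band d_j
        step_s = 2 ** j / fs_hz         # time between coefficients at level j
        bands.append((f"d{j}", low, high, step_s))
    # The final approximation covers everything below the last detail band.
    bands.append((f"a{levels}", 0.0, fs_hz / 2 ** (levels + 1), 2 ** levels / fs_hz))
    return bands

bands = dwt_bands(10_000, 2)
for name, low, high, step_s in bands:
    print(f"{name}: {low:.0f}-{high:.0f} Hz, one coefficient every {step_s * 1e3:.1f} ms")
```

The high-frequency band d1 is wide (2500 Hz) but finely sampled in time, while the lower bands are narrower and more coarsely sampled, which is the behaviour described above.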
In Figure 7, red represents the maximum spectral intensity, while blue represents the minimum intensity.
Fig.11 The CWT, Daubechies db12
Does the system recognize single words or continuous speech? Obviously, it is easier to recognize isolated words well separated by periods of silence than to recognize the sequence of words constituting a sentence. Indeed, in the latter case, not only is the boundary between words no longer known but, moreover, the words become strongly co-articulated (i.e. the pronunciation of each word is affected by the word preceding it as well as by the one that follows, a simple and well-known example being the liaisons of French).
The following figure shows a slice of the vowel /A/ region of the word /slap/ obtained with the Daubechies CWT. Note that the formants are presented on the same scale, as follows. Second scale: 51, 56.
The following figure shows a slice of the vowel /U/ of the word /Une/ obtained with the Daubechies CWT.
Figure (15) presents the classification of the vowel formants according to the corresponding scales for 10 speakers.
Fig.15 The classification of the vowels
After recording the speech signal of several people and applying the proposed method, the vowels were divided into three groups; within each group, the first and second scales converge on similar values.

6-Conclusion
The analog acquisition of the speech signal is the first thing to consider for the spectral analysis of vowels or isolated words. Without it, it would be impossible to decompose the signal correctly and accurately in order to study it. The time-domain representation obtained in this way is not always the best for most signal processing applications. In many cases, the most relevant information is hidden in the frequency content of the signal. The frequency spectrum of a signal is made up of its frequency components and indicates which frequencies exist in the signal. The wavelet decomposition is similar to the Gabor decomposition: a speech signal is written as a superposition of shifted and dilated wavelets.
Once the noise in the audio signal is removed, the main frequencies can be recognized in a short time and with great precision, especially in word recognition programs. This is what we have discussed in this article, where the wavelet transform is used to reduce noise and to gain an understanding of the sound. Denoising of speech signals has been achieved successfully using wavelets. This paper provides a practical approach to how noisy audio (in wavelet form) corrupted with white Gaussian noise can be denoised by using the coiflet wavelet.