Speech re-synthesis from spectrogram image through sinusoidal modelling
Date
2014
Authors
Singhal, Rahul
Publisher
IEEE
Abstract
A novel method to extract parameters, i.e. frequencies and their bandwidths, for intelligible speech synthesis is presented in the paper. The parameters are extracted from spectrogram images of pre-recorded male and female voice samples and used to re-synthesize speech from sinusoidal signals. Phase continuity is preserved by quantifying the time scale and identifying the phase at temporal boundaries for a given frequency. The amplitudes of the sinusoids follow a Gaussian distribution, and frequency overlap is used to extend the bandwidth from 4 kHz to 6 kHz to improve the clarity of the synthesized speech. The synthesized speech is then passed through a weighting filter to improve the envelope of the re-synthesized time-domain signal. The resulting speech sounds synthetic but is noticeably intelligible.
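As a rough illustration of the ideas named in the abstract (sinusoids with Gaussian-distributed amplitudes and phase carried across temporal boundaries), the following is a minimal sketch and not the paper's implementation. The function names `gaussian_partials` and `synthesize`, the number of sinusoids per band, the choice of standard deviation as a quarter of the bandwidth, the frame duration, and the sampling rate are all assumptions made for this example. The spectrogram parameter extraction and the weighting filter are not shown.

```python
import numpy as np


def gaussian_partials(center_hz, bandwidth_hz, n_partials=5):
    """Spread n_partials sinusoids across a band with Gaussian amplitude weights.

    Assumption: sigma is a quarter of the bandwidth; the paper may differ.
    """
    offsets = np.linspace(-bandwidth_hz / 2, bandwidth_hz / 2, n_partials)
    sigma = bandwidth_hz / 4 if bandwidth_hz > 0 else 1.0
    weights = np.exp(-0.5 * (offsets / sigma) ** 2)
    return center_hz + offsets, weights / weights.sum()


def synthesize(frames, frame_dur=0.01, fs=16000, n_partials=5):
    """Re-synthesize one frequency track.

    frames: list of (center_hz, bandwidth_hz), one entry per time slice
    (hypothetical input format for this sketch).
    """
    n = int(round(frame_dur * fs))          # samples per time slice
    t = np.arange(n) / fs                   # time axis within one slice
    phase = np.zeros(n_partials)            # phase of each sinusoid at the slice boundary
    out = []
    for center_hz, bandwidth_hz in frames:
        freqs, weights = gaussian_partials(center_hz, bandwidth_hz, n_partials)
        # Each sinusoid starts at the phase where it ended in the previous slice,
        # so the waveform has no jump at the boundary.
        arg = 2 * np.pi * freqs[:, None] * t[None, :] + phase[:, None]
        out.append((weights[:, None] * np.sin(arg)).sum(axis=0))
        # Advance the stored phase by one full slice.
        phase = (phase + 2 * np.pi * freqs * n / fs) % (2 * np.pi)
    return np.concatenate(out)


if __name__ == "__main__":
    # Two slices with different centre frequencies and a 200 Hz band each.
    signal = synthesize([(440.0, 200.0), (880.0, 200.0)])
    print(signal.shape)
```

Carrying the accumulated phase forward, rather than restarting every slice at zero phase, is what avoids audible clicks at slice boundaries in this sketch.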
Keywords
EEE, Parameter extraction, Intelligible speech synthesis, Sinusoidal synthesis, Synthetic speech, Gaussian filter