Speech re-synthesis from spectrogram image through sinusoidal modelling

Date

2014

Authors

Singhal, Rahul

Publisher

IEEE

Abstract

This paper presents a novel method for extracting the parameters needed for intelligible speech synthesis, namely frequencies and their bandwidths. The parameters are extracted from spectrogram images of pre-recorded male and female voice samples and are used to re-synthesize speech from sinusoidal signals. Phase continuity is preserved by quantifying the time scale and identifying the phase at temporal boundaries for a given frequency. The amplitudes of the sinusoids follow a Gaussian distribution, and frequency overlap is used to extend the bandwidth from 4 kHz to 6 kHz, which improves the clarity of the synthesized speech. The synthesized speech is then passed through a weighting filter to improve the envelope of the re-synthesized time-domain signal. The resulting speech sounds synthetic but is noticeably intelligible.
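
As a rough illustration of the synthesis step described in the abstract, the sketch below sums sinusoids frame by frame, carries each frequency's phase across frame boundaries, and weights amplitudes with a Gaussian around a center frequency. It is not the author's implementation. The function names, the 16 kHz sample rate, the 160-sample frame length, and the choice of treating the bandwidth as two standard deviations are all assumptions made for this example. The extraction of parameters from the spectrogram image and the final weighting filter are not shown.

```python
import numpy as np


def gaussian_weights(freqs, center_hz, bandwidth_hz):
    """Gaussian amplitude weights for sinusoids spread around center_hz.

    Assumption: the bandwidth is interpreted as two standard deviations.
    """
    sigma = bandwidth_hz / 2.0
    freqs = np.asarray(freqs, dtype=float)
    return np.exp(-0.5 * ((freqs - center_hz) / sigma) ** 2)


def synthesize(frames, sample_rate=16000, frame_len=160):
    """Sum sinusoids frame by frame with phase carried across boundaries.

    frames: one list per time frame, each holding (frequency_hz, amplitude)
    pairs. sample_rate and frame_len are illustrative defaults.
    """
    phases = {}  # phase (radians) at the next frame boundary, per frequency
    n = np.arange(frame_len)
    out = np.zeros(len(frames) * frame_len)
    for i, frame in enumerate(frames):
        segment = out[i * frame_len:(i + 1) * frame_len]  # view into out
        for freq, amp in frame:
            step = 2.0 * np.pi * freq / sample_rate
            start = phases.get(freq, 0.0)
            segment += amp * np.sin(start + step * n)
            # Store the phase reached at the end of this frame so the next
            # frame starts where this one stopped.
            phases[freq] = (start + step * frame_len) % (2.0 * np.pi)
    return out


# Usage: five sinusoids around a hypothetical 500 Hz component, 100 Hz wide.
freqs = np.linspace(400.0, 600.0, 5)
weights = gaussian_weights(freqs, center_hz=500.0, bandwidth_hz=100.0)
signal = synthesize([list(zip(freqs, weights))] * 10)
```

Because the phase is stored per frequency, a component that keeps the same frequency from one frame to the next continues without a jump at the boundary. A component that disappears and later reappears simply resumes from its last stored phase in this sketch.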

Keywords

EEE, Parameter extraction, Intelligible speech synthesis, Sinusoidal synthesis, Synthetic speech, Gaussian filter
