Speech re-synthesis from spectrogram image through sinusoidal modelling

dc.contributor.author	Singhal, Rahul
dc.date.accessioned	2023-03-07T09:30:26Z
dc.date.available	2023-03-07T09:30:26Z
dc.date.issued	2014
dc.identifier.uri	https://ieeexplore.ieee.org/document/6968501
dc.identifier.uri	http://dspace.bits-pilani.ac.in:8080/xmlui/handle/123456789/9572
dc.description.abstract	This paper presents a novel method for extracting the parameters needed for intelligible speech synthesis, namely frequencies and their bandwidths. The parameters are extracted from spectrogram images of pre-recorded male and female voice samples and used to re-synthesize speech with sinusoidal signals. Phase continuity is preserved by quantifying the time scale and identifying the phase at temporal boundaries for a given frequency. The amplitudes of the sinusoids follow a Gaussian distribution, and frequency overlap is used to extend the bandwidth from 4 kHz to 6 kHz, which improves the clarity of the synthesized speech. The synthesized speech is then passed through a weighting filter to improve the envelope of the re-synthesized time-domain signal. The result sounds synthetic but is noticeably intelligible.	en_US
dc.language.iso	en	en_US
dc.publisher	IEEE	en_US
dc.subject	EEE	en_US
dc.subject	Parameter extraction	en_US
dc.subject	Intelligible speech synthesis	en_US
dc.subject	Sinusoidal synthesis	en_US
dc.subject	Synthetic speech	en_US
dc.subject	Gaussian filter	en_US
dc.title	Speech re-synthesis from spectrogram image through sinusoidal modelling	en_US
dc.type	Article	en_US
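
The abstract describes two ideas that a short example may help make concrete: sinusoids whose amplitudes follow a Gaussian shape across a band, and phase that stays continuous across temporal boundaries. The sketch below is a minimal illustration under stated assumptions, not the paper's implementation. The function names, the number of partials per band, the choice of the Gaussian width as half the bandwidth, and the fixed frame length are all hypothetical; the spectrogram-image parameter extraction, the 4 kHz to 6 kHz frequency overlap, and the weighting filter are not modelled.

```python
import math


def gaussian_partials(center_hz, bandwidth_hz, n_partials=5):
    """Spread n_partials sinusoids evenly across a band around center_hz.

    Returns (frequency, weight) pairs. Weights follow a Gaussian centred on
    center_hz. Assumption: the band edges sit one standard deviation out.
    """
    if n_partials < 2 or bandwidth_hz <= 0:
        return [(center_hz, 1.0)]
    sigma = bandwidth_hz / 2.0
    spacing = bandwidth_hz / (n_partials - 1)
    partials = []
    for k in range(n_partials):
        offset = -bandwidth_hz / 2.0 + k * spacing
        weight = math.exp(-0.5 * (offset / sigma) ** 2)
        partials.append((center_hz + offset, weight))
    return partials


def synthesize_track(freqs_hz, amps, frame_len, sample_rate):
    """Synthesize one sinusoid whose frequency and amplitude change per frame.

    The running phase is carried over each frame boundary, so a frequency
    change never introduces a phase jump in the waveform.
    """
    phase = 0.0
    samples = []
    for freq, amp in zip(freqs_hz, amps):
        step = 2.0 * math.pi * freq / sample_rate  # phase advance per sample
        for _ in range(frame_len):
            samples.append(amp * math.sin(phase))
            phase += step
        phase = math.fmod(phase, 2.0 * math.pi)  # keep the accumulator small
    return samples


def synthesize_band(centers_hz, bandwidth_hz, frame_len, sample_rate):
    """Sum the Gaussian-weighted partials of one band over all frames."""
    if not centers_hz:
        return []
    mix = [0.0] * (len(centers_hz) * frame_len)
    # Lay out the partials per frame so each one follows the moving centre.
    per_frame = [gaussian_partials(c, bandwidth_hz) for c in centers_hz]
    for p in range(len(per_frame[0])):
        freqs = [frame[p][0] for frame in per_frame]
        amps = [frame[p][1] for frame in per_frame]
        track = synthesize_track(freqs, amps, frame_len, sample_rate)
        mix = [m + t for m, t in zip(mix, track)]
    return mix


if __name__ == "__main__":
    # Three 20 ms frames at 8 kHz with a centre frequency rising from 200 Hz.
    signal = synthesize_band([200.0, 220.0, 240.0], 40.0, 160, 8000)
    print(len(signal))
```

Carrying a single phase accumulator across frames is one common way to avoid discontinuities when the frequency changes at a boundary; restarting each frame at zero phase would instead produce audible clicks.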
