Improving speech synthesis quality by reducing pitch peaks in the source recordings
Luisina Violante, Pablo Rodriguez Zivic and Agustin Gravano
We present a method for improving the perceived naturalness of corpus-based
speech synthesizers. It consists of removing pronounced pitch peaks from the
original recordings, since such peaks typically lead to noticeable
discontinuities in the synthesized speech. We perceptually evaluated this
method using two concatenative and two HMM-based synthesis systems, and found
that applying it to the source recordings improved the naturalness of the
synthesizers and had no effect on their intelligibility.
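
The abstract does not say how pitch peaks are detected or removed, so the following is only an illustrative sketch under stated assumptions, not the authors' procedure. It assumes the pitch of a recording is available as a per-frame F0 contour in Hz with 0 marking unvoiced frames, treats any voiced value above the mean plus k standard deviations as a "pronounced peak", and clips it to that ceiling. The function name `flatten_pitch_peaks` and the parameter `k` are hypothetical.

```python
from statistics import mean, pstdev


def flatten_pitch_peaks(f0_hz, k=2.0):
    """Clip voiced F0 values that exceed mean + k * std of the voiced frames.

    Assumptions (not from the paper): f0_hz is a per-frame contour in Hz,
    0 marks an unvoiced frame, and clipping stands in for "removing" a peak.
    """
    voiced = [f for f in f0_hz if f > 0]
    if not voiced:
        # Nothing voiced, so there is no pitch to modify.
        return list(f0_hz)
    ceiling = mean(voiced) + k * pstdev(voiced)
    # Unvoiced frames (0) are left untouched; voiced frames are capped.
    return [min(f, ceiling) if f > 0 else f for f in f0_hz]


# Hypothetical contour with one pronounced peak (260 Hz) among ~125 Hz frames.
contour = [0.0, 120.0, 125.0, 130.0, 260.0, 128.0, 122.0, 0.0]
flattened = flatten_pitch_peaks(contour, k=1.0)
```

In a real pipeline the modified contour would still have to be imposed on the audio by a pitch-modification resynthesis step before the recordings are used to build a synthesizer; that step is outside the scope of this sketch.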