Re: speaking emacs
>>>>> "LP" == Leonardo Pistone <firstname.lastname@example.org> writes:
LP> In the next months i'll be working on an open italian voice for
>>>>> "MF" == Mario Fux <email@example.com> writes:
MF> Sounds very interesting: Which knowledge do one need for another
MF> free festival voice? I'm interested in a German voice but don't
MF> know yet about the scope of the work that needs.
I've been experimenting with creating a free Czech Festival voice for
several months, and we are now trying to build a serious solution within
the Freebsoft project. I've gathered some experience with the process, and
if anybody is willing to share or exchange know-how about defining new
Festival languages, I'm open to cooperation.
If you want to define a new language in Festival, you basically have to
solve the following problems:
- Pronunciation of written text. Unless you have a free pronunciation
dictionary for your language, this is obviously quite hard in languages
like English, but I guess it should be much easier in languages like
German or Italian. In Czech, acceptable results can be achieved
- Context dependencies in the transformation of written text to a
phonetic form, such as handling numbers written in digits, abbreviations,
etc. This is hard for (almost?) any language, but it needn't be
too complex if you just want to present texts in your application to a
- Defining prosody, i.e. intonation, accents, pauses and duration. This
requires either appropriate know-how for the particular language, or
recorded and labeled data for your language that Festival can be
trained on. Usually, getting the know-how should be easier.
The problem of prosody can be ignored when building support for a new
language, but good prosodic rules can make the resulting speech output
- A recorded diphone database, i.e. recorded words that contain the basic
elements of speech for your language. To get this, you need a speaker
who agrees to have their voice used for free speech synthesis, and
decent recording equipment. After the recording is made, you must
label the recorded database. This is a tedious manual process with a
huge impact on the resulting quality, but it can be done somewhat
incrementally. Alternatively, you can use a non-free synthesis backend
such as Mbrola, which might be useful for text analysis and prosody
development until you get a free diphone database; but I don't
recommend this as the final solution if you don't want to let your
users depend on non-free software.
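
To make the text-analysis item above (numbers written in digits,
abbreviations) a bit more concrete, here is a minimal sketch in Python.
It is not Festival code and not how festival-czech does it; Festival
itself is scripted in Scheme. The abbreviation table, the function names
and the use of English words are all assumptions made for illustration
only; a real language module needs language-specific rules and must
handle context (case, gender, ordinals, dates and so on).

```python
import re

# Hypothetical abbreviation table; a real one is language-specific
# and much larger.
ABBREVIATIONS = {
    "e.g.": "for example",
    "etc.": "et cetera",
    "Dr.": "doctor",
}

ONES = [
    "zero", "one", "two", "three", "four", "five", "six", "seven",
    "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
    "fifteen", "sixteen", "seventeen", "eighteen", "nineteen",
]
TENS = [
    "", "", "twenty", "thirty", "forty", "fifty",
    "sixty", "seventy", "eighty", "ninety",
]


def number_to_words(n: int) -> str:
    """Spell out an integer in the range 0..999 (English, for illustration)."""
    if not 0 <= n <= 999:
        raise ValueError("this sketch only handles 0..999")
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] if ones == 0 else TENS[tens] + " " + ONES[ones]
    hundreds, rest = divmod(n, 100)
    words = ONES[hundreds] + " hundred"
    return words if rest == 0 else words + " " + number_to_words(rest)


def normalize(text: str) -> str:
    """Expand known abbreviations and short digit strings, token by token."""
    out = []
    for token in text.split():
        if token in ABBREVIATIONS:
            out.append(ABBREVIATIONS[token])
        elif re.fullmatch(r"\d{1,3}", token):
            out.append(number_to_words(int(token)))
        else:
            # Anything else is passed through unchanged.
            out.append(token)
    return " ".join(out)


if __name__ == "__main__":
    print(normalize("Dr. Smith has 42 cats etc."))
```

Even this toy version shows why the problem grows quickly: every token
class (dates, times, ordinals, currency) needs its own rule, and the
right expansion often depends on the neighbouring words.
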
Adding support for a new language is a long-term task, but it's feasible.
As for Czech, we started with zero knowledge, but the process could be
learned and the outlook is promising now. If you can consult linguists
about your language problems, that can be a big advantage.
Generally, the extent of the necessary work depends on the desired
speech output quality.
As for German, there are two things you may want to look at initially.
First, the Epos GPLed speech synthesis system (available in Debian)
contains some support for German. I don't know its current state; you
should preferably contact the upstream authors if you are interested in
it. Unfortunately, a free German diphone database for Epos is not
available, and AFAIK it's unlikely to become available in the foreseeable
future.
Second, there is a poor attempt of mine at
http://cvs.freebsoft.org/repository/festival-german/ to create a basis
for future Festival German synthesis (again, without a diphone
database), forked from festival-czech back when our approach to the
problem was still naive. We are much further along with Czech support
now, but if you
are completely lost and/or want to get anything to start with, you can
Generally, the Freebsoft project would welcome it if Festival supported
more languages. We can't help directly with creating new languages
(unless we are provided with the necessary resources), but I can help to
some extent with my experience from the work on Czech support in