Yasr Questions, Optimizing Output for Speech
Hi,
Just got Yasr working with Festival Lite and it seems to work just fine. A
couple of questions related to Yasr and shells, though:
Is there a way of making Festival Lite go faster? I already set the speed to
the maximum in the Yasr options but would prefer something even higher.
I've noticed that the speech stutters a bit and seems to accelerate or
decelerate strangely at times. I haven't had any problems when cursoring
through man pages a line at a time in review mode. However, if it starts
reading a screenful automatically, such as a man page, the rate seems to vary
somewhat and there are also slight clicking sounds. I'm using an AWE32 card with the
OSS drivers, so this might also affect matters. My machine is a 300 MHz
Celeron with 128 MB RAM and a pre-compiled 2.4 kernel.
One explanation might be that I haven't quite gotten used to the intonation
and rhythm of Festival Lite. It sounds rather unpleasant to me considering
it's a sample-based synth. Yet the intonation is something out of an early
90s formant synth, well, almost.
Another query: I'd like to do some quite heavy shell customization and
output processing to make matters easier with speech. As reading with speech
is basically a sequential process, one important goal for me is getting the
most important information first.
In a long ls listing, I'd like to have the file name first, as otherwise I
won't know to which of the listed files all of the other info applies. As
security is important and manageable in Linux, I'd like to have the permission
flags next in some speech-friendly form, preferably as an octal triplet. Then
there should be some audible separator character or a slight pause, and then
the file size in MB as well as the date and other info if it fits.
For things like this, which way of customizing would you recommend? Of
course it's worth looking at what ls offers, but as far as I know there's no
facility for re-ordering the columns. Secondly, I can write and hopefully
read, hehe, some Perl, so modding the ls output with that might also be an
option. How about other tools less familiar to me, such as shell scripting or
awk for that matter?
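
To make the goal concrete, here is a minimal Python sketch of the listing
order described above. It is only an illustration under assumptions: Python
stands in for Perl or awk, and the names speech_line and speech_ls, the spaced
octal digits and the comma separators are hypothetical choices of mine, not
anything ls or Yasr provides.

```python
import os
import stat
import sys
import time


def speech_line(name, st):
    """Format one entry as: name, octal permissions, size in MB, date."""
    # Keep only the rwx bits for user, group and other (the octal triplet).
    octal = format(stat.S_IMODE(st.st_mode) & 0o777, "03o")
    # Assumption: spacing the digits makes a synth read them one by one.
    perms = " ".join(octal)
    size_mb = st.st_size / (1024 * 1024)
    date = time.strftime("%Y-%m-%d", time.localtime(st.st_mtime))
    # Assumption: commas act as the audible separator or slight pause.
    return f"{name}, {perms}, {size_mb:.2f} MB, {date}"


def speech_ls(directory="."):
    """Return one speech-friendly line per directory entry, sorted by name."""
    lines = []
    for name in sorted(os.listdir(directory)):
        # lstat describes a symlink itself instead of following it.
        st = os.lstat(os.path.join(directory, name))
        lines.append(speech_line(name, st))
    return lines


if __name__ == "__main__":
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    for line in speech_ls(target):
        print(line)
```

Saved under some name of your choosing and run with a directory argument, it
prints one line per entry with the file name first, so the rest of the line
can be skipped as soon as the name is heard.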
Another important thing for me would be to minimize redundancy in spoken
output. I don't know about Linux, but on other platforms I've often seen
compilers that prepend the full file path to the actual error message
and don't let you customize the format. In a case like this, the only option
is to listen to the whole path every time or else use some read-from-cursor-
to-end-of-line command.
For cases like this and other similar issues, I think an optional redundancy
filter might be nice. I suppose this might even be an original idea, as I
haven't seen it implemented in any screen reader yet. Basically it should
compare the previous and current output line(s) being sent to the screen
reader, and kill all alphabetic substrings that are identical and have the
same starting indices in both lines. Again, this should be very customizable,
maybe as a regex, and the above suggestion is only the default.
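
A rough Python sketch of the filter, just to pin down what I mean. This is an
assumption-laden illustration, not Yasr code: the function name
filter_redundant and its pattern parameter are hypothetical, and it only
compares a single previous line with the current one.

```python
import re


def filter_redundant(prev, cur, pattern=r"[A-Za-z]+"):
    """Drop substrings of cur that also occur in prev at the same index.

    pattern selects the candidate substrings; the default picks runs of
    ASCII letters, matching the "alphabetic substrings" default above.
    """
    # Every (start index, text) pair matched in the previous line.
    seen = {(m.start(), m.group()) for m in re.finditer(pattern, prev)}

    kept = []
    last = 0
    for m in re.finditer(pattern, cur):
        if (m.start(), m.group()) in seen:
            # Keep the text before the repeated match, skip the match itself.
            kept.append(cur[last:m.start()])
            last = m.end()
    kept.append(cur[last:])
    return "".join(kept)


if __name__ == "__main__":
    # Hypothetical compiler messages sharing a long path prefix.
    previous = "/home/user/src/main.c:10: error: missing semicolon"
    current = "/home/user/src/main.c:42: error: unknown type"
    print(filter_redundant(previous, current))
    # A custom pattern that treats a whole path as one unit.
    print(filter_redundant(previous, current, r"[\w/.]+"))
```

With the default pattern the leftover punctuation of the path is still sent
to the synth, which is one reason the pattern would need to be customizable.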
How difficult would it be to create a quick-n-dirty Festival build to try
this out locally on my own machine? I do know C but I'm not familiar with
Linux-specific functions or system calls. I know what CVS is but that's
about it. I already downloaded the sources in compressed form and it would
appear that they are surprisingly clean and small as there are only a
handful of core source files in there.
--
With kind regards Veli-Pekka Tätilä (vtatila@mail.student.oulu.fi)
Accessibility, game music, synthesizers and more:
http://www.student.oulu.fi/~vtatila