
Yasr Questions, Optimizing Output for Speech

Just got Yasr working with Festival Lite and it seems to work just fine. A couple of questions related to Yasr and shells, though:

Is there a way of making Festival Lite go faster? I already set the speed to the maximum in the Yasr options but would prefer something even higher.

I've noticed that the speech stutters a bit and seems to accelerate or decelerate strangely at times. I haven't had any problems when cursoring through man pages a line at a time in review mode. However, when it automatically starts reading a screenful, such as a man page, the rate seems to vary somewhat and there are also slight clicking sounds. I'm using an AWE32 card with the OSS drivers, so this might also affect matters. My machine is a 300 MHz Celeron with 128 MB RAM and a pre-compiled 2.4 kernel.

One explanation might be that I haven't quite gotten used to the intonation and rhythm of Festival Lite. It sounds pretty unpleasant to me considering it's a sample-based synth, yet the intonation is like something out of an early 90s formant synth, well, almost.

Another query: I'd like to do some quite heavy shell customization and output processing to make matters easier with speech. As reading with speech is basically a sequential process, one important goal for me is getting the most important information first.

In a long ls listing, I'd like to have the file name first, as otherwise I won't know which of the listed files all the other info applies to. As security is important and manageable in Linux, I'd like to have the permission flags next in some speech-friendly form, preferably as an octal triplet. Then there should be some audible separator character or a slight pause, and then the file size in MB as well as the date and other info if it fits.

For things like this, which way of customizing would you recommend? Of course it's worth looking at what ls offers, but I do know there's no facility for re-ordering the columns. Secondly, I can write and hopefully read, hehe, some Perl, so modifying the ls output with that might also be an option. How about other tools less familiar to me, such as shell scripting or awk for that matter?
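
As a rough illustration of the listing format described above, a minimal Python sketch might look like the following. It is only one possible approach, not something ls or Yasr provides: it reads the file metadata directly instead of re-parsing ls output, the function names speech_line and speech_listing are made up for this example, and it assumes the synth pauses slightly at commas and reads space-separated digits one at a time.

```python
import os
import stat
import sys
import time


def speech_line(name, mode, size_bytes, mtime):
    """Build one line ordered for speech: name, octal permissions, size in MB, date."""
    # Keep only the user/group/other permission bits and show them as three octal digits.
    perms = format(stat.S_IMODE(mode) & 0o777, "03o")
    # Separate the digits so they are spoken singly (assumption about the synth).
    spoken_perms = " ".join(perms)
    size_mb = size_bytes / (1024 * 1024)
    date = time.strftime("%Y-%m-%d", time.localtime(mtime))
    # Commas stand in for the audible separator or slight pause.
    return f"{name}, {spoken_perms}, {size_mb:.2f} MB, {date}"


def speech_listing(directory="."):
    """Return speech-ordered lines for every entry in a directory, sorted by name."""
    lines = []
    for entry in sorted(os.scandir(directory), key=lambda e: e.name):
        info = entry.stat(follow_symlinks=False)
        lines.append(speech_line(entry.name, info.st_mode, info.st_size, info.st_mtime))
    return lines


if __name__ == "__main__":
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    for line in speech_listing(target):
        print(line)
```

Saved under some hypothetical name and run with a directory as its argument, this would print one line per file with the name first. The same re-ordering could presumably be done in Perl or awk on the output of ls -l, at the cost of having to parse the columns.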

Another important thing for me would be to minimize redundancy in spoken output. I don't know about Linux, but on other platforms I've often seen compilers that prepend the full file path to the actual error message and don't let you customize the format. In a case like this, the only option is to listen to the whole path every time or else use some read-from-cursor-to-end-of-line command.

For cases like this and other similar issues, I think an optional redundancy filter might be nice. I suppose this might even be an original idea, as I haven't seen it implemented in any screen reader yet. Basically it should compare the previous and current output line(s) being sent to the screen reader, and kill all alphabetic sub-strings that have the same starting indices in both. Again, this should be very customizable, maybe as a regex, and the above suggestion is only the default.
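
To make the filter idea concrete, here is a minimal sketch in Python. It is not Yasr code and does not use any Yasr interface; the name filter_redundant and the example compiler messages are invented for illustration. It drops every match of a pattern in the current line that also occurs, with the same text and the same starting index, in the previous line. The default pattern matches runs of ASCII letters, and a different regex can be passed in.

```python
import re

# Default unit of comparison: runs of alphabetic characters.
WORD = re.compile(r"[A-Za-z]+")


def filter_redundant(previous, current, pattern=WORD):
    """Remove from `current` every pattern match that also appears in
    `previous` with identical text at the identical starting index."""
    seen = {(m.start(), m.group()) for m in pattern.finditer(previous)}
    pieces = []
    last = 0
    for m in pattern.finditer(current):
        if (m.start(), m.group()) in seen:
            # Keep the text before the repeated match, then skip the match itself.
            pieces.append(current[last:m.start()])
            last = m.end()
    pieces.append(current[last:])
    return "".join(pieces)


if __name__ == "__main__":
    prev = "/home/vp/src/main.c:12: error: missing semicolon"
    curr = "/home/vp/src/main.c:40: error: undeclared identifier"
    # Default: only the repeated alphabetic runs are dropped, punctuation stays.
    print(filter_redundant(prev, curr))
    # Custom pattern: treat anything between whitespace and colons as one unit,
    # so the whole repeated path is dropped at once.
    print(filter_redundant(prev, curr, pattern=re.compile(r"[^\s:]+")))
```

With the default pattern the punctuation of the repeated path is left behind, which a screen reader might still speak, so a coarser pattern like the second one is probably closer to what is wanted for compiler messages. A real filter would also need to handle multi-line output and text that shifts by a column or two, which this sketch ignores.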

How difficult would it be to create a quick-n-dirty Festival build to try this out locally on my own machine? I do know C but I'm not familiar with Linux-specific functions or system calls. I know what CVS is but that's about it. I already downloaded the sources in compressed form and it would appear that they are surprisingly clean and small as there are only a handful of core source files in there.

With kind regards Veli-Pekka Tätilä (vtatila@mail.student.oulu.fi)
Accessibility, game music, synthesizers and more:
