
Re: I'm working on an article



Aaron Chantrill, on Sun. 16 Nov. 2025 18:25:44 -0500, wrote:
> On 11/12/25 19:08, Jason J.G. White wrote:
> > 
> > On 12/11/25 10:17, Aaron Chantrill wrote:
> > > I'm working on an article for Linux Magazine. For this article, I'm
> > > interested in talking about setting up speech dispatcher with
> > > different text to speech engines, like Piper TTS or Coqui TTS. This
> > > is based on a question from this mailing list a couple of months
> > > ago. I'm hoping to start a series on accessibility issues while
> > > deepening my own understanding.
> > 
> > For screen reader users, minimizing audio latency is important.
> > Unfortunately, the neural network-based TTS systems, including Coqui and
> > Piper, have a reputation for producing high latency. This is an important
> > reason why screen reader users tend not to use them.
> > 
> > I don't know whether this is improved if you have appropriate GPU
> > processing for the neural network models. Piper was unusably slow on my
> > machine, but I didn't investigate deeply enough to find out whether it
> > was using the GPU.
> > 
> Piper, when run as a command-line program, is unusably slow because it has
> to load the full ONNX model every time you call it. My goal is to use
> Piper's built-in HTTP server.
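
To illustrate the load-once point above, here is a minimal sketch in Python.
It does not use Piper's API: FakeVoice, speak_cold and WarmSynthesizer are
hypothetical names, and the sleep merely stands in for the cost of loading an
ONNX model (the 0.05 s figure is an assumption, not a measurement).

```python
import time

LOAD_SECONDS = 0.05  # assumed stand-in for model load time, not a measurement


class FakeVoice:
    """Hypothetical stand-in for a TTS voice; this is NOT Piper's API."""

    loads = 0  # counts how many times the "model" has been loaded

    def __init__(self):
        time.sleep(LOAD_SECONDS)  # simulate the expensive model load
        FakeVoice.loads += 1

    def synthesize(self, text):
        # Placeholder "audio": just the encoded text.
        return text.encode("utf-8")


def speak_cold(text):
    """Command-line style: a fresh process loads the model on every call."""
    return FakeVoice().synthesize(text)


class WarmSynthesizer:
    """Server or module style: load once, then reuse for every utterance."""

    def __init__(self):
        self.voice = FakeVoice()

    def speak(self, text):
        return self.voice.synthesize(text)


def timed(fn, texts):
    """Return the wall-clock seconds spent calling fn on each text."""
    start = time.perf_counter()
    for text in texts:
        fn(text)
    return time.perf_counter() - start


if __name__ == "__main__":
    texts = ["one", "two", "three"]
    cold = timed(speak_cold, texts)
    warm_synth = WarmSynthesizer()  # the single load happens here
    warm = timed(warm_synth.speak, texts)
    print(f"cold: {cold:.3f}s, warm: {warm:.3f}s, loads: {FakeVoice.loads}")
```

In the cold path the load cost is paid for every utterance; in the warm path
it is paid once at startup, which is what a long-running server or a speech
dispatcher module achieves.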

Note that there is a module with native support:

https://github.com/brailcom/speechd/blob/master/src/modules/cxxpiper.cpp

This would provide much more flexibility than going through an HTTP server.

Samuel

