Re: Viable speech recognition tools?

On 5/21/21 6:54 AM, Richard Owlett wrote:
> On 05/20/2021 10:25 AM, Aaron wrote: 
> I notice PocketSphinx in the Debian repositories.
>>> I'm assuming training to my voice and speaking style. I want
>>> continuous speech and as large a vocabulary as possible.
>> Kaldi, DeepSpeech and FlashlightASR all recommend Linux for your
>> environment. Kaldi and FlashlightASR are currently aimed at
>> researchers, so they are not easy to set up and the documentation is
>> full of intimidating formulas and technical jargon. Mozilla DeepSpeech
>> is somewhat gentler to work with and seems to have better support, plus
>> it can be installed with a simple "pip install deepspeech". Mozilla
>> DeepSpeech and FlashlightASR both use KenLM language models by default,
>> while Kaldi uses a variety of language models.
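
For anyone who wants to try DeepSpeech after the pip install, here is a
minimal sketch of transcribing one WAV file from Python. It assumes the
0.9.x Python package and that you have separately downloaded an acoustic
model and a scorer (the scorer is the packaged KenLM language model). The
helper names and any file paths are mine, purely for illustration, and
the audio is assumed to be 16-bit mono PCM at the model's sample rate.

```python
import wave


def read_wav_mono16(path):
    """Return (sample_rate, raw_pcm_bytes) for a 16-bit mono WAV file."""
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 1 or wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit mono PCM audio")
        return wav.getframerate(), wav.readframes(wav.getnframes())


def transcribe(wav_path, model_path, scorer_path=None):
    """Transcribe one WAV file with a DeepSpeech acoustic model.

    model_path and scorer_path are whatever files you downloaded;
    nothing here fetches them for you.
    """
    # Imported lazily so the WAV helper above works without these packages.
    import numpy as np
    from deepspeech import Model

    model = Model(model_path)
    if scorer_path is not None:
        # The scorer bundles the KenLM language model used while decoding.
        model.enableExternalScorer(scorer_path)

    rate, pcm = read_wav_mono16(wav_path)
    if rate != model.sampleRate():
        raise ValueError(f"model expects {model.sampleRate()} Hz, got {rate} Hz")

    # stt() takes the samples as a NumPy array of 16-bit integers.
    return model.stt(np.frombuffer(pcm, dtype=np.int16))
```

Calling transcribe("recording.wav", "model.pbmm", "model.scorer") with
your own files should return the recognized text as a string. Leaving
out the scorer still works, but decoding then runs without the language
model.
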
> The links I found seemed to suggest DeepSpeech was aiming at people
> like me. An important feature is that it is open source. However, I
> found an article suggesting Mozilla was winding down its development.
> [
> https://venturebeat.com/2021/04/12/mozilla-winds-down-deepspeech-development-announces-grant-program/
> ]
I wouldn't worry too much about it. Even if they are winding down
development, it sounds like they will continue to be involved with it,
and they have done a truly remarkable job in getting it to the point
where it is now. In a couple of years the theory will probably shift a
lot toward new sorts of neural network structures, so the open source
ASR community will likely have to start over with a new approach at some
point. The big issue for Mozilla is finding a way to pay for continued
development. If people are really using the tool and are aware of it,
that would go a long way towards justifying funding. Also, the lovely
thing about open source is that if people want to continue to use it,
they can maintain and develop it themselves.
>> I'm currently working on a research project where I am trying to compare
>> the current state of different speech recognition engines and classify
>> them according to strengths and weaknesses. If I can be helpful, please
>> let me know.
> Is there a recommended download site which has an associated user
> network, be it mailing list or USENET {preferred}? I find web based
> fora unusable.
I use the GitHub site https://github.com/mozilla/DeepSpeech when I have
questions. The developers have been surprisingly responsive in the past,
which I greatly appreciate. Hopefully the community will continue to be
responsive. I am not aware of any USENET forums for either DeepSpeech or
Automatic Speech Recognition in general, although if you do find one, I
would be interested in hearing about it.
