
Re: Feasibility of speech recognition for note taking on dedicated laptop?



On 11/05/2020 02:08 AM, tomas@tuxteam.de wrote:
On Wed, Nov 04, 2020 at 05:58:25PM -0500, rhkramer@gmail.com wrote:
On Wednesday, November 04, 2020 12:36:51 PM Curt wrote:
Maybe this open source, Java (is that still a thing?) app that runs
on Linux:

http://www.speech.cs.cmu.edu/sphinx/dictator/

Yes, I believe that is it, but maybe I saw an earlier version (although the
web page listed above carries a copyright notice running up to 2006).

One of the things that made me uncomfortable was that it was written in Java,
and I was concerned about the performance.  But, I never did try it.

Looks like it is pretty much dead -- I tried to access the TWiki but was
denied access.

A more recent project seems to be Mozilla Foundation's DeepSpeech [1]

     "DeepSpeech is an open source embedded (offline, on-device)
      speech-to-text engine which can run in real time on devices
      ranging from a Raspberry Pi 4 to high power GPU servers."

(Sorry for linking to Github. OTOH, they seem to have some page for
this, but it's a Javascript-only white hole [2], so I don't know
what's in there)

[1] https://github.com/mozilla/DeepSpeech
[2] https://commonvoice.mozilla.org/

  - t


My impression of the CMU project(s) is that the focus is more on developers of speech-enabled software than on end users of an application.

Initial browsing indicates DeepSpeech will likely be more appropriate.

Some links from [1] state a requirement for JavaScript but display a blank screen even when JavaScript has been enabled. The same is true of [2] itself.

I suspect "browser sniffing", because [https://chat.mozilla.org/#/room/#machinelearning:mozilla.org], which is linked from [https://github.com/mozilla/DeepSpeech/blob/master/SUPPORT.rst], explicitly states:

Your browser can't run Element

Element uses many advanced browser features, some of which are not available or experimental in your current browser.

Please install Chrome, Firefox, or Safari for the best experience.
Use Element on mobile

My current browser is SeaMonkey 2.49.4 [cookies disabled] running on Debian 9. I intend to install Debian 10 on another laptop in the next week and will install a current Firefox to see if that is the only problem. I will also try the public machines at the local library as a double-check.

It may be out-of-date information, but [https://github.com/mozilla/DeepSpeech/wiki#why-cant-i-speak-directly-to-deepspeech-instead-of-first-making-an-audio-recording] says:

Why can't I speak directly to DeepSpeech instead of first making an audio recording?

We are providing inference tools as a way to easily test the system, but building
upon that is open to anyone. Having to deal with more interactive UX is out of the
scope of the current target of those tools.

However, one of the links I followed explicitly referred to doing "real time" speech recognition [it may have been written by an application developer].
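
For what it is worth, the "record first, then transcribe" workflow described in that wiki answer might look roughly like the Python sketch below. This is untested against a real model and rests on assumptions: that the released DeepSpeech models want 16 kHz, mono, 16-bit WAV input, that the `deepspeech` Python package (0.7 or later) and NumPy are installed, and that `model_path` points at a model file you have downloaded yourself. The function names are my own, not anything from the project.

```python
import sys
import wave
from array import array


def load_wav_for_stt(path, expected_rate=16000):
    """Read a WAV recording and return its samples as signed 16-bit integers.

    Raises ValueError if the file is not mono, 16-bit, at expected_rate.
    """
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 1:
            raise ValueError("expected mono audio")
        if wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit samples")
        if wav.getframerate() != expected_rate:
            raise ValueError(
                "expected %d Hz, got %d Hz" % (expected_rate, wav.getframerate())
            )
        frames = wav.readframes(wav.getnframes())

    samples = array("h")
    samples.frombytes(frames)
    # WAV data is little-endian; swap bytes on big-endian machines.
    if sys.byteorder == "big":
        samples.byteswap()
    return samples


def transcribe(path, model_path):
    """Transcribe a recording with DeepSpeech (assumed API, see note above)."""
    # Imported here so the WAV check above works without DeepSpeech installed.
    import numpy as np
    from deepspeech import Model

    model = Model(model_path)
    samples = load_wav_for_stt(path, expected_rate=model.sampleRate())
    return model.stt(np.array(samples, dtype=np.int16))
```

A note-taking session would then be: record a WAV with any recorder, call transcribe() on it, and paste the returned text into the notes. Anything more interactive would have to be built on top, as the wiki says.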

More later.
Thank you.





