Seeking information about Web Speech API

To: debian-accessibility@lists.debian.org
Subject: Seeking information about Web Speech API
From: Aaron Chantrill <aaron.chantrill@dottywood.org>
Date: Tue, 1 Oct 2024 13:43:21 -0400
Message-id: <71392d9f-d874-4ac5-ae09-2e07d58cd493@dottywood.org>

I recently discovered the Web Speech API (https://wicg.github.io/speech-api/), which has apparently been implemented in both Chrome and Firefox. In fact, this appears to have been the impetus behind the Mozilla DeepSpeech project (later renamed STT, then spun off into Coqui).

The idea appears to be to allow websites to use the JSpeech Grammar Format to design telephone-tree-style interfaces for voice navigation.
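
To make the grammar idea concrete, here is a minimal sketch of building a one-rule JSpeech Grammar Format string for a small voice menu. The function name, grammar name, rule name and menu entries are all made up for illustration; only the overall "#JSGF V1.0; grammar ...; public <rule> = a | b;" layout follows what the MDN demo uses.

```typescript
// Build a one-rule JSGF grammar string from a list of spoken alternatives.
// buildJsgfGrammar and its arguments are hypothetical names for this sketch.
function buildJsgfGrammar(
  grammarName: string,
  ruleName: string,
  alternatives: string[],
): string {
  if (alternatives.length === 0) {
    throw new Error("a JSGF rule needs at least one alternative");
  }
  // Alternatives are separated by "|" inside a single public rule.
  const body = alternatives.map((a) => a.trim()).join(" | ");
  return `#JSGF V1.0; grammar ${grammarName}; public <${ruleName}> = ${body};`;
}

const menuGrammar = buildJsgfGrammar("menu", "option", [
  "billing",
  "support",
  "opening hours",
]);
console.log(menuGrammar);
// prints: #JSGF V1.0; grammar menu; public <option> = billing | support | opening hours;
```

In a browser that exposes the API, a string like this would be handed to SpeechGrammarList.addFromString(), which is what the MDN demo does with its list of colors.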

The current implementation seems completely broken on Firefox. There is a demo page at https://mdn.github.io/dom-examples/web-speech-api/speech-color-changer/ which currently works in Chrome, but in Firefox you have to go into about:config and enable both media.webspeech.recognition.enable and media.webspeech.recognition.force_enable to get it to work at all, and even then the only result you get is "Error occurred in recognition: network". Apparently Firefox was using the Google Cloud STT service, but that endpoint appears to have been shut down, along with the Mozilla DeepSpeech test endpoint.
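
For anyone who wants to reproduce this, the following is a rough sketch of the kind of feature detection and error handling involved. It assumes the constructor is exposed as either SpeechRecognition or the prefixed webkitSpeechRecognition; the helper names (findRecognitionCtor, startListening) and the trimmed-down RecognitionLike interface are mine, not part of the API.

```typescript
// Minimal shape of the recognition object used in this sketch;
// the real interface has many more members.
interface RecognitionLike {
  lang: string;
  onresult: ((event: any) => void) | null;
  onerror: ((event: { error: string }) => void) | null;
  start(): void;
}
type RecognitionCtor = new () => RecognitionLike;

// Return the recognition constructor if the given scope exposes one.
function findRecognitionCtor(scope: any): RecognitionCtor | undefined {
  return scope.SpeechRecognition ?? scope.webkitSpeechRecognition;
}

// Start one recognition session; returns false if the API is missing.
function startListening(
  scope: any,
  onText: (text: string) => void,
  onFailure: (reason: string) => void,
): boolean {
  const Ctor = findRecognitionCtor(scope);
  if (!Ctor) {
    onFailure("unsupported");
    return false;
  }
  const recognition = new Ctor();
  recognition.lang = "en-US";
  // The first alternative of the first result carries the transcript.
  recognition.onresult = (event) => onText(event.results[0][0].transcript);
  // event.error is a short code such as "network" or "not-allowed".
  recognition.onerror = (event) => onFailure(event.error);
  recognition.start();
  return true;
}

startListening(
  globalThis,
  (text) => console.log("heard:", text),
  (reason) => console.log("recognition failed:", reason),
);
```

Where neither constructor exists, the failure callback receives "unsupported". In the Firefox case described above, the constructor is present once the preferences are enabled, so the failure would instead arrive through onerror with the code "network".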

Firefox used to have a setting, media.webspeech.service.endpoint, which could be used to set a specific back-end server, but this appears to have been removed, so now that the default endpoint no longer works the whole API is basically useless.

On Chrome, of course, you just have to trust whatever endpoint they are using. I highly doubt they are doing STT on-device, so it must be going to a service somewhere, probably still Google Cloud STT.

Does anyone know anything about the current state of this technology? I'm working on a library using WebRTC to implement STT in the browser, using AJAX to contact a back-end VOSK or Whisper service, but if anyone is already working on something like this, I'd like to know before I get too involved.
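
To give an idea of the client side of such a library, here is a rough sketch of posting recorded audio to a self-hosted STT service. Everything specific in it is an assumption for illustration: the endpoint URL, the "lang" query parameter, the WAV upload and the { "text": ... } JSON reply are placeholders, not the actual VOSK or Whisper interfaces, and it uses a fetch-style call rather than XMLHttpRequest. Audio capture is left out.

```typescript
// Hypothetical reply shape from the back end: { "text": "..." }.
function parseTranscript(payload: unknown): string {
  if (typeof payload === "object" && payload !== null) {
    const text = (payload as { text?: unknown }).text;
    if (typeof text === "string") {
      return text.trim();
    }
  }
  throw new Error("response did not contain a transcript");
}

interface SttRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: Uint8Array };
}

// Describe the HTTP request without sending it, so it is easy to inspect.
function buildSttRequest(
  endpoint: string,
  wavBytes: Uint8Array,
  language: string,
): SttRequest {
  return {
    url: `${endpoint}?lang=${encodeURIComponent(language)}`,
    init: {
      method: "POST",
      headers: { "Content-Type": "audio/wav" },
      body: wavBytes,
    },
  };
}

// Minimal fetch-like signature, so a stub can stand in for the network.
type FetchLike = (
  url: string,
  init: SttRequest["init"],
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

async function transcribe(
  fetchFn: FetchLike,
  endpoint: string,
  wavBytes: Uint8Array,
  language: string = "en-US",
): Promise<string> {
  const request = buildSttRequest(endpoint, wavBytes, language);
  const response = await fetchFn(request.url, request.init);
  if (!response.ok) {
    throw new Error(`STT service returned HTTP ${response.status}`);
  }
  return parseTranscript(await response.json());
}
```

Passing the fetch function in as a parameter keeps the endpoint under the page author's control, which is the property that was lost when the Firefox endpoint setting went away.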


Thanks!
