Bug#1034091: RFP: whisper -- Robust Speech Recognition via Large-Scale Weak Supervision
- To: 1034091@bugs.debian.org
- Subject: Bug#1034091: RFP: whisper -- Robust Speech Recognition via Large-Scale Weak Supervision
- From: Petter Reinholdtsen <pere@hungry.com>
- Date: Thu, 15 Feb 2024 20:00:38 +0100
- Message-id: <sa6mss1wo15.fsf@hjemme.reinholdtsen.name>
- Reply-to: Petter Reinholdtsen <pere@hungry.com>, 1034091@bugs.debian.org
- In-reply-to: <sa6fs6kkexz.fsf@hjemme.reinholdtsen.name>
- References: <sa67cumhr96.fsf@hjemme.reinholdtsen.name> <sa6edop1ix0.fsf@hjemme.reinholdtsen.name> <sa6pm83vbbg.fsf@hjemme.reinholdtsen.name> <sa6v8husa2w.fsf@hjemme.reinholdtsen.name> <sa6fs6kkexz.fsf@hjemme.reinholdtsen.name> <sa6jzymicr5.fsf@hjemme.reinholdtsen.name>
I just came across the article "Whispering in Norwegian: Navigating
Orthographic and Dialectic Challenges" by Per E. Kummervold, Javier de la
Rosa, Freddy Wetjen, Rolv-Arild Braaten and Per Erik Solberg,
<URL:https://arxiv.org/pdf/2402.01917.pdf>.
I found this quote particularly interesting:
  Although the original PyTorch training code was not released by
  OpenAI, a collaborative effort with HuggingFace led to an alternative
  implementation in the Transformers library. This has also been
  adapted for Jax. The project participated in developing and
  open-sourcing training scripts for TPU-v4-pods, enabling dynamic
  changes to the training data during runtime (The National Library of
  Norway, 2024).
The reference points to <URL: https://www.github.com/NbAiLab/nostram >.
I have not investigated further. Perhaps the alternative implementation
can be used to make a model from scratch and provide source for the
files requested by the ftpmasters?
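To illustrate the idea, here is a minimal sketch of instantiating a Whisper model from scratch with the Transformers implementation, i.e. with randomly initialised weights rather than OpenAI's released checkpoints. It assumes the transformers and torch Python packages are installed, and the config values are illustrative, not the ones OpenAI used:

```python
# Sketch: a Whisper model built from the HuggingFace Transformers
# implementation without any of OpenAI's pretrained weight files.
# The hyperparameters below are illustrative, chosen small to run fast.
from transformers import WhisperConfig, WhisperForConditionalGeneration

config = WhisperConfig(
    vocab_size=51865,   # Whisper's multilingual vocabulary size
    d_model=384,        # small hidden size, for illustration only
    encoder_layers=4,
    decoder_layers=4,
)
model = WhisperForConditionalGeneration(config)

# The model starts untrained; producing a usable checkpoint would
# require a speech corpus and training scripts such as those the
# article points to.
n_params = sum(p.numel() for p in model.parameters())
print(f"Randomly initialised Whisper model with {n_params} parameters")
```

Whether such a from-scratch training run could reach useful quality on free data is of course the open question for the ftpmasters' source requirement.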
Unrelated to this, there is an alternative implementation using the
whisper models, called whisper.cpp, available from
<URL: https://github.com/ggerganov/whisper.cpp.git >. It might be
easier to package than the OpenAI whisper implementation.
--
Happy hacking
Petter Reinholdtsen