Re: whisper-cpp with AMD GPUs (Was: llama-cpp with AMD GPUs)

To: debian-ai@lists.debian.org
Subject: Re: whisper-cpp with AMD GPUs (Was: llama-cpp with AMD GPUs)
From: Petter Reinholdtsen <pere@hungry.com>
Date: Fri, 16 Feb 2024 11:17:11 +0100
Message-id: <sa634tsvhlk.fsf@hjemme.reinholdtsen.name>
In-reply-to: <7002cf70-632d-431b-9045-da9582871abf@slerp.xyz>
References: <39620f27-dbd1-4756-ab3d-e7464712b880@slerp.xyz> <24d1c8ab-62cf-4632-afc6-6a264dfbba5b@debian.org> <30f19094-2e83-4953-97f0-abcaa2ce6333@slerp.xyz> <sa6zfw82cyb.fsf@hjemme.reinholdtsen.name> <d928eda2-d831-e785-7745-0207c3895280@slerp.xyz> <7002cf70-632d-431b-9045-da9582871abf@slerp.xyz>

[Cordell Bloor]
> It seems to work fairly well on my Radeon VII.

It worked on one of my test machines too, using a GeForce GT 755M and
the OpenCL backend.  I built using

  cmake -S . -B build -DWHISPER_CLBLAST=ON \
    -DCMAKE_BUILD_TYPE=Release

I could not get the CUDA stuff working, not sure why.

I am also not sure how much the GPU is used, but it does print out
these lines when running, at least:

  ggml_opencl: selecting platform: 'NVIDIA CUDA'
  ggml_opencl: selecting device: 'NVIDIA GeForce GT 755M'

The wall-clock time spent transcribing the jfk.wav sample is 34.8s with
OpenCL support compiled in, and 41.2s using only the CPU, so I guess it
has some effect (15.5% less time spent).
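
For anyone wanting to compare their own timings the same way, the
percentage follows directly from the two measurements.  A minimal
Python sketch of the arithmetic (the function and variable names are
only illustrative, not part of whisper.cpp):

```python
def time_reduction_percent(baseline_s: float, accelerated_s: float) -> float:
    """Percentage of wall-clock time saved relative to the baseline run."""
    return (baseline_s - accelerated_s) / baseline_s * 100.0


cpu_s = 41.2     # CPU-only run of jfk.wav, from the measurement above
opencl_s = 34.8  # same sample with OpenCL support compiled in

saved = time_reduction_percent(cpu_s, opencl_s)
speedup = cpu_s / opencl_s  # the same comparison expressed as a ratio

print(f"{saved:.1f}% less time spent, {speedup:.2f}x speedup")
```

Note that this is a single run on one sample, so the number says
little about variance between runs.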

Perhaps someone should set up a project to transcribe all Debian videos
using Whisper, to provide searchable text for each DebConf presentation
and other talks.

Perhaps <URL: https://bugs.debian.org/1034091 > is better solved using
whisper.cpp?

-- 
Happy hacking
Petter Reinholdtsen
