Re: bi-weekly update
Hi Kohei,
On 7/13/25 10:15 AM, 千代航平 wrote:
Sorry for the slight delay.
The big progress is that I managed to build the CPU version of vllm, and it works!
I'm really happy about this.
Wonderful! I'll continue reviewing your pending packages and providing
feedback.
We can upload the CPU version first and get it through the NEW queue. That
way, vllm-cpu and vllm-cuda can be organized the same way src:pytorch and
src:pytorch-cuda are.
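For illustration, a minimal sketch of how the split could look in
debian/control, loosely following the src:pytorch / src:pytorch-cuda
arrangement. All fields and package names here are assumptions for the sake
of the example, not the actual packaging:

```
# Hypothetical src:vllm (CPU build) -- illustrative only
Source: vllm
Section: python
Priority: optional

Package: python3-vllm
Architecture: any
Depends: python3-torch, ${misc:Depends}, ${python3:Depends}
Description: high-throughput LLM inference engine (CPU build)

# Hypothetical src:vllm-cuda (GPU build) -- illustrative only
Source: vllm-cuda
Section: contrib/python
Priority: optional

Package: python3-vllm-cuda
Architecture: amd64
Depends: python3-torch-cuda, ${misc:Depends}, ${python3:Depends}
Conflicts: python3-vllm
Provides: python3-vllm
Description: high-throughput LLM inference engine (CUDA build)
```

The Conflicts/Provides pair is just one possible way to make the two builds
mutually exclusive while satisfying reverse dependencies; check how
src:pytorch-cuda actually does it before copying this.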
Of course, my next goal is the GPU version.
However, I'm struggling with ray and xformers.
Is ray a mandatory dependency?
First, ray is built with Bazel. It has strict version requirements on
Bazel itself, and it downloads source code from the Internet during the
build. Copying the code by hand might not be a big issue, but the Bazel
version requirement is critical for me. I've been struggling with this
problem for almost the whole week....
Bazel's design principles do not fit well in Debian's context. It is
better suited to source-based distributions, or to distributions that do
not care much about the Unix filesystem hierarchy, etc. Updating bazel
alone could already be a new GSoC project. If possible, we'd better avoid
packages that depend on bazel, and avoid touching bazel at all, for the
current project.
Also, xformers depends on flash-attention:
https://github.com/Dao-AILab/flash-attention/tree/3ba6f826b199ff68aa9e9139a46280160defa5cd.
I think I need to build flash-attn first, but is that right? I have
experience using flash-attn, and I know it is sensitive to the torch and
CUDA versions.
The dependency on torch-cuda should not be a problem. What you need should
be just
libtorch-cuda-dev, python3-torch-cuda, nvidia-cuda-toolkit-gcc
Do you mean flash-attn has any particular version requirement on torch/cuda?
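To make that concrete, here is a hypothetical Build-Depends stanza for a
future src:flash-attn package. Only the three packages named above come
from this thread; the rest (debhelper, dh-python, the pybuild plugin) are
my assumptions about a typical Python package build and would need checking:

```
# Illustrative sketch only -- verify names and versions in the archive
Source: flash-attn
Build-Depends: debhelper-compat (= 13),
               dh-python,
               libtorch-cuda-dev,
               python3-torch-cuda,
               nvidia-cuda-toolkit-gcc,
               pybuild-plugin-pyproject,
               python3-setuptools
```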
Thanks to your help, I was able to upload some packages, and I will upload more.
We discussed the issues at
https://salsa.debian.org/k1000dai/gsoc-status/-/issues and the next
package might be ndarray. ndarray depends on
https://packages.debian.org/search?keywords=librust-portable-atomic-util-dev
which the Rust team has uploaded and which has been accepted into
experimental. I think it is ready to upload.
I'll continue processing the queue when I'm available.
Also, I will send an MR to llama.cpp for packaging gguf-py.
Great! Christian can help merge any additional modifications you may have.
I know you are busy, but I hope I can get some advice on handling ray
and xformers.
As suggested above. Don't worry; bazel is hard for everyone. You are
making good progress.
Please keep up the good work!
Regards.
-- 
kouhei.sendai@gmail.com
Kohei Sendai