Bi-weekly update

Sorry for the slight delay.

The big progress is that I was able to build the CPU version of vLLM, and it works!

I'm really happy about this.

Of course, my next goal is the GPU version.

However, I am struggling with ray and xformers.

First, ray is built with Bazel. However, it has strict version dependencies on Bazel itself, and it downloads source code from the Internet during the build. Copying that code in by hand might not be a big issue, but the Bazel version requirement is critical for me. I have been struggling with this problem for almost the whole week.

Also, xformers depends on flash-attention (https://github.com/Dao-AILab/flash-attention/tree/3ba6f826b199ff68aa9e9139a46280160defa5cd). I think I need to build flash-attn first; is that right? I have experience using flash-attn, and I know it depends on the torch and CUDA versions.

Thanks to your help, I was able to upload some packages, and I will upload more.

We discussed the issues at https://salsa.debian.org/k1000dai/gsoc-status/-/issues, and the next package might be ndarray. ndarray depends on https://packages.debian.org/search?keywords=librust-portable-atomic-util-dev, which the Rust team has uploaded and which has been accepted into experimental. I think ndarray is now ready to upload.

Also, I will send an MR to llama.cpp for packaging gguf-py.

I know you are busy, but I hope I can get some advice on handling ray and xformers.

Regards.

-----------------------------------------------------------------------------------------------------

kouhei.sendai@gmail.com

Kohei Sendai

-------------------------------------------------------------------------------------------