Re: GSoC update.
Hi Kohei,
This is fantastic news. You have achieved the most important milestone of the project!
I'd suggest a simple experiment, based on the current vllm-gpu package, to verify the build:
1. Benchmark our own packaged vllm-gpu on a small model (so it fits in VRAM) and
get a rough output tokens per second figure (a sketch follows below).
2. Benchmark the upstream vLLM binary release (from PyPI) using the same small model,
the same GPU, and the same NVIDIA driver, and get the same output tokens per second figure.
3. See if the speeds match. If they don't, we may have missed something.
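
For steps 1 and 2, a minimal sketch like this, using vLLM's Python API
and run once against each install, should be enough. The model name and
prompts here are just placeholders; pick any small model that fits your
VRAM:

    import time
    from vllm import LLM, SamplingParams

    # Placeholder small model; use whatever fits your VRAM.
    llm = LLM(model="facebook/opt-125m")
    params = SamplingParams(temperature=0.8, max_tokens=256)
    prompts = ["Explain how a CPU cache works."] * 32

    start = time.perf_counter()
    outputs = llm.generate(prompts, params)
    elapsed = time.perf_counter() - start

    # Count generated tokens across all requests.
    total = sum(len(o.outputs[0].token_ids) for o in outputs)
    print(f"{total / elapsed:.1f} output tokens/s")

As long as the model, sampling parameters, GPU and driver are identical,
the two numbers should be within noise of each other.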
I'll do my part of the reviewing and uploading for the remaining packages.
On Wed, 2025-08-27 at 15:06 +0900, 千代航平 wrote:
> I successfully built the vllm-cuda version and XFormers for the vLLM backend!!
>
> Now I will test vLLM in various settings and run more tests with different GPU configurations.
>
> Please let me know if you have any concerns, or anything I need to do to finish packaging such a big project.
>
>
> Also, I am now trying to update numpy and tokenizers with PyO3 0.25.
>
> Thanks for the big help!
>
> Regards.
> ------------------------------------------------------
> kouhei.sendai@gmail.com
> Kohei Sendai
> ------------------------------------------------------