Thanks for the clarification.
FWIW, the vLLM community recently proposed a refactoring called vLLM V1 [1, 2]. If possible, I think we can work directly toward the new codebase. Besides, SGLang [3] is another popular serving framework for LLMs, and we may consider including it if applicable.
Best,
Xuanteng