
vLLM — User Demand Report

Week: 2026-W15 · Generated: 2026-04-06 · Issues analyzed: 35 (35 included) · Demand clusters: 1

Top 10 User Demands

| Rank | Demand | Issues | Score | Category | Example Issues |
|------|--------|--------|-------|----------|----------------|
| 1 | MoE Performance, Quantization, and Backend Stability Fixes | 35 | 4.5 | Performance | #39060, #39030, #39025 |

Fastest-Rising Demands

| Demand | Rise | This Week | Category |
|--------|------|-----------|----------|
| MoE Performance, Quantization, and Backend Stability Fixes | 36.0x | 35 | Performance |

Category Distribution

  • Performance: 1 cluster

All Demand Clusters

1. MoE Performance, Quantization, and Backend Stability Fixes

Users are reporting critical issues with Mixture of Experts (MoE) models: significant decode throughput regressions, quantization-related accuracy problems with newer models such as Gemma 4 and Qwen3, and CUDA/ROCm backend stability issues that cause crashes and hangs. Resolving these is essential for running large-scale MoE deployments reliably and efficiently.

  • Count: 35 issues (31 open, 4 closed)
  • Demand score: 4.5
  • Avg reactions: 0.1 | Avg comments: 1.3
  • Example issues: #39060, #39030, #39025, #39010, #39004

This report analyzes only public GitHub Issues; it reflects demand signals from public discussion, not the voice of all users.

Generated by ReadYourUsers