Open
Overdue by 9 day(s)
Due by June 26, 2026
Last updated Jun 26, 2026

Planned work: hardening the OpenAI-compatible audio endpoints (TTS + STT, MLX FFI and panic safety); new models (Mellum 2) and Gemma 4 video input; a ComputeBackend abstraction seam; MoE and dense decode performance gaps vs mlx-lm; disaggregated-serving expansion (/v1/completions, multi-node routing and failover); and native Windows + CUDA builds with packaging hardening.

72% complete
