nvidia/Qwen3-8B-DMS-8x

8x KV cache compression without quality degradation. Ideal for inference-time scaling.
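
To give a rough sense of what an 8x reduction means in memory terms, here is a back-of-the-envelope sketch. The layer count, KV head count, head dimension, and sequence length are illustrative assumptions and are not taken from this card. The helper `kv_cache_bytes` is hypothetical, and it assumes the 8x ratio applies uniformly to the whole cache footprint.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len,
                   bytes_per_elem=2, compression=1):
    """Estimate KV cache size in bytes for one sequence.

    Each layer stores one K and one V tensor of shape
    (num_kv_heads, seq_len, head_dim). `compression` divides the total,
    which assumes the ratio applies evenly across the cache.
    """
    full = 2 * num_layers * num_kv_heads * head_dim * seq_len * bytes_per_elem
    return full // compression


# Assumed, illustrative configuration (check the model's config for real values).
cfg = dict(num_layers=36, num_kv_heads=8, head_dim=128)
seq_len = 32_768  # assumed context length, 16-bit cache entries

baseline = kv_cache_bytes(seq_len=seq_len, **cfg)
compressed = kv_cache_bytes(seq_len=seq_len, compression=8, **cfg)

GIB = 1024 ** 3
print(f"uncompressed: {baseline / GIB:.2f} GiB")
print(f"8x compressed: {compressed / GIB:.2f} GiB")
```

Under these assumptions, the memory freed per sequence could go toward longer generations or more parallel samples, which is the sense in which cache compression helps inference-time scaling.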