Projects

allenai/Bolmo-7B

State-of-the-art, fully open-source large language model with latent tokenization. Available in 1B and 7B sizes.

nvidia/Qwen3-8B-DMS-8x

8x KV cache compression without quality degradation. Ideal for inference-time scaling.