Qwen3.5 35B A3B (GGUF)(Q4_K_M)
Before Gemma 4 came out, I mainly used this model. But since Gemma 4's release, it has had my full attention.
README
Qwen3.5 represents a significant leap forward, integrating breakthroughs in multimodal learning, architectural efficiency, reinforcement learning scale, and global accessibility. It has 35B total parameters and 3B activated, supporting a native context length of 262,144 tokens.
Highlights
- Unified Vision-Language Foundation – Early-fusion training on multimodal tokens achieves cross-generational parity with Qwen3 and outperforms Qwen3-VL models across reasoning, coding, agent, and visual-understanding benchmarks.
- Efficient Hybrid Architecture – Gated Delta Networks combined with a sparse Mixture-of-Experts (256 total experts, 8 routed + 1 shared active) deliver high-throughput inference with minimal latency and cost overhead.
- Scalable RL Generalization – Reinforcement learning scaled across million-agent environments with progressively complex task distributions for robust real-world adaptability.
- Global Linguistic Coverage – Expanded support to 201 languages and dialects, enabling inclusive, worldwide deployment with nuanced cultural and regional understanding.
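The "8 routed + 1 shared active" expert scheme above is a top-k gating pattern. Below is a minimal, illustrative sketch of that routing step in plain Python; the function name and routing details (softmax over the selected experts, no load-balancing loss, Gated Delta layers omitted) are assumptions for illustration, not Qwen's actual implementation:

```python
import math
import random

def route_token(logits, n_routed=8):
    """Select the top-k routed experts for one token and softmax their gates.

    `logits` holds one router score per expert (256 here). A shared expert
    is always active in addition, so 8 routed + 1 shared = 9 experts fire
    per token. Illustrative sketch only -- not the real Qwen router.
    """
    topk = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:n_routed]
    m = max(logits[i] for i in topk)
    exps = [math.exp(logits[i] - m) for i in topk]   # numerically stable softmax
    total = sum(exps)
    gates = [e / total for e in exps]
    return topk, gates

random.seed(0)
logits = [random.gauss(0.0, 1.0) for _ in range(256)]  # one score per expert
experts, gates = route_token(logits)
print(len(experts) + 1, round(sum(gates), 6))  # 9 active experts; gates sum to 1.0
```

Because only 9 of 256 experts run per token, most of the 35B parameters stay idle on any given forward pass, which is how the model activates only about 3B parameters.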
Gemma 4 26B A4B (GGUF)(Q4_K_M and Q6_K)
README
Gemma is a family of open models built by Google DeepMind. Gemma 4 models are multimodal, handling text and image input (with audio support on the small models) and generating text output.
Gemma 4 introduces key capability and architectural advancements:
- Reasoning – All models in the family are designed as highly capable reasoners, with configurable thinking modes.
- Extended Multimodality – Processes text and image input, with support for variable aspect ratios and resolutions.
- Diverse & Efficient Architectures – Offers Dense and Mixture-of-Experts (MoE) variants of different sizes for scalable deployment.
- Optimized for On-Device – Smaller models are specifically designed for efficient local execution on laptops and mobile devices.
- Increased Context Window – The small models feature a 128K context window, while the medium models support 256K.
- Enhanced Coding & Agentic Capabilities – Achieves notable improvements in coding benchmarks alongside native function-calling support, powering highly capable autonomous agents.
- Native System Prompt Support – Gemma 4 introduces native support for the system role, enabling more structured and controllable conversations.
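Since these are GGUF files, the native system role and function calling can be exercised through an OpenAI-compatible endpoint such as llama.cpp's `llama-server`. A minimal sketch of such a request payload follows; the model id and the `get_weather` tool are illustrative assumptions, not official names:

```python
import json

# Sketch of an OpenAI-style chat request for a locally served Gemma 4 GGUF
# (e.g. via llama.cpp's llama-server). Model id and tool are hypothetical.
payload = {
    "model": "gemma-4-26b-a4b",
    "messages": [
        # Native system-role support: instructions get a dedicated turn
        # instead of being folded into the first user message.
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "What's the weather in Reykjavik?"},
    ],
    # Native function calling: declare tools the model may invoke.
    "tools": [{
        "type": "function",
        "function": {
            "name": "get_weather",  # hypothetical tool for illustration
            "description": "Look up current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }],
}
body = json.dumps(payload)  # POST this to the server's /v1/chat/completions
```

When the model decides to call the tool, the response carries a structured tool call rather than plain text, which the client executes and feeds back as a `tool`-role message.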