
Model Catalog

New & noteworthy local models you can run on your own machine.

Ministral 3
3B
8B
14B
The Ministral 3 series is available in three model sizes: 3B, 8B, and 14B parameters, and delivers a best-in-class cost-to-performance ratio.
55.1K
22
6
Updated 2 days ago
Qwen3 Next
80B
An 80B high-sparsity Mixture-of-Experts model (3B active) with a hybrid attention architecture.
16.1K
14
Updated 3 days ago
Olmo 3
7B
32B
Olmo 3 is a family of open language models designed to enable the science of language models.
20.6K
16
3
Updated 14 days ago
olmOCR 2
7B
The olmOCR 2 model is a Vision Language Model (VLM) from Allen AI.
23.6K
7
Updated 15 days ago
minimax-m2
230B
MiniMax M2 is a 230B MoE (10B active) model built for coding and agentic workflows.
21.6K
16
Updated 29 days ago
gpt-oss-safeguard
20B
120B
gpt-oss-safeguard-20b and gpt-oss-safeguard-120b are open safety models from OpenAI, built on gpt-oss and trained to classify text content according to customizable policies.
6.7K
19
2
Updated 1 month ago
Qwen3-VL
2B
4B
8B
30B
32B
Qwen's latest vision-language model. Includes comprehensive upgrades to visual perception, spatial reasoning, and image understanding.
364K
51
5
Updated 1 month ago
Granite 4.0
3B
7B
32B
Granite 4.0 language models are lightweight, state-of-the-art open models that natively support multilingual capabilities, coding tasks, RAG, tool use, and JSON output.
48.5K
34
4
Updated 1 month ago
seed-oss
36B
Advanced reasoning model from ByteDance with flexible "thinking budget" control and the ability to reflect on the length of its own reasoning.
39K
19
Updated 1 month ago
Qwen3
4B
30B
235B
The latest version of the Qwen3 model family, featuring 4B, 30B, and 235B dense and MoE models in both thinking and non-thinking variants.
317.4K
109
6
Updated 1 month ago
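Since the catalog lists both thinking and non-thinking variants, a minimal sketch of Qwen3's documented soft switch may be useful: per Qwen's usage notes, appending `/no_think` to a user turn suppresses the reasoning block (the message shape below is an assumption about a generic OpenAI-compatible chat format, not a specific client library):

```python
def build_messages(prompt: str, thinking: bool = True) -> list:
    """Build a chat message list for Qwen3, using the /no_think soft
    switch to suppress the model's reasoning block when disabled."""
    # Qwen3 honors "/think" and "/no_think" tags at the end of a user turn.
    tag = "" if thinking else " /no_think"
    return [{"role": "user", "content": prompt + tag}]

# Ask for a direct answer with no reasoning trace:
msgs = build_messages("What is the capital of France?", thinking=False)
```

The same switch can be flipped per turn, so a chat client can keep thinking enabled by default and disable it only for quick lookups.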
gpt-oss
20B
120B
OpenAI's first open-weight LLM. Comes in two sizes: 20B and 120B. Supports configurable reasoning effort (low, medium, high) and is trained for tool use. Apache 2.0 licensed.
1M
226
2
Updated 1 month ago
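The configurable reasoning effort mentioned above is set through the system prompt in gpt-oss's chat format. A minimal sketch of building such a request for an OpenAI-compatible local server follows; the model id and the `Reasoning: <level>` convention are assumptions drawn from the gpt-oss chat format, so check your server's docs before relying on them:

```python
def build_request(prompt: str, effort: str = "medium") -> dict:
    """Build a chat-completions payload asking gpt-oss for a given
    reasoning effort ("low", "medium", or "high")."""
    if effort not in ("low", "medium", "high"):
        raise ValueError(f"unsupported reasoning effort: {effort!r}")
    return {
        "model": "openai/gpt-oss-20b",  # assumed local model identifier
        "messages": [
            # gpt-oss reads its reasoning-effort setting from the system prompt
            {"role": "system", "content": f"Reasoning: {effort}"},
            {"role": "user", "content": prompt},
        ],
    }

# Higher effort trades latency for longer internal reasoning:
payload = build_request("Prove that sqrt(2) is irrational.", effort="high")
```

The payload can then be POSTed to any OpenAI-compatible `/v1/chat/completions` endpoint exposed by a local runtime.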
Qwen3-Coder
30B
480B
State-of-the-art, Mixture-of-Experts local coding model with native support for 256K context length. Available in 30B (3B active) and 480B (35B active) sizes.
186.5K
81
2
Updated 1 month ago
Ernie-4.5
21B
Medium-sized Mixture-of-Experts model from Baidu's new Ernie 4.5 line of foundation models.
13.3K
8
Updated 1 month ago
LFM2
350M
700M
1.2B
LFM2 is a new generation of hybrid models developed by Liquid AI, specifically designed for edge AI and on-device deployment. It sets a new standard in terms of quality, speed, and memory efficiency.
51.2K
36
3
Updated 1 month ago
devstral
23.6B
24B
Devstral is a coding model from Mistral AI. It excels at using tools to explore codebases, editing multiple files, and powering software engineering agents.
71.5K
30
2
Updated 1 month ago
gemma-3n
4.5B
6.9B
Gemma 3n is a generative AI model optimized for use in everyday devices, such as phones, laptops, and tablets.
150.1K
65
2
Updated 1 month ago
Mistral Small
24B
Mistral Small is a 'knowledge-dense' 24B multi-modal (image input) local model that supports up to a 128K token context length.
63.2K
17
Updated 1 month ago
Magistral
23.6B
24B
Mistral AI's open-weight reasoning model: a 24B dense transformer supporting up to a 128K token context window. The model produces long reasoning traces before providing answers.
130.8K
44
2
Updated 1 month ago
mistral-nemo
12B
General-purpose dense transformer designed for multilingual use cases. Built by Mistral AI in collaboration with NVIDIA.
22.7K
2
Updated 1 month ago
qwen2.5-vl
3B
7B
32B
72B
Qwen2.5-VL is a performant vision-language model capable of recognizing common objects and text. Supports a context length of 128K tokens and a variety of human languages.
64.4K
17
4
Updated 1 month ago
gemma-3
270M
1B
4B
12B
27B
State-of-the-art image + text input models from Google, built from the same research and technology used to create the Gemini models.
680.4K
93
5
Updated 1 month ago
phi-4-reasoning
3.8B
14.7B
Phi-4-mini-reasoning is a lightweight open model built upon synthetic data, with a focus on high-quality, reasoning-dense data.
106K
26
3
Updated 1 month ago
phi-4
3B
14B
phi-4 is a state-of-the-art open model built upon a blend of synthetic datasets, data from filtered public domain websites, and acquired academic books and Q&A datasets.
20.1K
7
2
Updated 1 month ago
Codestral
22B
Mistral AI's latest coding model, Codestral can handle both instructions and code completions with ease in over 80 programming languages.
30.2K
17
Updated 1 month ago
Mistral
7B
One of the most popular open-source LLMs, Mistral's 7B Instruct model's balance of speed, size, and performance makes it a great general-purpose daily driver.
70.7K
31
Updated 1 month ago
Qwen3 (1st Generation)
4B
8B
14B
30B
32B
235B
The first batch of Qwen3 models (Qwen3-2504), a collection of dense and MoE models ranging from 4B to 235B. These are general purpose models that score highly on benchmarks.
333K
37
6
Updated 1 month ago
deepseek-r1
7B
8B
14B
32B
70B
Distilled version of the DeepSeek-R1-0528 model, created by continuing post-training on the Qwen3 8B Base model with Chain-of-Thought (CoT) data from DeepSeek-R1-0528.
432.2K
110
6
Updated 1 month ago