Mistral Small 4, one model for reasoning, visioning, and coding

Maxime Hiez
Mistral AI
30 Apr, 2026

Introduction

Managing multiple specialized models within a single AI pipeline adds deployment complexity and multiplies infrastructure costs. Mistral AI announced on March 16, 2026 the launch of Mistral Small 4, a single model that absorbs the roles previously held by Magistral (reasoning), Pixtral (vision), and Devstral (code). For an enterprise AI admin, the message is straightforward : one API integration, three specialties covered.

Architecture and capabilities

Mistral Small 4 is built on a Mixture of Experts (MoE) architecture with 119 billion total parameters, of which approximately 6 billion are activated per token. The model uses 128 experts with 4 active simultaneously, keeping the computational footprint at inference low despite the model’s raw size.

Key characteristics :

Context window : 256,000 tokens, sufficient to process long documents or extended conversation histories.
Native multimodality : Text and image as input, text as output. Small 3 was strictly text-only; that limitation is gone.
Configurable reasoning : The new reasoning_effort parameter accepts the values none or high, making it possible to switch between a fast response and deep reasoning based on the use case, without changing models.
Language support : 24 languages including French, English, Spanish, German, Chinese, Japanese, and Arabic.

What changes compared to Small 3

The most significant difference from Mistral Small 3 is not purely about performance, it is the consolidation of several models into a single deployment point. Previously, a complete AI stack required routing requests to distinct models depending on the task, a vision model like the one introduced in Mistral OCR, a code model like Mistral Code. Small 4 eliminates that routing.

On raw performance, Mistral claims a 40% reduction in end-to-end latency and three times higher throughput in requests per second compared to Small 3. These figures are published by Mistral without a detailed methodology; they should be treated as indicative until an official technical report is available. Independent measurements by Artificial Analysis place the model at 171.8 tokens per second with a time-to-first-token (TTFT) of 0.76 seconds via the Mistral API.

Benchmarks

In the absence of an official technical report at the time of the announcement, the results below come from secondary sources and Mistral publications. They provide a positioning indication but do not constitute a complete, independent evaluation.

	Mistral Small 4	GPT-4o-mini
GPQA Diamond	71.2%	40.2%
MMLU-Pro	78.0%	64.8%
LiveCodeBench	Outperforms GPT-OSS 120B (−20% tokens)	-

On the AA LCR (weighted response length) benchmark, Small 4 scores 0.72 for 1,600 characters, against ranges of 5800 to 6100 characters for Qwen 3.5-122B, reflecting a notably concise output style.

Access and deployment

Mistral Small 4 is available under the Apache 2.0 license, authorizing commercial use without revenue restrictions, unlike Meta’s LLaMA conditions, which impose usage thresholds based on organization size.

Available access channels at launch:

Mistral API and AI Studio : Pricing sourced via Artificial Analysis and OpenRouter: $0.15 per million input tokens, $0.60 per million output tokens, not yet published on the official Mistral pricing page at time of writing.
Hugging Face : mistralai/Mistral-Small-4-119B-2603
NVIDIA NIM, vLLM, llama.cpp, SGLang, Transformers, Axolotl (fine-tuning)

For self-hosting, the minimum hardware requirement is significant:

4× NVIDIA HGX H100
2× HGX H200
1× DGX B200

This level of infrastructure rules out consumer GPU deployments or lightweight on-premises setups; the API remains the most accessible option for most organizations.

Conclusion

Mistral Small 4 addresses a concrete enterprise AI stack problem, instead of maintaining three separate models for reasoning, vision, and code, a single deployment covers all three. The Apache 2.0 license, the reasoning_effort parameter, and the 256k context make it a serious candidate for diverse automation workflows. The absence of an official technical report is a reason to validate benchmarks against internal use cases before any production commitment.

Sources

Mistral AI - Mistral Small 4

Test Le Chat by Mistral AI

Hugging Face - mistralai/Mistral-Small-4-119B-2603

Artificial Analysis - Mistral Small 4

OpenRouter - mistralai/mistral-small-2603

MindStudio - What is Mistral Small 4

Emelia - Mistral Small 4 Complete Guide

Did you enjoy this post ? If you have any questions, comments or suggestions, please feel free to send me a message from the contact form.

Don’t forget to follow us and share this post.

Tags :

Nearly 70% of Fortune 500 companies use Copilot

Maxime Hiez
Copilot
20 Nov, 2024

Introduction At Microsoft Ignite 2024, Microsoft highlighted why nearly 70% of Fortune 500 companies now use Microsoft 365 Copilot. This mass adoption reflects a growing trend in the indu

How to disable self-service on Copilot licenses

Introduction Microsoft has activated a setting in the tenants (by default) to allow any user to purchase a Microsoft Copilot license through the *Microsoft 365 Copilot self-service pursha

Mistral Large 24.11 transforms industries with cutting-edge AI

Maxime Hiez
Mistral AI
15 Dec, 2024

Introduction Microsoft recently announced the release of Mistral Large 24.11, an advanced language model (LLM) available in the Azure AI model catalog. This new version sets a new benchma

Improved Teams video quality with Super Resolution

Maxime Hiez
Teams
06 Feb, 2025

Introduction Microsoft continues to innovate to provide users with the best possible virtual communication experience. One of the latest advancements is the introduction of *Super Resolutio

Le Chat by Mistral AI, your personal AI assistant

Maxime Hiez
Mistral AI
10 Feb, 2025

Introduction I told you last December about the French AI, Mistral AI, the most popular model in Europe in which Microsoft invested 15 million euros in the startup. The mobile app has jus

New Yealink MeetingBoard 65 and 85 for Teams rooms

Maxime Hiez
MTR
13 Feb, 2025

Introduction The new Yealink MeetingBoard 65 and 85 are an innovative and comprehensive solution designed to transform meeting rooms into intelligent collaboration spaces. These all-in-on

Maximize the use of the Copilot prompt gallery

Maxime Hiez
Copilot
19 Feb, 2025

Introduction Microsoft 365 Copilot continues to revolutionize the way organizations work by integrating advanced artificial intelligence capabilities into everyday tools. One of the key f

How to get started with Copilot in Excel

Maxime Hiez
Copilot
20 Feb, 2025

Introduction Microsoft 365 Copilot is a major innovation that integrates artificial intelligence directly into the applications you use every day, like Excel. Copilot helps you automate t

Microsoft Purview for Azure Data Lake and Blob Storage

Maxime Hiez
Purview
21 Feb, 2025

Introduction Microsoft announced that Microsoft Purview protection policies for Azure Data Lake and Blob Storage are now available in all regions. This advancement allows organization

Facilitator, new AI agent for taking notes in meetings

Maxime Hiez
MTR
08 Mar, 2025

Introduction Microsoft recently announced a new feature for Teams Rooms: Facilitator ; an AI agent that takes notes during Teams meetings. This feature is currently in pre-public release

Enterprise Connect 2025 : Yealink SkySound CM50 Dante kit

Maxime Hiez
MTR
20 Mar, 2025

Introduction Enterprise Connect is an annual conference that brings together communications technology professionals, innovators, and others. This event showcases technological advances i

Mistral OCR, new benchmark in character recognition

Maxime Hiez
Mistral AI
18 Apr, 2025

Introduction In March 2025, Mistral AI announced the launch of Mistral OCR, an optical character recognition (OCR) API that sets a new standard in document understanding. This advance

Introducing the Logitech Rally Board 65

Maxime Hiez
MTR
28 Apr, 2025

Introduction The Logitech Rally Board 65 is an all-in-one video conferencing solution designed to simplify meetings and collaboration in business environments. With its 65-inch touchscree

Mistral Code, the European AI development assistant

Maxime Hiez
Mistral AI
09 Jun, 2025

Introduction French startup Mistral AI, already recognized for its open source language models, has just unveiled Mistral Code, an intelligent development assistant designed for businesse

New Yealink MeetingBar A50 for Teams Rooms

Maxime Hiez
MTR
16 Jul, 2025

Introduction In an increasingly hybrid work world, businesses are looking for video conferencing solutions that are powerful, easy to deploy, and seamlessly integrated into their *Microsoft

Mercedes-Benz, your car becomes a rolling office

Maxime Hiez
Teams
21 Jul, 2025

Introduction In an automotive market increasingly focused on smart and connected mobility, Mercedes-Benz is taking a giant leap forward. With the new generation of the CLA model, the Ge

Anthropic unveils Claude Opus 4.1, faster and more reliable

Maxime Hiez
Anthropic
08 Aug, 2025

Introduction Anthropic, a leading player in artificial intelligence, has announced the release of Claude Opus 4.1, a significant update to its flagship model (Claude Opus 4). Designed

OpenAI unveils GPT-5, its latest smarter model

Maxime Hiez
OpenAI
11 Aug, 2025

Introduction OpenAI has taken another step forward in the evolution of artificial intelligence with the launch of GPT-5, its most powerful language model to date. Designed to be smarter

What's new for Copilot in August 2025

Maxime Hiez
Copilot
03 Sep, 2025

Introduction Microsoft releases a monthly update to Microsoft 365 Copilot to keep admins and users up-to-date on productivity-enhancing features in Microsoft 365. The August 2025 release

Anthropic unveils Claude Sonnet 4.5, more advanced

Maxime Hiez
Anthropic
30 Oct, 2025

Introduction Anthropic, a leading player in artificial intelligence, has announced the release of Claude Sonnet 4.5, touted as the world's best coding model and a significant leap for

How to enable DSPM for AI with Purview

Introduction With the rise of generative AI models, the phenomenon of Shadow AI (the use of artificial intelligence tools and services not approved or controlled by organizations) is incr

Mistral OCR 3, a precise, structured and affordable OCR

Maxime Hiez
Mistral AI
15 Jan, 2026

Introduction In December 2025, Mistral AI announced the launch of Mistral OCR version 3, an Optical Character Recognition (OCR) API that sets a new standard for document understanding

How to add a disclaimer in Copilot

Introduction Microsoft has enabled a setting in tenants that allows administrators to display the Microsoft 365 Copilot disclaimer in bold, and to attach a shortcut pointing to a usage po

Extend Zero Trust to AI agent identities in Entra ID

Maxime Hiez
Entra ID
30 Jan, 2026

Introduction AI agents are becoming increasingly widespread in businesses (incident summaries, log analysis, flow execution, etc.), and it is crucial that their access is continuously evalu

Mistral Voxtral Transcribe2, real-time transcription

Maxime Hiez
Mistral AI
05 Feb, 2026

Introduction Mistral AI has just unveiled Voxtral Transcribe 2, its second generation of speech transcription models with cutting-edge transcription quality, ultra-low latency and advan

How to enable DLP for AI websites with Purview

Introduction Last week, I showed you how to enable DLP to prevent printing of financial data using Microsoft Purview, in order to prevent accidental or malicious data leaks (*Data Loss

Anthropic unveils Claude Opus 4.6, a benchmark for finance

Maxime Hiez
Anthropic
13 Feb, 2026

Introduction Artificial intelligence is rapidly growing in the finance industry, but one reality remains : real-world financial analyses are rarely clean, linear, or perfectly defined. They

How to enable Claude AI as a model in Copilot

Introduction Since its launch, Microsoft 365 Copilot has established itself as a cornerstone of enhanced enterprise productivity, leveraging advanced AI models to reason, analyze, and aut

OpenAI unveils GPT-5.4, the new generation of models

Maxime Hiez
OpenAI
09 Mar, 2026

Introduction OpenAI has just announced GPT-5.4, a new evolution of its GPT model family. Designed for professional uses and complex tasks, this model introduces several major improvemen

Introducing Microsoft 365 E7, the Frontier Suite

Maxime Hiez
Microsoft 365
10 Mar, 2026

Introduction Microsoft has announced the availability of the Microsoft 365 E7 license, a new offer called Frontier Suite, designed for the era of AI-driven work and agents. This announc

How to download Cisco Webex recorded calls via API

Introduction Cisco Webex Contact Center offers advanced call recording capabilities, essential for quality, compliance, and continuous service improvement. Supervisors can easily listen t

OpenAI unveils GPT-5.5, designed for agentic work

Maxime Hiez
OpenAI
28 Apr, 2026

Introduction OpenAI has just announced GPT-5.5, barely seven weeks after GPT-5.4. The message is clear, GPT-5.5 is not an incremental update but *"a new class of intelligence for real

Voice-native agents in Foundry in Public Preview

Maxime Hiez
Foundry
05 May, 2026

Introduction Microsoft announced on March 16, 2026 the Public Preview of Voice Native Agents in Microsoft Azure AI Foundry, a native combination of the Voice Live API and the *Found

Anthropic unveils Claude Opus 4.7, with a new tokenizer

Maxime Hiez
Anthropic
12 May, 2026

Introduction Anthropic announced on April 16, 2026 the general availability of Claude Opus 4.7, the direct successor to [Claude Opus 4.6](https://maxime.hiez.ca/en/blog/2026-02-13-ai-an

GPT-5.5 Instant now available in Copilot

Maxime Hiez
Copilot
14 May, 2026

Introduction Microsoft 365 Copilot is accelerating the renewal cadence of its underlying models ; this is the fifth iteration of the GPT-5.x series deployed in less than a year. On 7 May 20