
Overview

Mistral AI provides open-weight and commercial models with strong performance, efficient inference, and competitive pricing. Access Mistral through Portkey for European-hosted AI with excellent multilingual capabilities.

Base URL: https://api.mistral.ai/v1

Supported Features

  • ✅ Chat Completions
  • ✅ Streaming
  • ✅ Embeddings
  • ✅ Function Calling
  • ✅ JSON Mode
  • ✅ Fill-in-the-middle (FIM)
  • ❌ Vision (not yet available)
  • ❌ Image Generation
  • ❌ Fine-tuning

Quick Start

Chat Completions

from portkey_ai import Portkey

client = Portkey(
    provider="mistral-ai",
    Authorization="***"  # Your Mistral API key
)

response = client.chat.completions.create(
    model="mistral-large-latest",
    messages=[
        {"role": "user", "content": "Explain the Mistral models"}
    ]
)

print(response.choices[0].message.content)

Streaming

stream = client.chat.completions.create(
    model="mistral-large-latest",
    messages=[{"role": "user", "content": "Write a poem about Paris"}],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")

Available Models

Commercial Models

| Model | Context | Description | Best For |
| --- | --- | --- | --- |
| mistral-large-latest | 128K | Most capable Mistral model | Complex reasoning, multilingual |
| mistral-large-2411 | 128K | Latest Mistral Large | Production applications |
| mistral-large-2407 | 128K | July 2024 version | Stable release |
| mistral-medium-latest | 32K | Balanced performance | General purpose |
| mistral-small-latest | 32K | Fast and efficient | Simple tasks, high volume |

Open-Weight Models

| Model | Context | Description |
| --- | --- | --- |
| open-mistral-nemo | 128K | Latest open model |
| open-mixtral-8x22b | 64K | Mixture of Experts (MoE) |
| open-mixtral-8x7b | 32K | Efficient MoE |
| open-mistral-7b | 32K | Compact model |

Specialized Models

| Model | Type | Description |
| --- | --- | --- |
| codestral-latest | Code | Code generation |
| codestral-mamba-latest | Code | Efficient code model |
| mistral-embed | Embeddings | Text embeddings (1024 dims) |

Mistral excels at:
  • Multilingual tasks (French, English, Spanish, German, Italian)
  • Code generation with Codestral
  • Efficient inference with MoE architecture
  • European data residency (GDPR compliant)
  • Instruction following and function calling

Configuration Options

Headers

client = Portkey(
    provider="mistral-ai",
    Authorization="***"  # Bearer token format
)

| Header | Description | Required |
| --- | --- | --- |
| Authorization | Mistral API key (Bearer token) | Yes |

Advanced Features

Function Calling

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "City name"
                    },
                    "unit": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"]
                    }
                },
                "required": ["location"]
            }
        }
    }
]

response = client.chat.completions.create(
    model="mistral-large-latest",
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=tools,
    tool_choice="auto"
)

if response.choices[0].message.tool_calls:
    tool_call = response.choices[0].message.tool_calls[0]
    print(f"Function: {tool_call.function.name}")
    print(f"Arguments: {tool_call.function.arguments}")
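Once the model returns a tool call, your application is responsible for actually running the function and (optionally) sending the result back in a follow-up message. A minimal dispatch sketch, using a hypothetical local `get_weather` implementation in place of a real weather lookup:

```python
import json

def get_weather(location, unit="celsius"):
    # Hypothetical stand-in; replace with a real weather API call.
    return {"location": location, "temperature": 18, "unit": unit}

# Map tool names to local functions so the model's choice can be dispatched.
available_tools = {"get_weather": get_weather}

def dispatch_tool_call(name, arguments_json):
    """Parse the model's JSON-encoded arguments and invoke the matching function."""
    args = json.loads(arguments_json)
    return available_tools[name](**args)

# tool_call.function.name / tool_call.function.arguments would come from the response above.
result = dispatch_tool_call("get_weather", '{"location": "Paris", "unit": "celsius"}')
print(result)
```

The result can then be appended as a `{"role": "tool", ...}` message so the model can compose a final answer.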

JSON Mode

Force JSON output:
response = client.chat.completions.create(
    model="mistral-large-latest",
    messages=[{
        "role": "user",
        "content": "List 3 French cities with their populations"
    }],
    response_format={"type": "json_object"}
)

import json
result = json.loads(response.choices[0].message.content)
print(result)

System Prompts

response = client.chat.completions.create(
    model="mistral-large-latest",
    messages=[
        {
            "role": "system",
            "content": "You are a helpful Python programming expert. Always provide working code with explanations."
        },
        {
            "role": "user",
            "content": "How do I read a CSV file?"
        }
    ]
)

Fill-in-the-Middle (FIM)

Special mode for code completion:
client = Portkey(
    provider="mistral-ai",
    Authorization="***",
    mistral_fim_completion="true"  # Enable FIM mode
)

response = client.completions.create(
    model="codestral-latest",
    prompt="def fibonacci(n):\n    # Complete this function\n    "
)

print(response.choices[0].text)

Embeddings

response = client.embeddings.create(
    model="mistral-embed",
    input="Mistral AI provides European-hosted AI models"
)

embedding = response.data[0].embedding
print(f"Dimensions: {len(embedding)}")
Batch embeddings:
response = client.embeddings.create(
    model="mistral-embed",
    input=[
        "First document",
        "Second document",
        "Third document"
    ]
)

for i, item in enumerate(response.data):
    print(f"Document {i}: {len(item.embedding)} dimensions")
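For semantic search and RAG, documents are typically ranked by cosine similarity between the query embedding and each document embedding. A minimal, self-contained sketch (the 3-dim vectors below are placeholders; real `mistral-embed` vectors have 1024 dimensions):

```python
import math

def cosine_similarity(a, b):
    # Standard cosine similarity: dot product over the product of norms.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# In practice these vectors come from client.embeddings.create(model="mistral-embed", ...).
query_vec = [0.1, 0.3, 0.5]
doc_vecs = {"doc_a": [0.1, 0.3, 0.5], "doc_b": [0.9, -0.2, 0.0]}

ranked = sorted(doc_vecs, key=lambda d: cosine_similarity(query_vec, doc_vecs[d]), reverse=True)
print(ranked)  # doc_a ranks first: it points in the same direction as the query
```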

Fallback Configuration

Fallback to OpenAI:
config = {
    "strategy": {"mode": "fallback"},
    "targets": [
        {
            "provider": "mistral-ai",
            "api_key": "***",
            "override_params": {"model": "mistral-large-latest"}
        },
        {
            "provider": "openai",
            "api_key": "sk-***",
            "override_params": {"model": "gpt-4o"}
        }
    ]
}

client = Portkey().with_options(config=config)

Load Balancing

Balance between Mistral models:
config = {
    "strategy": {"mode": "loadbalance"},
    "targets": [
        {
            "provider": "mistral-ai",
            "api_key": "***",
            "override_params": {"model": "mistral-large-latest"},
            "weight": 0.3
        },
        {
            "provider": "mistral-ai",
            "api_key": "***",
            "override_params": {"model": "mistral-small-latest"},
            "weight": 0.7
        }
    ]
}

client = Portkey().with_options(config=config)
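With the weights above, roughly 30% of requests go to mistral-large-latest and 70% to mistral-small-latest. A minimal sketch of weight-proportional selection (an illustration of the routing behavior, not Portkey's internal implementation):

```python
import random

targets = [
    {"model": "mistral-large-latest", "weight": 0.3},
    {"model": "mistral-small-latest", "weight": 0.7},
]

def pick_target(targets, rng=random):
    # Weighted random choice: each target is selected in proportion to its weight.
    total = sum(t["weight"] for t in targets)
    r = rng.random() * total
    for t in targets:
        r -= t["weight"]
        if r <= 0:
            return t
    return targets[-1]

random.seed(0)
counts = {"mistral-large-latest": 0, "mistral-small-latest": 0}
for _ in range(10_000):
    counts[pick_target(targets)["model"]] += 1
print(counts)  # roughly a 30/70 split
```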

Error Handling

from portkey_ai.exceptions import (
    RateLimitError,
    APIError,
    AuthenticationError
)

try:
    response = client.chat.completions.create(
        model="mistral-large-latest",
        messages=[{"role": "user", "content": "Hello"}]
    )
except RateLimitError as e:
    print(f"Rate limit exceeded: {e}")
except AuthenticationError as e:
    print(f"Invalid API key: {e}")
except APIError as e:
    print(f"API error: {e}")

Best Practices

  1. Use latest versions - Model IDs with “latest” get automatic updates
  2. Leverage function calling - Mistral has strong tool use capabilities
  3. Try JSON mode - For structured outputs
  4. Use Codestral - For code-specific tasks
  5. Consider Small for volume - Cost-effective for simple tasks
  6. Enable streaming - Better user experience
  7. Use embeddings - For semantic search and RAG
  8. System prompts - Guide behavior consistently
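For transient failures such as rate limits, a simple retry loop with exponential backoff pairs well with the error handling shown above. A generic sketch (the `flaky` function below simulates failures; in production you would pass portkey_ai's `RateLimitError` as the retryable exception type):

```python
import random
import time

def with_retries(fn, max_attempts=4, base_delay=1.0, retryable=(Exception,)):
    """Call fn, retrying on retryable errors with exponential backoff plus jitter."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except retryable:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.1)
            time.sleep(delay)

# Demo with a simulated flaky call that fails twice before succeeding.
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("simulated rate limit")
    return "ok"

print(with_retries(flaky, base_delay=0.01))  # prints "ok" after two retries
```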

Context Windows

| Model | Context Window | Notes |
| --- | --- | --- |
| mistral-large-latest | 128K tokens | Full documents |
| mistral-medium-latest | 32K tokens | Standard documents |
| mistral-small-latest | 32K tokens | Standard documents |
| open-mistral-nemo | 128K tokens | Long context open model |
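To sanity-check whether input fits a model's context window before sending it, a very rough rule of thumb for English text is about 4 characters per token. This heuristic is an approximation only; exact counts depend on Mistral's tokenizer, so use the real tokenizer when enforcing hard limits:

```python
def rough_token_estimate(text):
    # Rough heuristic: ~4 characters per token for English text.
    # Exact counts depend on the model's tokenizer.
    return max(1, len(text) // 4)

doc = "word " * 20000  # ~100k characters of sample text
est = rough_token_estimate(doc)
print(est)  # ~25,000 tokens: comfortably within a 128K window
assert est < 128_000
```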

European Data Residency

Mistral AI is headquartered in France and offers EU data residency:
  • GDPR compliant by default
  • European infrastructure (Paris, Frankfurt)
  • Data sovereignty for EU customers
  • No data training on customer inputs

Pricing

Mistral offers competitive pricing with open models:

See the Mistral Pricing page for detailed pricing across all Mistral models.

Related Guides

  • Function Calling: advanced function calling
  • Fallbacks: fallback configurations
  • Code Generation: using Codestral
  • JSON Mode: structured outputs
