This integration connects Fireworks AI’s fast inference platform to LangChain.

Installation

pip install -U langchain-fireworks

Setup

Set your Fireworks API key as an environment variable:
export FIREWORKS_API_KEY="your-api-key"
Get your API key from fireworks.ai.
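If exporting a shell variable is inconvenient (for example, in a notebook), the key can instead be set for the current Python process. This is a minimal sketch; the placeholder value is illustrative and should be replaced with your real key:

```python
import os

# Set the key for the current Python process only, without
# overwriting a value that is already exported in the shell.
# Replace the placeholder with your key from fireworks.ai.
os.environ.setdefault("FIREWORKS_API_KEY", "your-api-key")
```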

Usage

from langchain_fireworks import ChatFireworks

model = ChatFireworks(
    model="accounts/fireworks/models/llama-v3p3-70b-instruct",
    temperature=0,
    max_tokens=None,
)

messages = [
    ("system", "You are a helpful assistant."),
    ("human", "What is the capital of France?"),
]

response = model.invoke(messages)
print(response.content)

Streaming

for chunk in model.stream(messages):
    print(chunk.content, end="")

API Reference

ChatFireworks

model (str, required)
    Model name to use (e.g., accounts/fireworks/models/llama-v3p3-70b-instruct).
temperature (float | None, default: None)
    Sampling temperature. Controls randomness in generation.
max_tokens (int | None, default: None)
    Maximum number of tokens to generate.
timeout (float | tuple[float, float] | None, default: None)
    Timeout for requests to the Fireworks completion API.
api_key (str, required)
    Fireworks API key. Read automatically from the FIREWORKS_API_KEY environment variable if not provided.
base_url (str | None, default: None)
    Base URL for API requests. Leave unset unless using a proxy or service emulator. Read from FIREWORKS_API_BASE if not provided.
streaming (bool, default: False)
    Whether to stream the results.
n (int, default: 1)
    Number of chat completions to generate for each prompt.
model_kwargs (dict, default: {})
    Additional model parameters, passed through to the create call, that are not explicitly listed above.
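As an illustrative sketch, several of these parameters can be combined at construction time. The parameter names come from the reference above; the specific values and the passed-through `top_p` key are assumptions, not recommendations:

```python
from langchain_fireworks import ChatFireworks

# Illustrative configuration only; values are examples.
model = ChatFireworks(
    model="accounts/fireworks/models/llama-v3p3-70b-instruct",
    temperature=0.2,
    max_tokens=512,
    timeout=(5.0, 60.0),          # (connect, read) timeouts in seconds
    model_kwargs={"top_p": 0.9},  # passed through to the create call
)
```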

Supported Models

Fireworks hosts a wide variety of open-source models:
  • Llama 3.3 70B: Meta’s latest high-performance model
  • Mixtral MoE 8x22B: Large mixture-of-experts model
  • Qwen 2.5: Alibaba’s multilingual models
  • DeepSeek: Reasoning and chat models
  • Gemma 2: Google’s efficient models
See fireworks.ai/models for the full catalog.

Features

  • Fast inference with optimized serving
  • Function/tool calling
  • JSON mode
  • Streaming
  • Async support
  • Fine-tuning support
  • Custom model deployment
Fireworks AI specializes in fast, cost-effective inference for open-source models. They offer competitive pricing and support for custom fine-tuned models.
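Since the features list includes async support, here is a minimal sketch using the standard `ainvoke` and `astream` methods from LangChain's chat-model interface (the model id is the one from the usage example above; a valid FIREWORKS_API_KEY must be set):

```python
import asyncio

from langchain_fireworks import ChatFireworks

model = ChatFireworks(
    model="accounts/fireworks/models/llama-v3p3-70b-instruct",
)

async def main():
    # Non-streaming async call.
    response = await model.ainvoke("What is the capital of France?")
    print(response.content)

    # Streaming async call: chunks arrive as they are generated.
    async for chunk in model.astream("What is the capital of France?"):
        print(chunk.content, end="")

asyncio.run(main())
```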
