Overview
IHP OpenAI provides a type-safe Haskell interface to OpenAI’s API (and compatible APIs like Google’s Gemini). It supports both streaming and non-streaming completions with automatic retries.
Key Features:
- Type-safe API client for OpenAI and compatible services
- Streaming completion support with Server-Sent Events
- Automatic retry logic with exponential backoff
- Tool/function calling support
- JSON response format mode
Installation
Add ihp-openai to your project’s default.nix:
haskellDeps = p: with p; [
ihp-openai
];
Configuration
Config
API configuration with base URL and API key:
data Config = Config
{ baseUrl :: !Text
, secretKey :: !Text
}
defaultConfig
Create a configuration for OpenAI:
defaultConfig :: Text -> Config
Returns a Config preconfigured with the OpenAI base URL (https://api.openai.com/v1); the Text argument is your secret API key.
Example:
let config = defaultConfig "sk-..."
-- Uses api.openai.com
Custom Providers
For Google Gemini or other providers:
let config = Config
{ baseUrl = googleBaseUrl -- "https://generativelanguage.googleapis.com/v1beta/openai"
, secretKey = "your-google-api-key"
}
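Because Config is a plain record, switching providers is just a matter of constructing it with a different base URL. A standalone sketch of a provider-selection helper — the Config record is re-declared from the docs above, and the Provider type and configFor function are hypothetical helpers, not part of ihp-openai:

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Text (Text)

-- Re-declared from the docs above so this sketch stands alone
data Config = Config { baseUrl :: Text, secretKey :: Text }

-- Hypothetical helper: pick a Config by provider
data Provider = OpenAI | Gemini

configFor :: Provider -> Text -> Config
configFor OpenAI key = Config { baseUrl = "https://api.openai.com/v1", secretKey = key }
configFor Gemini key = Config { baseUrl = "https://generativelanguage.googleapis.com/v1beta/openai", secretKey = key }
```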
Messages
Message
A message in the conversation:
data Message = Message
{ role :: !Role
, content :: !Text
, name :: !(Maybe Text)
, toolCallId :: !(Maybe Text)
, toolCalls :: ![ToolCall]
, cacheControl :: !(Maybe CacheControl)
}
data Role
= UserRole
| SystemRole
| AssistantRole
| ToolRole
Message Helpers
Convenience functions for creating messages:
userMessage :: Text -> Message
systemMessage :: Text -> Message
assistantMessage :: Text -> Message
toolMessage :: Text -> Message
Example:
let messages =
[ systemMessage "You are a helpful assistant."
, userMessage "What is the capital of France?"
]
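The helpers above are thin wrappers around the Message constructor. A rough standalone sketch of how they plausibly work — the types are trimmed re-declarations of the ones documented above, and the primed names avoid clashing with the real library functions:

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Text (Text)

-- Trimmed re-declarations of the types documented above
data Role = UserRole | SystemRole | AssistantRole | ToolRole deriving (Eq, Show)
data Message = Message { role :: Role, content :: Text, name :: Maybe Text }

-- Sketch: each helper fixes the role and leaves optional fields empty
mkMessage :: Role -> Text -> Message
mkMessage r t = Message { role = r, content = t, name = Nothing }

userMessage', systemMessage', assistantMessage' :: Text -> Message
userMessage'      = mkMessage UserRole
systemMessage'    = mkMessage SystemRole
assistantMessage' = mkMessage AssistantRole
```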
Completion Requests
CompletionRequest
Configure a completion request:
data CompletionRequest = CompletionRequest
{ messages :: ![Message]
, model :: !Text
, maxTokens :: !(Maybe Int)
, temperature :: !(Maybe Double)
, presencePenalty :: !(Maybe Double)
, frequencePenalty :: !(Maybe Double)
, stream :: !Bool
, responseFormat :: !(Maybe ResponseFormat)
, tools :: ![Tool]
, reasoningEffort :: !(Maybe Text)
, parallelToolCalls :: !(Maybe Bool)
, extraHeaders :: [(Text, Text)]
}
newCompletionRequest
Default completion request:
newCompletionRequest :: CompletionRequest
Defaults:
- model: "gpt-3.5-turbo"
- stream: False
- messages: []
- All optional fields: Nothing
Example:
let request = newCompletionRequest
|> set #model "gpt-4"
|> set #messages [userMessage "Hello!"]
|> set #maxTokens (Just 100)
|> set #temperature (Just 0.7)
ResponseFormat
Control the response format:
data ResponseFormat
= Text
| JsonObject
Example:
let request = newCompletionRequest
|> set #responseFormat (Just JsonObject)
|> set #messages [userMessage "Return a JSON object with fields: name, age"]
Simple Completions
fetchCompletion
Fetch a non-streaming completion:
fetchCompletion :: Config -> CompletionRequest -> IO Text
- request: Completion request parameters
- Returns: The generated completion text
Example:
import IHP.OpenAI
action GenerateAction = do
let config = defaultConfig "sk-..."
let request = newCompletionRequest
|> set #model "gpt-4"
|> set #messages
[ systemMessage "You are a helpful assistant."
, userMessage "Explain Haskell in one sentence."
]
response <- fetchCompletion config request
putStrLn response
-- "Haskell is a statically typed, purely functional programming language..."
Streaming Completions
streamCompletion
Stream a completion with real-time chunks:
streamCompletion ::
Config
-> CompletionRequest
-> IO () -- onStart callback
-> (CompletionChunk -> IO ()) -- chunk callback
-> IO [CompletionChunk]
- request: Completion request (stream is automatically enabled)
- onStart: Callback executed when streaming begins
- onChunk: Callback invoked for each chunk received
- Returns: All chunks received during streaming
Example:
action StreamAction = do
let config = defaultConfig "sk-..."
let request = newCompletionRequest
|> set #model "gpt-4"
|> set #messages [userMessage "Write a haiku about Haskell."]
chunks <- streamCompletion config request
(putStrLn "Starting stream...")
(\chunk -> do
let text = chunk.choices
|> mapMaybe (\choice -> choice.delta.content)
|> mconcat
putStr text
hFlush stdout
)
putStrLn "\nStream complete!"
CompletionChunk
A single chunk from a streaming response:
data CompletionChunk = CompletionChunk
{ id :: !Text
, choices :: [CompletionChunkChoice]
, created :: Int
, model :: !Text
, systemFingerprint :: !(Maybe Text)
, usage :: (Maybe Usage)
}
data CompletionChunkChoice = CompletionChunkChoice
{ delta :: !Delta
, finishReason :: !(Maybe FinishReason)
}
data Delta = Delta
{ content :: !(Maybe Text)
, toolCalls :: !(Maybe [ToolCall])
, role :: !(Maybe Role)
}
data FinishReason
= FinishReasonStop
| FinishReasonLength
| FinishReasonContentFilter
| FinishReasonToolCalls
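Once streaming finishes, the collected chunks can be folded back into the full completion text. A standalone sketch of that fold — the types are trimmed re-declarations of the streaming types documented above, keeping only the fields the sketch needs:

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Maybe (mapMaybe)
import Data.Text (Text)
import qualified Data.Text as Text

-- Trimmed re-declarations of the streaming types documented above
data FinishReason
    = FinishReasonStop
    | FinishReasonLength
    | FinishReasonContentFilter
    | FinishReasonToolCalls
    deriving (Eq, Show)
data Delta = Delta { content :: Maybe Text }
data CompletionChunkChoice = CompletionChunkChoice { delta :: Delta, finishReason :: Maybe FinishReason }
data CompletionChunk = CompletionChunk { choices :: [CompletionChunkChoice] }

-- Concatenate the text deltas of all chunks into the full completion
completionText :: [CompletionChunk] -> Text
completionText chunks =
    Text.concat (mapMaybe (content . delta) (concatMap choices chunks))

-- Finish reason reported by the last choice received, if any
lastFinishReason :: [CompletionChunk] -> Maybe FinishReason
lastFinishReason chunks =
    case concatMap choices chunks of
        [] -> Nothing
        cs -> finishReason (last cs)
```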
Tools
Tool
Define a function the LLM can call:
data Tool = Function
{ description :: !(Maybe Text)
, name :: !Text
, parameters :: !(Maybe JsonSchema)
}
JsonSchema
Define parameter schemas:
data JsonSchema
= JsonSchemaObject ![Property]
| JsonSchemaString
| JsonSchemaInteger
| JsonSchemaNumber
| JsonSchemaArray !JsonSchema
| JsonSchemaEnum ![Text]
data Property = Property
{ propertyName :: !Text
, type_ :: !JsonSchema
, required :: !Bool
, description :: !(Maybe Text)
}
Example:
let getCurrentWeather = Function
{ name = "get_current_weather"
, description = Just "Get the current weather in a location"
, parameters = Just $ JsonSchemaObject
[ Property
{ propertyName = "location"
, type_ = JsonSchemaString
, required = True
, description = Just "The city and state, e.g. San Francisco, CA"
}
, Property
{ propertyName = "unit"
, type_ = JsonSchemaEnum ["celsius", "fahrenheit"]
, required = False
, description = Nothing
}
]
}
let request = newCompletionRequest
|> set #model "gpt-4"
|> set #tools [getCurrentWeather]
|> set #messages [userMessage "What's the weather in Boston?"]
response <- fetchCompletion config request
-- LLM will call the function with appropriate arguments
ToolCall
A function call from the LLM:
data ToolCall = FunctionCall
{ index :: !Int
, id :: !(Maybe Text)
, name :: !(Maybe Text)
, arguments :: !Text -- JSON string
}
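When the assistant responds with tool calls, your code has to route each call to a handler by name. A standalone dispatch sketch — ToolCall is re-declared from the docs above, and dispatchToolCall is a hypothetical helper, not part of ihp-openai; a real app would decode the `arguments` JSON with aeson into a proper type before acting on it:

```haskell
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE NamedFieldPuns #-}
import Data.Text (Text)

-- Re-declared from the docs above for a standalone sketch
data ToolCall = FunctionCall
    { index :: Int
    , id :: Maybe Text
    , name :: Maybe Text
    , arguments :: Text -- JSON string
    }

-- Hypothetical dispatcher: route a tool call to a handler by name
dispatchToolCall :: ToolCall -> Text
dispatchToolCall FunctionCall { name = Just "get_current_weather", arguments } =
    "weather lookup with args: " <> arguments
dispatchToolCall FunctionCall { name = Just other } =
    "unknown tool: " <> other
dispatchToolCall _ =
    "tool call without a name"
```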
Automatic Retries
Both fetchCompletion and streamCompletion include automatic retry logic:
- Retry policy: 10 retries with 50ms constant delay
- Retries on: Network errors, API errors, incomplete responses
- No retry on: Successful responses, invalid API keys
Example with manual retry control:
import qualified Control.Retry as Retry
import qualified Control.Exception.Safe as Exception
-- Custom retry policy: exponential backoff starting at 1s, at most 5 retries
let customRetry = Retry.exponentialBackoff 1000000 <> Retry.limitRetries 5
-- Retry whenever the previous attempt threw an exception
let shouldRetry _status result = pure (either (\(_ :: Exception.SomeException) -> True) (const False) result)
-- Use fetchCompletionWithoutRetry for manual control
result <- Retry.retrying customRetry shouldRetry (\_ ->
    Exception.try (fetchCompletionWithoutRetry config request)
    )
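For intuition: Control.Retry's exponentialBackoff doubles the delay on each attempt, starting from the base delay in microseconds. The resulting schedule can be sketched as a pure function:

```haskell
-- Delay schedule for exponential backoff: each retry doubles the
-- previous delay, starting from the base (in microseconds)
backoffDelays :: Int -> Int -> [Int]
backoffDelays baseMicros retries = take retries (iterate (* 2) baseMicros)
```

With a 1-second base and 5 retries this yields delays of 1s, 2s, 4s, 8s, and 16s.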
Usage Tracking
Usage
Token usage information (reported on the final chunk of a streaming response):
data Usage = Usage
{ promptTokens :: !Int
, completionTokens :: !Int
, totalTokens :: !Int
}
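A common use of Usage is estimating spend per request. A standalone sketch — Usage is re-declared from the docs above, and estimateCostUsd is a hypothetical helper that takes the per-token rates as arguments, since real pricing varies by model and changes over time:

```haskell
{-# LANGUAGE NamedFieldPuns #-}

-- Re-declared from the docs above
data Usage = Usage { promptTokens :: Int, completionTokens :: Int, totalTokens :: Int }

-- Hypothetical cost estimate: prompt and completion tokens are usually
-- billed at different per-token rates, supplied here by the caller
estimateCostUsd :: Double -> Double -> Usage -> Double
estimateCostUsd promptRate completionRate Usage { promptTokens, completionTokens } =
    fromIntegral promptTokens * promptRate + fromIntegral completionTokens * completionRate
```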
Example:
chunks <- streamCompletion config request onStart callback
let maybeUsage = chunks
|> lastMay
>>= (.usage)
case maybeUsage of
Just Usage { totalTokens } ->
putStrLn $ "Total tokens used: " <> show totalTokens
Nothing ->
putStrLn "Usage information not available"
Advanced Options
Temperature
Control randomness (0.0 to 2.0):
let creative = newCompletionRequest
|> set #temperature (Just 0.9) -- More random
let deterministic = newCompletionRequest
|> set #temperature (Just 0.1) -- More focused
Penalties
Control repetition:
let request = newCompletionRequest
|> set #presencePenalty (Just 0.6) -- Penalize topics already mentioned
|> set #frequencePenalty (Just 0.5) -- Penalize frequent tokens
Extra Headers
Add custom HTTP headers:
let request = newCompletionRequest
|> set #extraHeaders
[ ("X-Custom-Header", "value")
, ("Organization", "org-123")
]
Error Handling
CompletionResult
Result type for non-streaming completions:
data CompletionResult
= CompletionResult { choices :: [Choice] }
| CompletionError { message :: !Text }
Errors are automatically thrown as exceptions. Catch them with:
import qualified Control.Exception.Safe as Exception
result <- Exception.try (fetchCompletion config request)
case result of
Right text ->
putStrLn text
Left (e :: Exception.SomeException) ->
putStrLn $ "Error: " <> displayException e
Best Practices
- Use system messages for instructions:
  [ systemMessage "You are a JSON API. Always respond with valid JSON."
  , userMessage "Get user with id 123"
  ]
- Stream for better UX: Use streamCompletion for interactive applications
- Set max tokens: Prevent runaway costs with set #maxTokens (Just 500)
- Use tools for structured outputs: Better than asking for JSON in text
- Handle rate limits: The built-in retry logic handles transient errors
See Also