The OpenAI Ruby SDK provides flexible options for uploading files to the API. You can use file paths, raw contents, or specialized objects to control upload behavior.

Using Pathname

To upload a file from disk, pass a Pathname:
require "openai"
require "pathname"

client = OpenAI::Client.new

# Use Pathname to send the filename and avoid paging a large file into memory
file_object = client.files.create(
  file: Pathname("input.jsonl"),
  purpose: "fine-tune"
)

puts(file_object.id)
Pathname is the recommended approach as it:
  • Automatically includes the filename in the upload
  • Streams the file without loading it entirely into memory
  • Enables automatic retries on failure
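
To make the streaming point concrete, here is a minimal standard-library sketch of reading a file in fixed-size chunks. It is not the SDK's implementation; the helper name `each_chunk` and the 64 KB chunk size are assumptions chosen for illustration.

```ruby
require "pathname"
require "tempfile"

# Hypothetical helper (not part of the SDK): yield a file's contents in
# fixed-size chunks so at most chunk_size bytes are read at a time.
def each_chunk(path, chunk_size: 64 * 1024)
  Pathname(path).open("rb") do |io|
    # IO#read(n) returns nil once the end of the file is reached.
    while (chunk = io.read(chunk_size))
      yield chunk
    end
  end
end

# Demo: count the bytes of a temporary file without calling File.read.
Tempfile.create("demo") do |tmp|
  tmp.binmode
  tmp.write("x" * 200_000)
  tmp.flush

  total = 0
  each_chunk(tmp.path) { |chunk| total += chunk.bytesize }
  puts total
end
```

Peak memory use depends on the chunk size rather than the file size, which is why a path-based upload scales to large files.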

Using Raw File Contents

You can pass raw file contents directly as a string:
# Read file contents into memory
file_object = client.files.create(
  file: File.read("input.jsonl"),
  purpose: "fine-tune"
)

puts(file_object.id)
File.read loads the entire file into memory, so avoid this approach for large files.

Using StringIO

For in-memory file-like objects, use StringIO:
require "stringio"

# Create file content in memory
content = <<~JSONL
  {"prompt": "What is 2+2?", "completion": "4"}
  {"prompt": "What is the capital of France?", "completion": "Paris"}
JSONL

file_object = client.files.create(
  file: StringIO.new(content),
  purpose: "fine-tune"
)

puts(file_object.id)
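
If the training examples start out as Ruby hashes rather than a heredoc, the JSONL content can be generated instead of typed by hand. The sketch below uses only the standard library; `jsonl_io` is a hypothetical helper name, not an SDK method.

```ruby
require "json"
require "stringio"

# Hypothetical helper: serialize an array of hashes as JSON Lines
# (one JSON object per line) and wrap the result in a StringIO.
def jsonl_io(records)
  StringIO.new(records.map { |record| JSON.generate(record) + "\n" }.join)
end

examples = [
  { prompt: "What is 2+2?", completion: "4" },
  { prompt: "What is the capital of France?", completion: "Paris" }
]

io = jsonl_io(examples)
puts io.string
```

The returned object can then be passed as the file: argument in the same way as the StringIO above.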

Using FilePart for Custom Control

The OpenAI::FilePart class gives you control over filename and content type:
require "pathname"

# Control the filename and content type
image = OpenAI::FilePart.new(
  Pathname("dog.jpg"),
  content_type: "image/jpeg"
)

edited = client.images.edit(
  prompt: "make this image look like a painting",
  model: "gpt-image-1",
  size: "1024x1024",
  image: image
)

puts(edited.data.first)

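FilePart with StringIO

A StringIO carries no filename, so set the filename and content type explicitly when wrapping in-memory data:
require "stringio"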

image_data = File.binread("photo.png")
image = OpenAI::FilePart.new(
  StringIO.new(image_data),
  filename: "photo.png",
  content_type: "image/png"
)

response = client.images.edit(
  image: image,
  prompt: "Add a sunset in the background",
  model: "gpt-image-1",
  size: "1024x1024"
)
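
When the content type has to be set by hand, deriving it from the file extension avoids hard-coding it at every call site. The following is a minimal sketch; `content_type_for` is a hypothetical helper, and its mapping covers only the image formats this page mentions.

```ruby
# Hypothetical helper (not part of the SDK): map an image filename to a
# MIME type based on its extension, ignoring case.
def content_type_for(filename)
  case File.extname(filename).downcase
  when ".png" then "image/png"
  when ".jpg", ".jpeg" then "image/jpeg"
  when ".webp" then "image/webp"
  else
    raise ArgumentError, "no content type known for #{filename.inspect}"
  end
end

puts content_type_for("photo.png")
puts content_type_for("dog.JPG")
```

The result can be supplied as the content_type: argument when constructing a FilePart.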

Using Raw IO Descriptors

You can pass raw IO descriptors, but this disables retries:
File.open("input.jsonl", "rb") do |file|
  file_object = client.files.create(
    file: file,
    purpose: "fine-tune"
  )

  puts(file_object.id)
end
Raw IO descriptors disable automatic retries because the SDK cannot determine if the descriptor is a file (which can be rewound) or a pipe (which cannot).
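
The difference between a file and a pipe can be observed with the standard library alone. The sketch below is not the SDK's logic; `rewindable?` is a hypothetical helper that simply attempts a rewind and reports whether the operating system allowed it.

```ruby
require "tempfile"

# Hypothetical check (not part of the SDK): try to rewind the stream.
# Note that a successful call moves the read position back to the start.
def rewindable?(io)
  io.rewind
  true
rescue SystemCallError, IOError
  # Seeking is rejected by the operating system for pipes and sockets.
  false
end

Tempfile.create("demo") do |file|
  file.write("data")
  puts rewindable?(file) # regular file: seeking is supported
end

reader, writer = IO.pipe
puts rewindable?(reader) # pipe: seeking is not supported
reader.close
writer.close
```

A retry has to resend the request body from the beginning, which is only possible when the stream can be rewound.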

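File Upload Examples

A typical workflow uploads the training and validation files first, then passes their IDs when creating a fine-tuning job:
require "pathname"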

# Upload training data for fine-tuning
training_file = client.files.create(
  file: Pathname("training_data.jsonl"),
  purpose: "fine-tune"
)

validation_file = client.files.create(
  file: Pathname("validation_data.jsonl"),
  purpose: "fine-tune"
)

# Use in fine-tuning job
job = client.fine_tuning.jobs.create(
  training_file: training_file.id,
  validation_file: validation_file.id,
  model: "gpt-4o-2024-08-06"
)

Supported File Types

Different endpoints support different file types.

Fine-tuning:
  • Format: JSONL (JSON Lines)
  • Purpose: "fine-tune"
  • Max size: Check API documentation for current limits

Images:
  • Formats: PNG, JPEG, WEBP
  • Models: Image generation and editing
  • Max size: 4 MB per image

Audio:
  • Formats: MP3, MP4, MPEG, MPGA, M4A, WAV, WEBM
  • Models: Whisper (transcription/translation)
  • Max size: 25 MB
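
The limits listed above can be checked locally before a request is sent. The sketch below is one possible pre-flight check, not an SDK feature: `upload_rules` and `upload_problems` are hypothetical names, the figures are copied from this page and may be out of date, and 1 MB is assumed to mean 1024 × 1024 bytes.

```ruby
require "pathname"
require "tempfile"

# Hypothetical rule table based on the formats and sizes listed above.
# Assumption: 1 MB = 1024 * 1024 bytes; verify against the API docs.
def upload_rules
  mb = 1024 * 1024
  {
    fine_tune: { extensions: %w[.jsonl], max_bytes: nil },
    image: { extensions: %w[.png .jpg .jpeg .webp], max_bytes: 4 * mb },
    audio: { extensions: %w[.mp3 .mp4 .mpeg .mpga .m4a .wav .webm], max_bytes: 25 * mb }
  }
end

# Returns a list of problems; an empty list means the file passed.
def upload_problems(path, kind)
  rules = upload_rules.fetch(kind)
  path = Pathname(path)
  return ["file not found: #{path}"] unless path.file?

  problems = []
  ext = path.extname.downcase
  problems << "unsupported extension #{ext.inspect}" unless rules[:extensions].include?(ext)
  max = rules[:max_bytes]
  problems << "file is #{path.size} bytes, limit is #{max}" if max && path.size > max
  problems
end

# Demo with a small temporary file that has a .png extension.
Tempfile.create(["photo", ".png"]) do |tmp|
  p upload_problems(tmp.path, :image)
  p upload_problems(tmp.path, :audio)
end
```

This only inspects the extension and size; it does not confirm that the bytes are a valid image or audio file.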

Best Practices

1. Use Pathname for files on disk
   This provides the best performance and reliability, especially for large files.

2. Use FilePart when you need custom metadata
   Explicitly set content type and filename when the defaults aren’t appropriate.

3. Avoid loading large files into memory
   Use streaming approaches (Pathname, IO) rather than File.read for large files.

4. Validate file formats before upload
   Ensure your files match the expected format for the endpoint to avoid API errors.
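
For fine-tuning data, format validation can be as simple as checking that every line parses as a JSON object. The following is a minimal sketch using the standard library; `jsonl_errors` is a hypothetical helper, and skipping blank lines is an assumption about how lenient the check should be. It verifies syntax only, not the fields a fine-tuning job expects.

```ruby
require "json"
require "stringio"

# Hypothetical validator: returns [line_number, message] pairs for every
# non-blank line that is not a JSON object. Reads line by line, so it
# also works on an open File without loading everything at once.
def jsonl_errors(io)
  errors = []
  io.each_line.with_index(1) do |line, number|
    next if line.strip.empty?

    begin
      value = JSON.parse(line)
      errors << [number, "expected a JSON object"] unless value.is_a?(Hash)
    rescue JSON::ParserError => e
      errors << [number, "invalid JSON: #{e.message}"]
    end
  end
  errors
end

sample = StringIO.new(<<~JSONL)
  {"prompt": "What is 2+2?", "completion": "4"}
  this line is not JSON
JSONL

jsonl_errors(sample).each do |number, message|
  puts "line #{number}: #{message}"
end
```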

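Error Handling

Rescue API errors separately from local file errors so that each kind of failure gets an appropriate message:
require "pathname"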

begin
  file_object = client.files.create(
    file: Pathname("training.jsonl"),
    purpose: "fine-tune"
  )
  puts "Uploaded: #{file_object.id}"
rescue OpenAI::Errors::APIError => e
  puts "Upload failed: #{e.message}"
  puts "Status: #{e.status}" if e.respond_to?(:status)
rescue Errno::ENOENT
  puts "File not found"
rescue StandardError => e
  puts "Unexpected error: #{e.message}"
end
