I/O Interfaces
Arrow’s I/O interfaces are abstract classes for reading data from, and writing data to, various sources and destinations.
Core Interfaces
FileInterface
Base interface for all file-like objects.
Close
Status
Closes the stream cleanly. For writable streams, attempts to flush pending data before releasing resources. After Close(), closed() returns true
CloseAsync
Future<>
Closes the stream asynchronously. By default, submits a synchronous Close() call to the I/O thread pool
Abort
Status
Closes the stream abruptly without guaranteeing that pending data is flushed; it merely releases the underlying resources
Tell
Result<int64_t>
Returns the current position in the stream
closed
bool
Returns whether the stream is closed
mode
FileMode::type
Returns the file mode (READ, WRITE, or READWRITE)
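The difference between a clean close and an abrupt one can be modeled with a small stand-in class. This is only a sketch: ToySink and all of its members are hypothetical and not part of Arrow. It mimics the contract described above, assuming a stream that buffers writes in memory.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a writable stream; not an Arrow class.
class ToySink {
 public:
  // Buffers data in memory; nothing is committed until a flush happens.
  void Write(const std::string& data) {
    assert(!closed_ && "write on closed stream");
    pending_ += data;
  }

  // Clean close: flush pending data first, then release resources.
  void Close() {
    if (closed_) return;
    committed_ += pending_;
    pending_.clear();
    closed_ = true;
  }

  // Abrupt close: this model drops pending data. A real stream only
  // promises that resources are released, not what happens to the data.
  void Abort() {
    if (closed_) return;
    pending_.clear();
    closed_ = true;
  }

  bool closed() const { return closed_; }
  const std::string& committed() const { return committed_; }

 private:
  std::string pending_;
  std::string committed_;
  bool closed_ = false;
};
```

In this model, writing "abc" and calling Close() leaves "abc" committed, while the same write followed by Abort() commits nothing. Both paths leave closed() returning true.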
InputStream
Interface for sequential reading.
class InputStream : virtual public FileInterface, virtual public Readable
Read
Result<int64_t>
Maximum number of bytes to read
Reads at most nbytes from the current position into out. Returns the number of bytes actually read, which can be fewer than nbytes
Read
Result<std::shared_ptr<Buffer>>
Maximum number of bytes to read
Reads at most nbytes from the current position. May avoid a memory copy in some cases. Returns a Buffer containing the data
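Because a read returns at most nbytes, callers that need an exact amount typically loop until the request is satisfied or the stream ends. The sketch below shows that loop against a hypothetical ReadFn callable with the same shape as Read(nbytes, out). ReadFully and ChunkySource are illustrative names, not Arrow APIs, and the sketch assumes a return value of 0 means end of stream.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <functional>
#include <string>
#include <utility>

// Hypothetical callable shaped like Read(nbytes, out): copies at most
// `nbytes` into `out` and returns the count copied (0 at end of stream).
using ReadFn = std::function<int64_t(int64_t nbytes, void* out)>;

// Keeps calling `read` until `nbytes` have arrived or the stream ends.
// Returns the total number of bytes stored in `out`.
int64_t ReadFully(const ReadFn& read, int64_t nbytes, void* out) {
  assert(nbytes >= 0);
  auto* dst = static_cast<uint8_t*>(out);
  int64_t total = 0;
  while (total < nbytes) {
    const int64_t got = read(nbytes - total, dst + total);
    if (got <= 0) break;  // end of stream
    total += got;
  }
  return total;
}

// Hypothetical source that never returns more than 3 bytes per call,
// used to exercise the short-read path.
struct ChunkySource {
  explicit ChunkySource(std::string d) : data(std::move(d)) {}

  int64_t Read(int64_t nbytes, void* out) {
    const int64_t remaining = static_cast<int64_t>(data.size() - pos);
    const int64_t n =
        std::min<int64_t>(std::min<int64_t>(nbytes, 3), remaining);
    std::memcpy(out, data.data() + pos, static_cast<std::size_t>(n));
    pos += static_cast<std::size_t>(n);
    return n;
  }

  std::string data;
  std::size_t pos = 0;
};
```

Wrapping a ChunkySource in a lambda and passing it to ReadFully shows the loop stitching several short reads into one full result, and returning a short count only once the source is exhausted.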
Advance
Status
Advances the stream position by the indicated number of bytes, skipping over them
Peek
Result<std::string_view>
Maximum number of bytes to peek
Returns a zero-copy string_view of the upcoming bytes without modifying the stream position. The view becomes invalid after any operation on the stream. May return NotImplemented on streams that do not support peeking
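The peek contract (look ahead without consuming) can be illustrated with a toy in-memory stream. ToyInput is a hypothetical class written for this sketch, not an Arrow type, and it assumes C++17 for std::string_view. In this model the backing string never changes, so the view happens to stay valid; real streams make no such promise.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <string_view>
#include <utility>

// Hypothetical in-memory stream; not an Arrow class.
class ToyInput {
 public:
  explicit ToyInput(std::string data) : data_(std::move(data)) {}

  // Returns a view of up to `nbytes` upcoming bytes. No copy is made
  // and the position is left unchanged.
  std::string_view Peek(int64_t nbytes) const {
    assert(nbytes >= 0);
    return std::string_view(data_).substr(pos_,
                                          static_cast<std::size_t>(nbytes));
  }

  // Copies out up to `nbytes` and advances the position.
  std::string Read(int64_t nbytes) {
    std::string out(Peek(nbytes));
    pos_ += out.size();
    return out;
  }

  int64_t Tell() const { return static_cast<int64_t>(pos_); }

 private:
  std::string data_;
  std::size_t pos_ = 0;
};
```

Peeking three bytes of "abcdef" yields "abc" while Tell() stays at 0. Only Read() moves the position, after which a later Peek() starts from the new offset.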
supports_zero_copy
bool
Returns true if the InputStream is capable of zero-copy Buffer reads
ReadMetadata
Result<std::shared_ptr<const KeyValueMetadata>>
Reads and returns stream metadata. If not supported, returns empty metadata or nullptr
ReadMetadataAsync
Future<std::shared_ptr<const KeyValueMetadata>>
I/O context for async operations
Reads stream metadata asynchronously
RandomAccessFile
Interface for random access reading.
class RandomAccessFile : public InputStream, public Seekable
GetSize
Result<int64_t>
Returns the total file size in bytes. Does not read or move the current position, so it is safe to call concurrently
Seek
Status
Seeks to the specified position in the file
ReadAt
Result<int64_t>
Maximum number of bytes to read
Reads data from the given position into out. Thread-safe and does not affect the current file position. Returns the number of bytes read
ReadAt
Result<std::shared_ptr<Buffer>>
Maximum number of bytes to read
Reads data from the given position. Returns a Buffer containing the data
ReadAsync
Future<std::shared_ptr<Buffer>>
I/O context for async operations
Reads data asynchronously from the given position
ReadManyAsync
std::vector<Future<std::shared_ptr<Buffer>>>
I/O context for async operations
ranges
const std::vector<ReadRange>&
required
Ranges to read
Requests multiple reads at once. The filesystem may optimize by coalescing or parallelizing reads. Returns one future per input range
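One way an implementation might coalesce reads is to merge ranges that sit close together, so that several small requests become one larger one. The sketch below is an assumption-laden illustration, not Arrow's actual algorithm. CoalesceRanges and the max_hole parameter are hypothetical, and ReadRange is redeclared locally with the same two fields so the example is self-contained.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Local copy of the two-field range struct, for a self-contained sketch.
struct ReadRange {
  int64_t offset;
  int64_t length;
};

// Merges ranges whose gap is at most `max_hole` bytes into one range.
// `max_hole` is a hypothetical tuning knob for this sketch.
std::vector<ReadRange> CoalesceRanges(std::vector<ReadRange> ranges,
                                      int64_t max_hole) {
  assert(max_hole >= 0);
  std::sort(ranges.begin(), ranges.end(),
            [](const ReadRange& a, const ReadRange& b) {
              return a.offset < b.offset;
            });
  std::vector<ReadRange> merged;
  for (const ReadRange& r : ranges) {
    if (r.length <= 0) continue;  // nothing to read
    if (!merged.empty()) {
      ReadRange& last = merged.back();
      const int64_t last_end = last.offset + last.length;
      if (r.offset - last_end <= max_hole) {
        // Close enough (or overlapping): extend the previous range.
        last.length = std::max(last_end, r.offset + r.length) - last.offset;
        continue;
      }
    }
    merged.push_back(r);
  }
  return merged;
}
```

Since one future is returned per input range, an implementation that coalesces this way would still need to slice each merged read back into the originally requested ranges. That step is omitted here.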
WillNeed
Status
ranges
const std::vector<ReadRange>&
required
Ranges that will be read soon
Hints that the given ranges may be read soon. Some implementations may prefetch the data, but no guarantee is made
Output Streams
OutputStream
Interface for sequential writing.
class OutputStream : virtual public FileInterface, public Writable
Write
Status
Writes the given data to the stream. The call always processes all of the bytes; depending on the stream, the data may be written immediately, buffered, or written asynchronously
Write
Status
data
const std::shared_ptr<Buffer>&
required
Buffer containing data to write
Writes the given data to the stream. Because the Buffer owns its memory, this overload can avoid a copy if buffering is required
Flush
Status
Flushes buffered bytes, if any
WritableFile
Interface for writable files with seeking.
class WritableFile : public OutputStream, public Seekable
WriteAt
Status
Writes data at the specified position
I/O Context
IOContext
Provides context for I/O operations including executor and memory pool.
Memory pool for allocations
Executor for async operations
Constructs an IOContext
executor
internal::Executor*
Returns the executor for async operations
external_id
int64_t
Returns the application-specific ID forwarded to executor task submissions
stop_token
StopToken
Returns the cancellation token
ReadRange
Specifies a range of bytes to read.
struct ReadRange {
  int64_t offset;
  int64_t length;
};
Convenience Functions
MakeInputStreamIterator
Creates an iterator over fixed-size blocks from an input stream.
Result<Iterator<std::shared_ptr<Buffer>>> MakeInputStreamIterator(
    std::shared_ptr<InputStream> stream,
    int64_t block_size);
stream
std::shared_ptr<InputStream>
required
Input stream to iterate over
block_size
int64_t
required
Size in bytes of each block
The iterator yields a fixed-size block on each Next() call, except for the last block, which may be smaller. Returns nullptr when the end of the stream is reached.
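The block-iteration behavior can be sketched with standard-library types only. BlockIterator and SplitIntoBlocks are hypothetical names for this illustration: the sketch wraps a std::istream rather than an Arrow InputStream, yields std::string blocks rather than Buffers, and signals the end with std::nullopt (C++17) where the real iterator returns nullptr.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <istream>
#include <optional>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the block iterator; not an Arrow class.
class BlockIterator {
 public:
  BlockIterator(std::istream& stream, int64_t block_size)
      : stream_(stream), block_size_(block_size) {
    assert(block_size > 0);
  }

  // Returns the next block, or std::nullopt once the stream is exhausted.
  std::optional<std::string> Next() {
    std::string block(static_cast<std::size_t>(block_size_), '\0');
    stream_.read(&block[0], static_cast<std::streamsize>(block_size_));
    const std::streamsize got = stream_.gcount();
    if (got <= 0) return std::nullopt;
    block.resize(static_cast<std::size_t>(got));  // last block may be short
    return block;
  }

 private:
  std::istream& stream_;
  int64_t block_size_;
};

// Usage: collects every block of `data` into a vector.
std::vector<std::string> SplitIntoBlocks(const std::string& data,
                                         int64_t block_size) {
  std::istringstream in(data);
  BlockIterator it(in, block_size);
  std::vector<std::string> blocks;
  while (auto block = it.Next()) blocks.push_back(std::move(*block));
  return blocks;
}
```

Splitting an 8-byte input with a block size of 3 yields two full blocks followed by a final 2-byte block, matching the "last block may be smaller" rule.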