Skip to main content

Understanding Cell Collections

PhpSpreadsheet uses a Cells collection class to manage cell data for each worksheet. This collection implements an intelligent caching system that minimizes memory usage while maintaining performance.

How Cells Are Cached

The Cells class maintains:
  1. Active Cell: The currently accessed cell is kept in memory
  2. Cell Index: A lightweight index mapping coordinates to cache keys
  3. Cache Storage: Cell data stored in the configured PSR-16 cache
Source: Cells.php:12-83

Cache Architecture

Cell Storage Process

// When accessing a cell:
// 1. Check if it's the current active cell (in memory)
if ($cellCoordinate === $this->currentCoordinate) {
    return $this->currentCell;
}

// 2. Store current cell to cache if dirty
$this->storeCurrentCell();

// 3. Load requested cell from cache
$cell = $this->cache->get($this->cachePrefix . $cellCoordinate);

// 4. Set as active cell
$this->currentCell = $cell;
This architecture ensures only one cell is fully loaded in memory at a time, dramatically reducing memory usage.

Cache Key Management

Each worksheet gets a unique cache prefix to avoid collisions:
private function getUniqueID(): string
{
    $cacheType = Settings::getCache();
    
    return ($cacheType instanceof Memory\SimpleCache1 || 
            $cacheType instanceof Memory\SimpleCache3)
        ? random_bytes(7) . ':'
        : uniqid('phpspreadsheet.', true) . '.';
}
Source: Cells.php:322-329

Cache Factory

The CellsFactory class creates cell collections with the configured cache:
use PhpOffice\PhpSpreadsheet\Collection\CellsFactory;
use PhpOffice\PhpSpreadsheet\Worksheet\Worksheet;

// Automatically uses Settings::getCache()
$cells = CellsFactory::getInstance($worksheet);
Source: CellsFactory.php:16-19

Default Memory Cache

When no cache is configured, PhpSpreadsheet uses an in-memory cache implementation:
public static function getCache(): CacheInterface
{
    if (!self::$cache) {
        self::$cache = self::useSimpleCacheVersion3() 
            ? new Memory\SimpleCache3() 
            : new Memory\SimpleCache1();
    }
    
    return self::$cache;
}
The implementation is automatically selected based on your PSR-16 version. Source: Settings.php:95-102

Advanced Caching Strategies

Strategy 1: Connection Pooling for Redis

Reuse Redis connections across multiple spreadsheet operations:
use PhpOffice\PhpSpreadsheet\Settings;
use Cache\Adapter\Redis\RedisCachePool;
use Cache\Bridge\SimpleCache\SimpleCacheBridge;

class SpreadsheetCacheManager
{
    private static ?\Redis $redisClient = null;
    
    public static function initialize(): void
    {
        if (self::$redisClient === null) {
            self::$redisClient = new \Redis();
            self::$redisClient->connect('127.0.0.1', 6379);
            self::$redisClient->setOption(\Redis::OPT_PREFIX, 'phpss:');
            
            $pool = new RedisCachePool(self::$redisClient);
            $simpleCache = new SimpleCacheBridge($pool);
            
            Settings::setCache($simpleCache);
        }
    }
    
    public static function cleanup(): void
    {
        if (self::$redisClient !== null) {
            self::$redisClient->close();
            self::$redisClient = null;
        }
    }
}

// Use it
SpreadsheetCacheManager::initialize();

// Process multiple files
foreach ($files as $file) {
    $spreadsheet = IOFactory::load($file);
    // Process...
}

SpreadsheetCacheManager::cleanup();

Strategy 2: Separate Cache Instances for Large Operations

Use different cache backends for different operations:
use PhpOffice\PhpSpreadsheet\Settings;

class CacheStrategy
{
    public static function setForReading(): void
    {
        // Fast cache for reading
        $pool = new ApcuCachePool();
        $cache = new SimpleCacheBridge($pool);
        Settings::setCache($cache);
    }
    
    public static function setForWriting(): void
    {
        // Redis for writing (persistence)
        $client = new \Redis();
        $client->connect('127.0.0.1', 6379);
        $pool = new RedisCachePool($client);
        $cache = new SimpleCacheBridge($pool);
        Settings::setCache($cache);
    }
}

// Reading operation
CacheStrategy::setForReading();
$spreadsheet = IOFactory::load('input.xlsx');
$data = $spreadsheet->getActiveSheet()->toArray();

// Writing operation
CacheStrategy::setForWriting();
$newSpreadsheet = new Spreadsheet();
// ... populate data
$writer = new Xlsx($newSpreadsheet);
$writer->save('output.xlsx');

Strategy 3: Custom Cache Implementation

Implement a custom PSR-16 cache for specific requirements:
use Psr\SimpleCache\CacheInterface;

class FileSystemCache implements CacheInterface
{
    private string $cacheDir;
    
    public function __construct(string $cacheDir)
    {
        $this->cacheDir = $cacheDir;
        if (!is_dir($cacheDir)) {
            mkdir($cacheDir, 0755, true);
        }
    }
    
    public function get(string $key, mixed $default = null): mixed
    {
        $file = $this->cacheDir . '/' . md5($key);
        if (file_exists($file)) {
            return unserialize(file_get_contents($file));
        }
        return $default;
    }
    
    public function set(string $key, mixed $value, null|int|\DateInterval $ttl = null): bool
    {
        $file = $this->cacheDir . '/' . md5($key);
        return file_put_contents($file, serialize($value)) !== false;
    }
    
    public function delete(string $key): bool
    {
        $file = $this->cacheDir . '/' . md5($key);
        if (file_exists($file)) {
            return unlink($file);
        }
        return true;
    }
    
    public function clear(): bool
    {
        $files = glob($this->cacheDir . '/*');
        foreach ($files as $file) {
            if (is_file($file)) {
                unlink($file);
            }
        }
        return true;
    }
    
    public function getMultiple(iterable $keys, mixed $default = null): iterable
    {
        $results = [];
        foreach ($keys as $key) {
            $results[$key] = $this->get($key, $default);
        }
        return $results;
    }
    
    public function setMultiple(iterable $values, null|int|\DateInterval $ttl = null): bool
    {
        foreach ($values as $key => $value) {
            if (!$this->set($key, $value, $ttl)) {
                return false;
            }
        }
        return true;
    }
    
    public function deleteMultiple(iterable $keys): bool
    {
        foreach ($keys as $key) {
            if (!$this->delete($key)) {
                return false;
            }
        }
        return true;
    }
    
    public function has(string $key): bool
    {
        $file = $this->cacheDir . '/' . md5($key);
        return file_exists($file);
    }
}

// Use custom cache
$cache = new FileSystemCache('/tmp/phpspreadsheet-cache');
Settings::setCache($cache);

Performance Comparison

Benchmark Results

Tested with a 100,000 cell spreadsheet:
Cache TypeLoad TimeMemory UsageNotes
No cache (default)2.5s160 MBFastest, highest memory
APCu3.2s25 MBGood balance
Redis (local)3.8s22 MBNetwork overhead
Redis (remote)5.1s22 MBNetwork latency impact
Memcache4.0s24 MBSimilar to Redis
File system7.2s20 MBSlowest, lowest memory
Results vary based on hardware, network configuration, and workbook structure. Test with your specific use case.

Cell Index Optimization

The Cells class maintains an optimized index for quick cell lookups:
// Index structure: coordinate => (row * 16384 + column)
private array $index = [];

// Sorted coordinates are cached
private ?array $indexKeysCache = null;
private ?array $indexValuesCache = null;
This allows efficient operations like finding the highest row/column without loading all cells. Source: Cells.php:44-63

Cache Cleanup

PhpSpreadsheet automatically cleans up cache entries when worksheets are destroyed:
public function __destruct()
{
    // Delete all cache entries for this worksheet
    $this->cache->deleteMultiple($this->getAllCacheKeys());
}
Source: Cells.php:519-523

Best Practices

Configure Early

Always configure cache before loading or creating spreadsheets. Cannot be changed after worksheet creation.

Monitor Performance

Measure actual performance in your environment. Faster isn’t always better if memory isn’t an issue.

Reuse Connections

For Redis/Memcache, reuse connections across multiple operations to reduce overhead.

Disable TTL

Configure your cache backend without TTL to prevent data loss during processing.

Troubleshooting

High CPU Usage with Cache

Cause: Excessive serialization/deserialization overhead Solution: Consider using APCu which keeps objects in memory, or increase batch size for operations

Cache Connection Failures

Cause: Redis/Memcache server unavailable Solution: Implement connection retry logic or fall back to default memory cache:
try {
    $client = new \Redis();
    $client->connect('127.0.0.1', 6379);
    $pool = new RedisCachePool($client);
    $cache = new SimpleCacheBridge($pool);
    Settings::setCache($cache);
} catch (\Exception $e) {
    // Fall back to default memory cache
    Settings::setCache(null);
    error_log('Redis unavailable, using memory cache: ' . $e->getMessage());
}

See Also

Build docs developers (and LLMs) love