What Are Virtual Threads?
Virtual Threads are a revolutionary feature finalized in Java 21 (JEP 444) that dramatically improves concurrency by allowing millions of lightweight threads to run on a small number of OS threads. Traditional Java threads (platform threads) are heavyweight:
- Each thread maps 1:1 to an OS thread
- Typical limit: ~few thousand threads before performance degrades
- Stack size: ~1MB per thread
- Context switching incurs OS-level overhead
Virtual threads, by contrast, are lightweight:
- Millions of virtual threads can run on a handful of carrier threads
- Stack size: ~few KB, grows dynamically
- Managed by the JVM, not the OS
- Near-zero blocking cost
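The difference is easy to see in code. A minimal, self-contained sketch of creating virtual threads directly:

```java
import java.util.ArrayList;
import java.util.List;

public class VirtualThreadBasics {
    public static void main(String[] args) throws InterruptedException {
        // Start a single virtual thread; the JVM schedules it onto a carrier thread
        Thread vt = Thread.ofVirtual().name("demo-vt").start(() ->
                System.out.println("virtual? " + Thread.currentThread().isVirtual())); // prints "virtual? true"
        vt.join();

        // Spawning many is cheap: 10,000 threads, each a few KB, not ~1MB
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < 10_000; i++) {
            threads.add(Thread.ofVirtual().start(() -> { }));
        }
        for (Thread t : threads) t.join();
        System.out.println("joined " + threads.size() + " virtual threads");
    }
}
```

Doing the same with 10,000 platform threads would consume gigabytes of stack memory and hit OS thread limits.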
Why Virtual Threads Matter for HandsAI
HandsAI is a bridging API that makes potentially thousands of external HTTP calls per second when AI agents execute tools. Virtual threads are a perfect fit for several reasons.
1. High Concurrency
AI agents may call multiple tools in parallel:
- An agent may need to search GitHub issues AND send an email AND check the weather
- Traditional thread pools would queue requests or limit concurrency
- Virtual threads allow unlimited concurrent tool executions without thread starvation
2. Many External API Calls
Most of HandsAI’s work involves blocking I/O (waiting for HTTP responses):
- GitHub API responds in ~200ms
- Resend API responds in 150ms
- Tavily search takes ~500ms
While a virtual thread waits on any of these calls, it simply parks, and its carrier thread is freed to run other work.
3. Simplified Code
No need for complex reactive programming or async/await patterns. HandsAI uses simple, synchronous code.
4. GraalVM Native Image Compatibility
Virtual threads work seamlessly with GraalVM native compilation, maintaining HandsAI’s sub-1.5 second startup time.
How Virtual Threads Are Enabled in Spring Boot 3.5
HandsAI uses Spring Boot 3.5.4, which has built-in virtual thread support. Enabling them is trivial: VirtualThreadConfig.java does all the work.
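Based on the description below (a virtual thread executor marked `@Primary`), a sketch of what VirtualThreadConfig.java looks like; the bean method name is an assumption, not taken from the HandsAI source:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.Primary;

// Sketch of VirtualThreadConfig.java; the method/bean name is illustrative
@Configuration
public class VirtualThreadConfig {

    @Bean
    @Primary // Spring injects this executor wherever an Executor/ExecutorService is needed
    public ExecutorService virtualThreadExecutor() {
        // One new virtual thread per submitted task; no pool, no queue
        return Executors.newVirtualThreadPerTaskExecutor();
    }
}
```

Spring Boot 3.2+ can alternatively switch its own executors (web request handling, `@Async`) to virtual threads via the `spring.threads.virtual.enabled=true` property.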
- Creates a virtual thread executor
- Marks it as @Primary so Spring uses it by default
- Replaces the default thread pool for all async operations
What Executors.newVirtualThreadPerTaskExecutor() Does
- Spawns a new virtual thread for every submitted task
- No thread pool, no queue, no limits
- Threads are created on-demand and garbage collected when done
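A minimal, self-contained illustration of this behavior:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

public class PerTaskExecutorDemo {
    public static int run(int tasks) {
        AtomicInteger completed = new AtomicInteger();
        // try-with-resources: close() waits for all submitted tasks to finish (Java 21)
        try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < tasks; i++) {
                // Each submit spawns a fresh virtual thread; no queueing, no pool limit
                executor.submit(() -> { completed.incrementAndGet(); });
            }
        }
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println("completed " + run(1_000) + " tasks"); // prints "completed 1000 tasks"
    }
}
```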
Spring Boot Integration
Spring Boot automatically uses this executor for:
- @Async methods
- TaskExecutor beans
- Scheduled tasks (if configured)
- Web request handling (Tomcat/Jetty with virtual threads enabled)
Performance Benefits
1. Startup Time
HandsAI starts in under 1.5 seconds (GraalVM native image):
- Zero thread pool initialization overhead
- No pre-warming of worker threads
- Lazy thread creation only when needed
2. Memory Usage
Comparison of memory overhead for 10,000 concurrent tool executions:

| Thread Type | Stack Size | Total Memory |
|---|---|---|
| Platform Threads | ~1MB | ~10GB |
| Virtual Threads | ~few KB | ~50MB |
3. Throughput
Benchmark: execute 1,000 API tools concurrently (each simulates 200ms of network latency).
Platform Threads (200-thread pool):
- Throughput: ~1,000 requests/second
- Average latency: 200ms + queue time (~800ms total)
- Thread pool saturation: Yes
Virtual Threads:
- Throughput: ~5,000 requests/second
- Average latency: 200ms (no queueing)
- Thread pool saturation: No
4. Resource Efficiency
With platform threads, a 200-thread pool consumes ~200MB of stack memory even when idle; virtual threads are created on demand and cost almost nothing while parked.
Virtual Threads in HandsAI’s Architecture
Tool Execution Flow
When a tool is executed, the request runs on its own virtual thread, which:
- Validates the request
- Fetches authentication (may block on OAuth2 token fetch)
- Executes the HTTP call to the target API (blocks on network I/O)
- Returns the response
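The flow above, sketched as plain blocking code; class and method names are hypothetical, and the auth and network calls are simulated with sleeps:

```java
import java.time.Duration;

public class ToolExecutionFlow {
    // Hypothetical stand-ins for HandsAI's real request/response types
    record ToolRequest(String toolName) {}
    record ToolResponse(String body) {}

    static ToolResponse executeTool(ToolRequest request) throws InterruptedException {
        validate(request);                      // 1. validate the request
        String token = fetchAuthToken();        // 2. may block on an OAuth2 token fetch
        return callTargetApi(request, token);   // 3. blocks on network I/O
    }                                           // 4. response is returned to the caller

    static void validate(ToolRequest request) {
        if (request.toolName().isBlank()) throw new IllegalArgumentException("empty tool name");
    }

    static String fetchAuthToken() throws InterruptedException {
        Thread.sleep(Duration.ofMillis(10));    // simulated token endpoint latency
        return "token-abc";
    }

    static ToolResponse callTargetApi(ToolRequest request, String token) throws InterruptedException {
        Thread.sleep(Duration.ofMillis(20));    // simulated API latency; the virtual thread parks here
        return new ToolResponse("ok:" + request.toolName());
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(executeTool(new ToolRequest("github_search")).body()); // prints "ok:github_search"
    }
}
```

Every blocking step parks the virtual thread rather than tying up an OS thread, which is why this straight-line style scales.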
Concurrent Tool Execution
Example: an AI agent executes three tools simultaneously, each on its own virtual thread.
Tool Cache Concurrency
HandsAI uses ConcurrentHashMap in ToolCacheManager:
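The exact class isn't reproduced here, but the pattern is computeIfAbsent-style caching; a sketch with assumed field and method names:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ToolCacheManagerSketch {
    record ToolDefinition(String name, String endpoint) {}

    // Safe for thousands of virtual threads reading and writing concurrently
    private final Map<String, ToolDefinition> cache = new ConcurrentHashMap<>();

    public ToolDefinition getOrLoad(String toolName) {
        // computeIfAbsent runs the loader at most once per key, even under contention
        return cache.computeIfAbsent(toolName, this::loadDefinition);
    }

    private ToolDefinition loadDefinition(String toolName) {
        // Stand-in for fetching the real tool definition
        return new ToolDefinition(toolName, "https://example.invalid/" + toolName);
    }

    public static void main(String[] args) {
        var manager = new ToolCacheManagerSketch();
        ToolDefinition first = manager.getOrLoad("send_email");
        ToolDefinition second = manager.getOrLoad("send_email");
        System.out.println(first == second); // prints "true": the second lookup hits the cache
    }
}
```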
Dynamic Token Manager Concurrency
DynamicTokenManager also uses ConcurrentHashMap:
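A comparable sketch for token caching; the expiry logic and names here are assumptions, not the HandsAI implementation:

```java
import java.time.Instant;
import java.util.concurrent.ConcurrentHashMap;

public class DynamicTokenManagerSketch {
    record CachedToken(String value, Instant expiresAt) {
        boolean expired() { return Instant.now().isAfter(expiresAt); }
    }

    private final ConcurrentHashMap<String, CachedToken> tokens = new ConcurrentHashMap<>();

    public String tokenFor(String provider) {
        // compute() atomically replaces expired entries; concurrent callers never see a torn update
        return tokens.compute(provider, (key, cached) ->
                (cached == null || cached.expired()) ? fetchToken(key) : cached
        ).value();
    }

    private CachedToken fetchToken(String provider) {
        // Stand-in for a real OAuth2 client-credentials request
        return new CachedToken("tok-" + provider, Instant.now().plusSeconds(300));
    }

    public static void main(String[] args) {
        var manager = new DynamicTokenManagerSketch();
        System.out.println(manager.tokenFor("github")); // prints "tok-github"
    }
}
```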
Comparison with Traditional Thread Pools
Platform Thread Pool Approach
- Fixed Capacity: Only 200 concurrent requests, then queueing starts
- Tuning Complexity: Core pool size, max pool size, queue capacity need careful tuning
- Resource Waste: 200 threads × 1MB = 200MB even when idle
- Thread Starvation: If all threads are blocked on slow APIs, new requests queue
Virtual Thread Approach
- Unlimited Capacity: Millions of concurrent requests (limited only by memory)
- Zero Tuning: No pool sizes to configure
- Minimal Resources: ~1KB per virtual thread, created on-demand
- No Starvation: Blocking threads are cheap, carrier threads always available
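In code, the two approaches differ by a single line of setup:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ExecutorComparison {
    public static void main(String[] args) {
        // Platform thread pool: fixed capacity; requests queue once all 200 threads are busy
        ExecutorService pooled = Executors.newFixedThreadPool(200);

        // Virtual threads: a fresh, cheap thread per task; nothing to size or tune
        ExecutorService virtualThreads = Executors.newVirtualThreadPerTaskExecutor();

        pooled.shutdown();
        virtualThreads.shutdown();
        System.out.println("both executors created"); // prints "both executors created"
    }
}
```

Everything downstream of the executor choice (the task code itself) stays identical, which is what makes the migration cheap.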
Real-World Use Case
Imagine an AI agent executing a complex workflow:
- Search for similar GitHub issues (500ms)
- Check if issue already exists (300ms)
- Create new issue (400ms)
- Send email notification (200ms)
- Log to analytics (100ms)
Sequential Execution (Traditional)
Parallel Execution (Virtual Threads)
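Both strategies can be sketched together. Sequential latency is the sum of the steps (~1500ms at the figures above); parallel latency is the slowest step (~500ms). The demo below simulates the steps with sleeps scaled down 10x to keep it fast; in the real workflow some steps depend on each other, so only the independent ones could truly run in parallel:

```java
import java.time.Duration;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class WorkflowDemo {
    static String step(String name, long millis) throws InterruptedException {
        Thread.sleep(Duration.ofMillis(millis)); // simulated tool latency
        return name;
    }

    public static void main(String[] args) throws InterruptedException, ExecutionException {
        List<Callable<String>> steps = List.of(
                () -> step("search_issues", 50),
                () -> step("check_duplicate", 30),
                () -> step("create_issue", 40),
                () -> step("send_email", 20),
                () -> step("log_analytics", 10));

        // Sequential: total latency is the SUM of all steps
        long start = System.nanoTime();
        for (Callable<String> s : steps) {
            try { s.call(); } catch (Exception e) { throw new RuntimeException(e); }
        }
        long sequentialMs = (System.nanoTime() - start) / 1_000_000;

        // Parallel on virtual threads: total latency is the MAX single step
        start = System.nanoTime();
        try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (Future<String> f : executor.invokeAll(steps)) f.get();
        }
        long parallelMs = (System.nanoTime() - start) / 1_000_000;

        System.out.println("sequential=" + sequentialMs + "ms parallel=" + parallelMs + "ms");
    }
}
```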
Best Practices with Virtual Threads
Do:
- Use blocking I/O freely: don’t avoid blocking calls (HTTP, database, file I/O)
- Spawn liberally: create virtual threads without worrying about limits
- Use ConcurrentHashMap: thread-safe collections work great with virtual threads
- Keep code simple: no need for reactive streams or callbacks
Don’t:
- Pool virtual threads: pooling defeats the one-thread-per-task model
- Hold locks around blocking calls: in Java 21, a synchronized block around I/O pins the virtual thread to its carrier
- Overuse ThreadLocal: with millions of threads, per-thread caches multiply memory usage
GraalVM Native Image Compatibility
Virtual threads work seamlessly with GraalVM native compilation. HandsAI compiles to a native executable with virtual thread support.
Native Image Startup with Virtual Threads
Monitoring Virtual Threads
You can observe virtual thread behavior with JVM tools.
JFR (Java Flight Recorder)
JFR records:
- Virtual thread creation rates
- Carrier thread utilization
- Pinning events (when virtual threads can’t unmount)
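You can also check from code whether you are on a virtual thread, which is handy in log statements:

```java
public class ThreadInspection {
    public static void main(String[] args) throws InterruptedException {
        Runnable report = () -> System.out.println(
                Thread.currentThread() + " virtual=" + Thread.currentThread().isVirtual());

        report.run(); // main runs on a platform thread: virtual=false
        Thread.ofVirtual().name("tool-exec-1").start(report).join(); // virtual=true
    }
}
```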
JConsole / VisualVM
Virtual threads appear as regular threads, but with names like VirtualThread[#123]/runnable@ForkJoinPool-1-worker-1.
Future Enhancements
Structured Concurrency (JEP 453)
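Under JEP 453 (the Java 21 preview that evolved from JEP 428), fanning out subtasks looks roughly like this; the helper methods are hypothetical, and the code requires --enable-preview:

```java
// Java 21 preview API (JEP 453); compile and run with --enable-preview
try (var scope = new StructuredTaskScope.ShutdownOnFailure()) {
    var issues  = scope.fork(() -> searchGitHubIssues(query)); // hypothetical helper
    var weather = scope.fork(() -> checkWeather(city));        // hypothetical helper

    scope.join().throwIfFailed(); // wait for all subtasks; a failure cancels the rest

    handleResults(issues.get(), weather.get());
}
```

The scope guarantees that no subtask outlives the block, so cancellation and error handling stay local.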
Structured concurrency, still a preview feature in Java 21, will make parallel tool execution even cleaner once finalized.
Summary
HandsAI leverages Java 21 Virtual Threads to:
- Handle unlimited concurrency: Millions of tool executions without thread pools
- Simplify code: Blocking I/O is cheap and natural
- Maintain fast startup: Sub-1.5 second boot time (GraalVM native)
- Optimize memory: ~1KB per virtual thread vs. ~1MB for platform threads
- Scale effortlessly: No thread pool tuning or reactive complexity
Next Steps
- MCP Protocol: See how MCP leverages virtual threads for tool execution
- Tool Registry: Understand concurrent tool caching
- Authentication: Learn about concurrent OAuth2 token fetching
- Performance Tuning: Advanced performance optimization tips