Add object pooling for request/response DTOs #4
Open
Labels
enhancement: New feature or request
Description
Summary
Implement object pooling for frequently allocated request and response objects to reduce GC pressure and improve performance in high-throughput scenarios.
Current Behavior
Every API call creates new objects:
var request = new ChatCompletionRequest
{
    Model = "model",
    Messages = new List<ChatMessage> { ... } // New allocation
};
// Request object becomes garbage after use
Proposed Solution
Use Microsoft.Extensions.ObjectPool or a custom pooling implementation:
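using Microsoft.Extensions.ObjectPool;

public static class ChatCompletionRequestPool
{
    // ObjectPool.Create requires a class with a public parameterless constructor
    private static readonly ObjectPool<ChatCompletionRequest> Pool =
        ObjectPool.Create<ChatCompletionRequest>();

    public static ChatCompletionRequest Rent() => Pool.Get();

    public static void Return(ChatCompletionRequest request)
    {
        request.Reset(); // Clear for reuse (assumes a Reset() method is added to the DTO)
        Pool.Return(request);
    }
}

// Usage with IDisposable pattern.
// Assumes ChatCompletionRequest implements IDisposable and that its
// Dispose() calls ChatCompletionRequestPool.Return(this).
using var request = ChatCompletionRequestPool.Rent();
request.Model = "model";
request.Messages.Add(ChatMessage.User("Hello"));
var response = await client.CreateChatCompletionAsync(request);
Alternative: Struct-Based DTOs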
For simple requests, consider struct-based DTOs to avoid heap allocations entirely:
public readonly struct ChatCompletionRequestBuilder
{
    // Build request without allocations
}
Expected Benefits
- ~10% memory reduction in high-throughput scenarios
- Reduced GC pauses - Fewer Gen0/Gen1 collections
- Lower latency variance - More consistent performance
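One way to check these expectations before committing to pooling is to compare allocated bytes and Gen0 collection counts with and without reuse. The following is a rough sketch only: `AllocationProbe` and its `Measure` helper are hypothetical names, and a plain `List<string>` stands in for the real request DTO. A dedicated benchmarking tool would give more reliable numbers.

```csharp
using System;
using System.Collections.Generic;

public static class AllocationProbe
{
    // Runs `action` `iterations` times and reports the bytes allocated on the
    // current thread plus the number of Gen0 collections that occurred.
    public static (long Bytes, int Gen0) Measure(Action action, int iterations)
    {
        // Start from a settled heap so earlier garbage does not skew the counts.
        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();

        long bytesBefore = GC.GetAllocatedBytesForCurrentThread();
        int gen0Before = GC.CollectionCount(0);

        for (int i = 0; i < iterations; i++)
        {
            action();
        }

        return (GC.GetAllocatedBytesForCurrentThread() - bytesBefore,
                GC.CollectionCount(0) - gen0Before);
    }

    public static void Main()
    {
        const int iterations = 100_000;
        var reused = new List<string>(); // stand-in for a pooled request

        // Baseline: a new list per call, as in "Current Behavior".
        var fresh = Measure(() =>
        {
            var list = new List<string> { "Hello" };
            GC.KeepAlive(list);
        }, iterations);

        // Reuse: clear and refill a single instance, as a pool would allow.
        var pooled = Measure(() =>
        {
            reused.Clear();
            reused.Add("Hello");
        }, iterations);

        Console.WriteLine($"fresh:  {fresh.Bytes} bytes, {fresh.Gen0} Gen0 collections");
        Console.WriteLine($"reused: {pooled.Bytes} bytes, {pooled.Gen0} Gen0 collections");
    }
}
```

The absolute numbers depend on the runtime and GC mode, so only the relative difference between the two runs is meaningful.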
Priority
🟢 P2 - Medium Impact
Considerations
- Pooling adds complexity - only worth it for high-frequency usage
- Need to ensure proper reset/cleanup of pooled objects
- Consider making this opt-in for users who need it
- Thread-safety must be handled correctly
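The reset and thread-safety points above could be addressed roughly as follows. This is a sketch under assumptions, not the library's actual design: the `ChatCompletionRequest` shown here is a simplified stand-in, and `ChatCompletionRequestPolicy`, `PooledRequest`, and `Demo` are hypothetical names. It uses `PooledObjectPolicy<T>` and `DefaultObjectPool<T>` from Microsoft.Extensions.ObjectPool so that cleanup happens in one place whenever an instance is returned.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.Extensions.ObjectPool;

// Simplified stand-in for the real request DTO (assumption).
public sealed class ChatCompletionRequest
{
    public string Model { get; set; } = "";
    public List<string> Messages { get; } = new List<string>();
}

// Centralizes reset logic: the pool calls Return() on every returned instance.
public sealed class ChatCompletionRequestPolicy : PooledObjectPolicy<ChatCompletionRequest>
{
    public override ChatCompletionRequest Create() => new ChatCompletionRequest();

    public override bool Return(ChatCompletionRequest request)
    {
        request.Model = "";
        request.Messages.Clear(); // keeps the list's backing array for reuse
        return true;              // returning false tells the pool to drop the instance
    }
}

// Disposable lease so that `using` hands the request back to the pool.
public readonly struct PooledRequest : IDisposable
{
    private readonly ObjectPool<ChatCompletionRequest> _pool;

    public ChatCompletionRequest Value { get; }

    public PooledRequest(ObjectPool<ChatCompletionRequest> pool)
    {
        _pool = pool;
        Value = pool.Get();
    }

    public void Dispose() => _pool.Return(Value);
}

public static class ChatCompletionRequestPool
{
    private static readonly ObjectPool<ChatCompletionRequest> Pool =
        new DefaultObjectPool<ChatCompletionRequest>(new ChatCompletionRequestPolicy());

    public static PooledRequest Rent() => new PooledRequest(Pool);
}

public static class Demo
{
    public static void Run()
    {
        using (var lease = ChatCompletionRequestPool.Rent())
        {
            lease.Value.Model = "model";
            lease.Value.Messages.Add("Hello");
            // Send lease.Value here. Do not keep a reference after this block:
            // the instance is reset and may be handed to another caller.
        }
    }
}
```

`DefaultObjectPool<T>` is intended to be safe for concurrent `Get` and `Return` calls, but a rented instance should only be used by one caller at a time. Because `PooledRequest` is a struct, copying a lease and disposing both copies would return the same instance twice, so the lease should stay in a single `using` scope.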