How We Built Our Multi-Model Sandbox
The Abstrakt Sandbox lets you generate content with multiple AI models simultaneously. Here's how we engineered it.
The Challenge
Users wanted to run a single prompt against several models at once, watch results stream in as each model finishes, and see what it costs as they go.
Architecture Overview
Request Flow
1. User submits prompt + selected models
2. Backend spawns parallel jobs
3. Each job streams progress via SSE
4. Results aggregate in real-time
5. Frontend renders as they arrive
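Step 5 can be sketched from the client side: read the SSE stream and parse each `data:` frame as JSON. This is an illustrative sketch, not our production code, and `/api/sandbox/stream` is a placeholder path.

```javascript
// Parse complete SSE frames out of a text buffer. Returns the parsed
// events plus any trailing partial frame to carry into the next chunk.
function parseSSEChunk(buffer) {
  const events = [];
  const frames = buffer.split("\n\n");
  const rest = frames.pop(); // last element may be an incomplete frame
  for (const frame of frames) {
    for (const line of frame.split("\n")) {
      if (line.startsWith("data: ")) {
        events.push(JSON.parse(line.slice("data: ".length)));
      }
    }
  }
  return { events, rest };
}

// Illustrative consumer: stream the endpoint and render each update as
// it arrives. The URL and the shape of `render` are assumptions.
async function consume(url, render) {
  const res = await fetch(url);
  const reader = res.body.getReader();
  const decoder = new TextDecoder();
  let buffer = "";
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });
    const { events, rest } = parseSSEChunk(buffer);
    buffer = rest;
    events.forEach(render);
  }
}
```

Buffering the trailing partial frame matters because network chunks rarely align with SSE frame boundaries.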
Key Components
Job Queue: We use an in-memory queue with Redis fallback for production scale. Jobs are processed in parallel, respecting per-model rate limits.
```javascript
// Dispatch all selected models in parallel. allSettled (rather than
// Promise.all) keeps one rejected job from failing the whole batch.
const results = await Promise.allSettled(
  models.map(model =>
    subscribeModel(model.endpoint, input)
  )
);
```

Server-Sent Events: Real-time updates without WebSocket complexity.
```javascript
// Relay job updates to the client as SSE frames ("data: ...\n\n").
const encoder = new TextEncoder();
const stream = new ReadableStream({
  async start(controller) {
    for await (const update of jobStream) {
      controller.enqueue(
        encoder.encode(`data: ${JSON.stringify(update)}\n\n`)
      );
    }
    controller.close(); // end the stream once the job finishes
  }
});
```

Performance Optimizations
Challenges We Solved
Rate Limit Coordination
Different providers have different limits. We implemented adaptive throttling that backs off gracefully.
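One simple form of adaptive throttling, sketched here under assumptions (the base delay, cap, and class name are illustrative, not our exact policy): capped exponential backoff per provider, where each rate-limit response doubles the wait and each success relaxes it.

```javascript
// Capped exponential backoff: delay doubles with each consecutive
// rate-limit hit, up to a per-provider ceiling.
function backoffDelayMs(attempt, baseMs = 250, capMs = 8000) {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

// Per-provider throttle state: bump on 429, relax on success.
class AdaptiveThrottle {
  constructor() { this.attempt = 0; }
  onRateLimited() { this.attempt += 1; }
  onSuccess() { this.attempt = Math.max(0, this.attempt - 1); }
  currentDelayMs() { return backoffDelayMs(this.attempt); }
}
```

Decaying the counter on success (rather than resetting it) avoids oscillating between full speed and full backoff when a provider is near its limit.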
Error Isolation
One failing model shouldn't break the whole batch. Each job runs in isolation with independent error handling.
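In practice that means each job catches its own errors and reports a per-model outcome instead of throwing. A minimal sketch, where `callModel` stands in for the real dispatch function:

```javascript
// Run one model's job in isolation: a failure becomes a result record,
// never an exception that could sink sibling jobs.
async function runIsolated(model, callModel, input) {
  try {
    const output = await callModel(model, input);
    return { model, ok: true, output };
  } catch (err) {
    return { model, ok: false, error: String(err) };
  }
}

// The batch always resolves, with one entry per model.
async function runBatch(models, callModel, input) {
  return Promise.all(models.map(m => runIsolated(m, callModel, input)));
}
```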
Cost Tracking
Real-time credit deduction with rollback on failure. Users see costs as they generate.
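The billing flow can be sketched as deduct-first, refund-on-failure. This is a simplification under assumptions: `CreditLedger` is an illustrative in-memory stand-in for the real billing store.

```javascript
// Minimal in-memory credit ledger: deduct up front, refund on failure.
class CreditLedger {
  constructor(balance) { this.balance = balance; }
  deduct(amount) {
    if (amount > this.balance) throw new Error("insufficient credits");
    this.balance -= amount;
  }
  refund(amount) { this.balance += amount; }
}

// Charge before generating; roll the charge back if the job fails.
async function generateWithBilling(ledger, cost, generate) {
  ledger.deduct(cost);
  try {
    return await generate();
  } catch (err) {
    ledger.refund(cost);
    throw err;
  }
}
```

Charging up front means a user can never stream more output than their balance covers; the refund path keeps failed jobs from costing anything.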
Try It Yourself
Visit the Sandbox to experience multi-model generation firsthand!