Updates

Smarter Context Windows

June 20, 2026

We’ve significantly expanded and refined how Optron AI handles context across long or multi-turn conversations. This update makes the model far more capable of maintaining coherence when working through complex, multi-step tasks. Here is what has changed:

Context window capacity has been extended, allowing the model to retain and reason over significantly longer conversation histories without degradation in output quality.
Relevance weighting across the context has been improved, so earlier instructions and key facts are no longer silently deprioritised as conversations grow.
Memory stitching between tool calls and model responses is now more reliable, reducing cases where the model loses track of prior outcomes mid-task.
Prompt compression has been optimised to prioritise high-signal tokens, keeping performance strong even when the context is dense with structured data.

These changes are live for all plans. No action is required on your end; existing workflows pick up the improvements automatically when their next session begins.

Faster Inference Engine

June 19, 2026

This release focuses on substantial infrastructure improvements to how Optron AI generates responses. The goal was to reduce latency across all workloads without any trade-off in output quality. Key highlights from this update include:

Average response latency has been reduced by up to 40% on standard prompt lengths, with the most significant gains seen on prompts between 500 and 2000 tokens.
Streaming output now starts sooner, so the first token reaches the client earlier. This is a noticeable improvement for real-time UI applications built on the API.
Batch processing throughput has been increased by roughly 2x on high-volume workloads, reducing queue wait times during peak usage periods.
Model warmup time on cold starts has been shortened, which benefits users on auto-scaling deployments that spin up new instances frequently.

These performance improvements are rolling out across all regions this week. Benchmarks and detailed latency breakdowns are available in the developer documentation.
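
To check the first-token and overall latency claims against your own workload, one option is to time the stream on the client side. The sketch below is a minimal illustration, not the Optron AI client: `measure_stream` accepts any iterable of text chunks, and `fake_stream` is a hypothetical stand-in that simulates a streaming response with `time.sleep`. Swap in whatever streaming iterator your integration actually exposes.

```python
import time
from typing import Iterable, Iterator, Tuple


def measure_stream(chunks: Iterable[str]) -> Tuple[float, float, str]:
    """Return (seconds until first chunk, total seconds, concatenated text)."""
    start = time.perf_counter()
    first_chunk_at = None
    parts = []
    for chunk in chunks:
        if first_chunk_at is None:
            # Record the moment the first chunk arrives.
            first_chunk_at = time.perf_counter()
        parts.append(chunk)
    end = time.perf_counter()
    # For an empty stream, report the total time as the time to first chunk.
    ttft = (first_chunk_at - start) if first_chunk_at is not None else end - start
    return ttft, end - start, "".join(parts)


def fake_stream(first_delay: float = 0.05, gap: float = 0.01) -> Iterator[str]:
    """Hypothetical stand-in for a streaming response (assumption, not a real API)."""
    time.sleep(first_delay)  # simulated wait before the first token
    for word in ("Hello", " ", "world"):
        yield word
        time.sleep(gap)  # simulated gap between tokens


if __name__ == "__main__":
    ttft, total, text = measure_stream(fake_stream())
    print(f"first chunk after {ttft * 1000:.0f} ms, total {total * 1000:.0f} ms: {text!r}")
```

Running the same measurement before and after the rollout reaches your region, on prompts of comparable length, gives a like-for-like comparison with the published figures.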
