Automatic chat compression

If the chat with the agent grows too large, it risks hitting the context window limit: the model's performance degrades, while its cost and response time increase. To prevent this, automatic chat compression reduces the chat size once it exceeds a threshold. Compression is triggered automatically and makes no additional requests to the LLM, so it costs you neither time nor tokens.
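The product's actual compression logic is internal, but the idea of shrinking a chat without extra LLM calls can be sketched as follows. This is an illustrative example only: the `approx_tokens` heuristic, the `compress_chat` helper, and the message format are assumptions, not the real implementation.

```python
def approx_tokens(text: str) -> int:
    # Rough heuristic: roughly 4 characters per token.
    return max(1, len(text) // 4)

def compress_chat(messages: list[dict], max_tokens: int) -> list[dict]:
    """Drop the oldest non-system messages until the chat fits the budget.

    Runs entirely locally: no LLM requests are made, so compression
    is instant and consumes no tokens.
    """
    kept = list(messages)
    def total(msgs: list[dict]) -> int:
        return sum(approx_tokens(m["content"]) for m in msgs)
    i = 0
    while total(kept) > max_tokens and i < len(kept):
        if kept[i].get("role") == "system":
            i += 1           # always preserve system instructions
        else:
            kept.pop(i)      # drop the oldest user/assistant turn
    return kept

chat = [
    {"role": "system", "content": "You are a coding agent."},
    {"role": "user", "content": "Explain the build error. " * 50},
    {"role": "assistant", "content": "The error comes from... " * 50},
    {"role": "user", "content": "Now fix it."},
]
compressed = compress_chat(chat, max_tokens=100)
# The system prompt and the most recent turn survive; older bulky turns are dropped.
```

Real implementations typically keep the system prompt and the most recent turns verbatim, since those carry the instructions and the active task.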

You can configure when automatic compression is triggered, and you can also start compression manually at any time directly from the chat interface, as shown in the video.