[Live] run_live() silently terminates on max_llm_calls exhaustion — no event, no exception, no resumption path #5494



## 🔴 Required Information

**Describe the Bug:**

When `run_live()` exhausts the `max_llm_calls` limit (e.g., 25), the async generator completes silently — no exception is raised, no `session_ended` event is yielded, and no indication is given to the application about *why* the session stopped producing events. The session becomes a dead-end: the model stops responding, but the WebSocket to Gemini may still be open, and the `LiveRequestQueue` continues accepting audio.

This is fundamentally different from a WebSocket timeout (code 1000/1011), where session resumption handles can reconnect the session. With `max_llm_calls` exhaustion:

1. **No event is emitted** — the `async for event in runner.run_live(...)` loop simply ends.
2. **No exception is raised** — unlike connection drops, there is no `APIError` to catch.
3. **Session resumption does not apply** — the connection didn't drop; the ADK's internal call counter simply reached the limit.
4. **The session state is intact**, but we have found no supported way to continue the session itself.

The only workaround we've found is to detect that `run_live()` has ended, notify the frontend via a custom `session_ended` WebSocket message, and have the frontend create an entirely new session (new `session_id`, re-initialise state).

**Steps to Reproduce:**

1. Configure a live agent with tools that trigger multiple LLM calls per user turn (e.g., a data retrieval tool that requires schema lookup → query → retry → response).
2. Set `RunConfig(max_llm_calls=25, streaming_mode=StreamingMode.BIDI, ...)`.
3. Start a `run_live()` session and interact normally.
4. After several multi-tool turns, the 25-call limit is reached.
5. `run_live()` generator completes — no more events are yielded.
6. The user's next audio input gets no response. The session is dead.
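
To estimate how soon step 4 is reached, here is a back-of-the-envelope sketch. It assumes, purely for illustration, that every user turn costs a fixed number of LLM calls, that the budget allows exactly `max_llm_calls` calls, and that the counter is never reset during one `run_live()` invocation. `first_incomplete_turn` is a hypothetical helper, not part of ADK.

```python
def first_incomplete_turn(max_llm_calls: int, calls_per_turn: int) -> int:
    """Return the 1-based index of the first user turn that cannot finish.

    Assumes a fixed per-turn cost; real sessions vary from turn to turn.
    """
    if max_llm_calls <= 0 or calls_per_turn <= 0:
        raise ValueError("both arguments must be positive")
    # Turns that fit entirely inside the budget complete normally;
    # the one after them runs out of calls (possibly before it starts).
    return max_llm_calls // calls_per_turn + 1


# Schema lookup, query, retry, response: roughly 4 calls per turn.
print(first_incomplete_turn(max_llm_calls=25, calls_per_turn=4))
```

With four calls per turn and a limit of 25, six turns complete and the seventh is the one that goes unanswered, which matches how quickly we hit the limit in practice.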

**Expected Behavior:**

At minimum, one of the following:

1. **Yield a terminal event** when `max_llm_calls` is exhausted — e.g., an `Event` with a `session_ended` or `max_llm_calls_exhausted` field, so the application can distinguish this from a normal generator completion.
2. **Allow the session to be continued** by calling `run_live()` again with the same session, resetting the call counter (similar to how session resumption works for WebSocket timeouts).
3. **Raise a specific exception** (e.g., `MaxLlmCallsExhaustedError`) so the application can handle it distinctly from connection errors.
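
To illustrate option 3 from the application's side, here is a minimal sketch. Everything in it is a stand-in: `fake_run_live` only mimics the shape of `runner.run_live()` and charges one LLM call per event, `MaxLlmCallsExhaustedError` is the hypothetical name proposed above, and `consume` is our own helper. None of this is existing ADK behavior.

```python
import asyncio
from typing import AsyncIterator


class MaxLlmCallsExhaustedError(RuntimeError):
    """Hypothetical exception; the name comes from this issue, not from ADK."""


async def fake_run_live(events: list[str], max_llm_calls: int) -> AsyncIterator[str]:
    """Stand-in for runner.run_live(): one LLM call per event, for simplicity."""
    for calls_made, event in enumerate(events, start=1):
        if calls_made > max_llm_calls:
            raise MaxLlmCallsExhaustedError(
                f"limit of {max_llm_calls} LLM calls reached"
            )
        yield event


async def consume(events: list[str], max_llm_calls: int) -> tuple[list[str], str]:
    """Drain the stream and report why it stopped."""
    seen: list[str] = []
    try:
        async for event in fake_run_live(events, max_llm_calls):
            seen.append(event)
    except MaxLlmCallsExhaustedError:
        # The application can now react specifically to budget exhaustion.
        return seen, "max_llm_calls_exhausted"
    return seen, "completed"


if __name__ == "__main__":
    print(asyncio.run(consume(["a", "b", "c"], max_llm_calls=2)))
```

With a distinct exception (or an equivalent terminal event), the reason string could be forwarded to the frontend instead of a generic "session ended" message.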

**Observed Behavior:**

The `async for event in runner.run_live(...)` loop silently ends. No terminal event, no exception. The application has no way to distinguish "max_llm_calls exhausted" from "model chose not to respond" or "generator completed normally after a clean session."

**Our Current Workaround:**

We detect that the generator completed and send a custom WebSocket message to the frontend:

```python
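import json

# Runs inside our async WebSocket handler; runner, websocket, user_id,
# session_id, live_request_queue, and run_config are created earlier.
async for event in runner.run_live(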
    user_id=user_id,
    session_id=session_id,
    live_request_queue=live_request_queue,
    run_config=run_config,
):
    # ... handle events ...

# Generator completed — no way to know WHY.
# We assume max_llm_calls exhaustion and tell the frontend
# to create a new session.
await websocket.send_text(json.dumps({
    "type": "session_ended",
    "message": "Session ended. Please start a new session.",
}))
```

The frontend then creates a brand-new session with a new `session_id` and reconnects. This works, but:

- **All conversation context is lost** (unless we manually copy `session.state` to the new session).
- **The user experiences an interruption** — the voice session drops and restarts.
- **We can't distinguish** `max_llm_calls` exhaustion from other generator-completion scenarios.
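
For the first point, the manual state copy can be sketched roughly as follows. This assumes the old session's state is available as a plain `dict` (convert it first if it is not) and that keys with a `temp:` prefix are transient and should not be carried over; both are assumptions on our side. `carry_over_state` is a hypothetical helper, and it only preserves state, not the conversation history.

```python
import copy
from typing import Any


def carry_over_state(
    old_state: dict[str, Any],
    skip_prefixes: tuple[str, ...] = ("temp:",),
) -> dict[str, Any]:
    """Deep-copy state for a replacement session, dropping transient keys."""
    return {
        key: copy.deepcopy(value)
        for key, value in old_state.items()
        # str.startswith accepts a tuple and matches any of its prefixes.
        if not key.startswith(skip_prefixes)
    }


old_state = {"user:name": "Ada", "temp:scratch": [1], "cart": {"items": [1, 2]}}
new_state = carry_over_state(old_state)
print(sorted(new_state))
```

The resulting dictionary would then be supplied as the initial state when the replacement session is created, assuming the session service in use accepts one at creation time.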

**Questions:**

1. Is creating a new session the **only** workaround when `max_llm_calls` is exhausted?
2. Can `run_live()` be called again on the **same session** (same `session_id`) after `max_llm_calls` exhaustion to reset the counter and continue the conversation?
3. Would the team consider yielding a terminal event or raising a specific exception when the call limit is hit?

**Environment Details:**

- ADK Library Version: 1.29.0
- Desktop OS: Linux (Cloud Run) / Windows (local dev)
- Python Version: 3.11

**Model Information:**

- Are you using LiteLLM: No
- Which model: `gemini-live-2.5-flash-native-audio` (Vertex AI)

---



**Related Issues:**

- #4996 — Session resumption reconnection loop (covers WebSocket timeouts, not `max_llm_calls`)
- #4357 — Expose `live_session_resumption_update` in events (cross-connection resumption)
- #4587 — Intermittent WebSocket closure in `run_live`

**How often has this issue occurred?:**

Always (100%) — deterministic once the call counter reaches the limit.
