🔴 Required Information
Describe the Bug:
When run_live() exhausts the max_llm_calls limit (e.g., 25), the async generator completes silently: no exception is raised, no session_ended event is yielded, and no indication is given to the application about why the session stopped producing events. The session becomes a dead end: the model stops responding, but the WebSocket to Gemini may still be open, and the LiveRequestQueue continues accepting audio.
This is fundamentally different from a WebSocket timeout (code 1000/1011), where session resumption handles can reconnect the session. With max_llm_calls exhaustion:
- No event is emitted: the async for event in runner.run_live(...) loop simply ends.
- No exception is raised: unlike connection drops, there is no APIError to catch.
- Session resumption does not apply: the connection didn't drop; the ADK's internal call counter simply reached the limit.
- The session state is intact but the session itself cannot be continued.
The only workaround we've found is to detect that run_live() has ended, notify the frontend via a custom session_ended WebSocket message, and have the frontend create an entirely new session (new session_id, re-initialise state).
Steps to Reproduce:
- Configure a live agent with tools that trigger multiple LLM calls per user turn (e.g., a data retrieval tool that requires schema lookup → query → retry → response).
- Set RunConfig(max_llm_calls=25, streaming_mode=StreamingMode.BIDI, ...).
- Start a run_live() session and interact normally.
- After several multi-tool turns, the 25-call limit is reached.
- The run_live() generator completes: no more events are yielded.
- The user's next audio input gets no response. The session is dead.
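As a rough illustration of how quickly the limit can be reached, the arithmetic below assumes that each of the four steps in the example tool flow (schema lookup, query, retry, response) costs exactly one LLM call. That per-step cost is an assumption made for illustration; how the ADK counts calls in live mode is not confirmed here, and the helper name is hypothetical.

```python
def full_turns_before_exhaustion(max_llm_calls: int, calls_per_turn: int) -> tuple[int, int]:
    """Return (complete user turns, calls left over) for a fixed per-turn cost.

    Assumption: every user turn consumes the same number of LLM calls.
    """
    if calls_per_turn <= 0:
        raise ValueError("calls_per_turn must be positive")
    return divmod(max_llm_calls, calls_per_turn)


turns, leftover = full_turns_before_exhaustion(max_llm_calls=25, calls_per_turn=4)
print(turns, leftover)  # 6 1
```

Under that assumption the session would serve six full turns and then stop partway through the seventh, which is consistent with the session dying "after several multi-tool turns".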
Expected Behavior:
At minimum, one of the following:
- Yield a terminal event when max_llm_calls is exhausted, e.g., an Event with a session_ended or max_llm_calls_exhausted field, so the application can distinguish this from a normal generator completion.
- Allow the session to be continued by calling run_live() again with the same session, resetting the call counter (similar to how session resumption works for WebSocket timeouts).
- Raise a specific exception (e.g., MaxLlmCallsExhaustedError) so the application can handle it distinctly from connection errors.
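To make the exception option concrete, here is a minimal sketch of what application code could look like if such an error existed. MaxLlmCallsExhaustedError is the name proposed in this issue, not an existing ADK class, and fake_run_live is a hypothetical stand-in for runner.run_live() so that the example runs on its own.

```python
import asyncio
from typing import AsyncIterator


class MaxLlmCallsExhaustedError(Exception):
    """Hypothetical exception proposed in this issue; not part of the ADK."""


async def fake_run_live(limit: int) -> AsyncIterator[str]:
    """Stand-in for runner.run_live(): yields `limit` events, then raises."""
    for i in range(limit):
        yield f"event-{i}"
    raise MaxLlmCallsExhaustedError(f"max_llm_calls={limit} exhausted")


async def consume(limit: int) -> tuple[list[str], str]:
    """Collect events and report why the stream stopped."""
    seen: list[str] = []
    try:
        async for event in fake_run_live(limit):
            seen.append(event)
    except MaxLlmCallsExhaustedError:
        # The application can now react specifically to the call limit.
        return seen, "max_llm_calls_exhausted"
    return seen, "completed"


events, reason = asyncio.run(consume(3))
print(len(events), reason)  # 3 max_llm_calls_exhausted
```

With a dedicated exception type, the existing handling for connection errors would stay untouched and the limit case would get its own branch.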
Observed Behavior:
The async for event in runner.run_live(...) loop silently ends. No terminal event, no exception. The application has no way to distinguish "max_llm_calls exhausted" from "model chose not to respond" or "generator completed normally after a clean session."
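Until the library signals this itself, one application-side option is to wrap the event stream so that its end is always an explicit value rather than a loop that just stops. The sketch below is generic and does not import the ADK; StreamEnded and with_end_marker are hypothetical names. It can tell "ended without raising" apart from "ended with an exception", but it still cannot say whether a quiet end was caused by max_llm_calls.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, AsyncIterator, Optional


@dataclass
class StreamEnded:
    """Hypothetical app-level end marker; not an ADK type."""
    events_seen: int
    error: Optional[BaseException] = None


async def with_end_marker(stream: AsyncIterator[Any]) -> AsyncIterator[Any]:
    """Re-yield every event, then yield exactly one StreamEnded marker."""
    count = 0
    try:
        async for event in stream:
            count += 1
            yield event
    except Exception as exc:  # e.g. connection drops or API errors
        yield StreamEnded(events_seen=count, error=exc)
        return
    # Reached only when the wrapped generator finished without raising.
    yield StreamEnded(events_seen=count)


async def _demo() -> list[Any]:
    async def silent_stream() -> AsyncIterator[str]:
        yield "a"
        yield "b"

    return [item async for item in with_end_marker(silent_stream())]


result = asyncio.run(_demo())
print(result[-1])  # StreamEnded(events_seen=2, error=None)
```

In a real handler, the wrapped stream would be the runner.run_live(...) call, and the marker would be the point at which the frontend is notified.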
Our Current Workaround:
We detect that the generator completed and send a custom WebSocket message to the frontend:
    import json

    # Fragment from inside our async WebSocket handler.
    async for event in runner.run_live(
        user_id=user_id,
        session_id=session_id,
        live_request_queue=live_request_queue,
        run_config=run_config,
    ):
        pass  # ... handle events ...

    # The generator completed, but there is no way to know WHY.
    # We assume max_llm_calls exhaustion and tell the frontend
    # to create a new session.
    await websocket.send_text(json.dumps({
        "type": "session_ended",
        "message": "Session ended. Please start a new session.",
    }))
The frontend then creates a brand-new session with a new session_id and re-connects. This works, but:
- All conversation context is lost (unless we manually copy session.state to the new session).
- The user experiences an interruption: the voice session drops and restarts.
- We can't distinguish max_llm_calls exhaustion from other generator-completion scenarios.
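For the context-loss point, the manual state copy could be sketched roughly as follows. The state is treated as a plain mapping, build_handoff is a hypothetical helper, and skipping keys with a "temp:" prefix is an assumption about which entries are not worth carrying over. The call that actually creates the new session is left out because it depends on the session service in use.

```python
import uuid
from typing import Any, Mapping


def build_handoff(
    old_state: Mapping[str, Any],
    skip_prefixes: tuple[str, ...] = ("temp:",),
) -> tuple[str, dict[str, Any]]:
    """Return a fresh session id and the state entries to carry over.

    Assumption: keys starting with any of `skip_prefixes` are scratch data.
    """
    new_session_id = uuid.uuid4().hex
    carried = {
        key: value
        for key, value in old_state.items()
        if not key.startswith(skip_prefixes)
    }
    return new_session_id, carried


new_id, new_state = build_handoff({"user_name": "Ada", "temp:scratch": 1})
print(new_state)  # {'user_name': 'Ada'}
```

This preserves stored state only; the conversation history and the audio interruption noted above are not addressed by it.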
Questions:
- Is creating a new session the only workaround when max_llm_calls is exhausted?
- Can run_live() be called again on the same session (same session_id) after max_llm_calls exhaustion to reset the counter and continue the conversation?
- Would the team consider yielding a terminal event or raising a specific exception when the call limit is hit?
Environment Details:
- ADK Library Version: 1.29.0
- Desktop OS: Linux (Cloud Run) / Windows (local dev)
- Python Version: 3.11
Model Information:
- Are you using LiteLLM: No
- Which model: gemini-live-2.5-flash-native-audio (Vertex AI)
Related Issues:
How often has this issue occurred?:
Always (100%): deterministic once the call counter reaches the limit.