VertexAiSessionService.getSession() silently drops last event(s) due to strict isBefore() timestamp filtering #1173

@DevTomek-pl

🔴 Required Information

Describe the Bug:

VertexAiSessionService.getSession() uses a strict isBefore() comparison when filtering events against the session's updateTime. This causes the last event(s) to be silently dropped when the event's client-side timestamp is equal to or slightly after the server-side updateTime.

The filtering logic in VertexAiSessionService.filterEvents() (line ~240):

events.stream()
    .filter(event ->
        updateTimestamp == null
            || Instant.ofEpochMilli(event.timestamp()).isBefore(updateTimestamp))

Since event timestamps are set client-side (Instant.now() in Event.Builder.build()) and updateTime is set server-side by Vertex AI, clock skew between the ADK client JVM and the Vertex AI backend causes the last appended event(s) to be filtered out. In our observations, the client-side event timestamp can be over 100ms ahead of the server-side updateTime:

session.updateTime = 2026-04-29T11:00:05.940523Z  (epoch millis: 1777460405940)
event.timestamp    =                                (epoch millis: 1777460406103)
                                                     difference:          +163ms

This means the issue is not just a boundary/precision problem — !isAfter() would not fix it either. The fundamental problem is that filterEvents() compares timestamps from two different clock sources (client JVM vs Vertex AI server).
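To make the point concrete, here is a minimal, self-contained sketch that replays the two timestamps observed above against both comparisons. The class and method names (SkewDemo, keptByStrict, keptByInclusive) are hypothetical and only mirror the shape of the filter; this is not the ADK code itself.

```java
import java.time.Instant;

class SkewDemo {
    // Values copied from the observation above.
    static final Instant UPDATE_TIME = Instant.parse("2026-04-29T11:00:05.940523Z");
    static final long EVENT_MILLIS = 1777460406103L;

    // Same shape as the current filter: strict isBefore().
    static boolean keptByStrict(long eventMillis, Instant updateTime) {
        return Instant.ofEpochMilli(eventMillis).isBefore(updateTime);
    }

    // Inclusive variant: !isAfter(), i.e. "<=".
    static boolean keptByInclusive(long eventMillis, Instant updateTime) {
        return !Instant.ofEpochMilli(eventMillis).isAfter(updateTime);
    }

    public static void main(String[] args) {
        long skewMs = EVENT_MILLIS - UPDATE_TIME.toEpochMilli();
        System.out.println("skew (ms): " + skewMs);                                           // 163
        System.out.println("strict keeps event: " + keptByStrict(EVENT_MILLIS, UPDATE_TIME)); // false
        System.out.println("inclusive keeps event: " + keptByInclusive(EVENT_MILLIS, UPDATE_TIME)); // false
    }
}
```

Both comparisons drop the event, because the event timestamp is strictly later than updateTime.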

This is especially problematic in HITL (Human-in-the-Loop) tool approval flows: after Runner.runAsync() completes a resume with an adk_request_confirmation function response, the agent's final text response event has a timestamp >= updateTime and gets silently dropped. Subsequent getSession() calls return the session without the agent's answer.

Steps to Reproduce:

  1. Create a session and run an agent with a tool that requires HITL confirmation (beforeToolCallbackSync returning adk_request_confirmation)
  2. The agent pauses, waiting for user confirmation — a ToolConfirmationEvent is emitted
  3. Resume the agent by calling Runner.runAsync() with a FunctionResponse for adk_request_confirmation (confirmed = true)
  4. The agent processes the tool result and emits a final text response event
  5. Call sessionService.getSession(appName, userId, sessionId, Optional.empty()) to retrieve the session
  6. Observe that the agent's final text response event is missing from session.events()
  7. Call sessionService.listEvents(appName, userId, sessionId) separately — the event is present in the unfiltered list

Expected Behavior:

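getSession() should return all events that belong to the session, including the most recently appended event(s). At a minimum, the filtering should use an inclusive comparison (<= / !isAfter()) so that events with timestamps equal to updateTime are not dropped. As described above, that alone does not cover events whose client-side timestamp is later than updateTime, so the filter should not depend on comparing timestamps from two different clocks.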

Observed Behavior:

The last event(s) appended by Runner.runAsync() are silently filtered out by filterEvents() when their client-side timestamp is >= the server-side updateTime. This happens non-deterministically depending on clock skew between the ADK client JVM and the Vertex AI backend.

In our case, the agent's text response after a HITL tool approval is consistently missing from getSession() results but present in listEvents() results.

Environment Details:

  • ADK Library Version: 1.1.0 (com.google.adk:google-adk:1.1.0)
  • OS: Linux (production), macOS (development)
  • Java Version: 21

Model Information:

  • Model: gemini-2.5-pro (issue is model-independent — it's in the session service layer)

🟡 Optional Information

Regression:

N/A — this behavior appears to have been present since the filterEvents logic was introduced in VertexAiSessionService.

Logs:

No error logs are produced — the events are silently filtered. The only way to detect the issue is by comparing getSession().events() with listEvents().events().
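Since nothing is logged, a small diff helper can make the dropped events visible. The sketch below is only an illustration: it works on plain event ID strings so that it stays self-contained, and the names DroppedEventDetector and missingIds are hypothetical. In real code the IDs would be extracted from the events of both calls first, assuming the Event type exposes an ID accessor.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class DroppedEventDetector {
    /**
     * Returns the IDs that appear in the unfiltered listing but not in the
     * session, preserving the order of the unfiltered listing.
     */
    static List<String> missingIds(List<String> listedIds, List<String> sessionIds) {
        Set<String> present = new HashSet<>(sessionIds);
        List<String> missing = new ArrayList<>();
        for (String id : listedIds) {
            if (!present.contains(id)) {
                missing.add(id);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Hypothetical IDs: the listing has one event more than the session.
        List<String> fromListEvents = List.of("e1", "e2", "e3");
        List<String> fromGetSession = List.of("e1", "e2");
        System.out.println("Dropped: " + missingIds(fromListEvents, fromGetSession)); // Dropped: [e3]
    }
}
```

Logging a non-empty result at WARN level in a read path is one way to surface the problem until the filter itself is fixed.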

Minimal Reproduction Code:

// Setup: agent with HITL tool requiring confirmation
// After the agent requests confirmation during runAsync():

// Step 1: Resume agent with approval
Content resumeMessage = Content.builder()
    .role("user")
    .parts(List.of(
        Part.builder()
            .functionResponse(
                FunctionResponse.builder()
                    .id(functionCallId)
                    .name("adk_request_confirmation")
                    .response(Map.of("confirmed", true))
                    .build())
            .build()))
    .build();

runner.runAsync(userId, sessionId, resumeMessage, RunConfig.builder().build())
    .blockingForEach(event -> {
        // Agent emits text response event here — gets appended to Vertex AI
    });

// Step 2: Retrieve session — last event is MISSING
Session session = sessionService
    .getSession(appName, userId, sessionId, Optional.empty())
    .blockingGet();
System.out.println("Events from getSession: " + session.events().size());

// Step 3: List events directly — last event IS PRESENT
ListEventsResponse eventsResponse = sessionService
    .listEvents(appName, userId, sessionId)
    .blockingGet();
System.out.println("Events from listEvents: " + eventsResponse.events().size());

// Output:
// Events from getSession: N      (missing last event)
// Events from listEvents: N+1    (all events present)

Suggested Fix:

The filterEvents() method in VertexAiSessionService should not compare client-side event timestamps against server-side updateTime, as these come from different clock sources with unpredictable skew. Possible approaches:

  1. Remove the timestamp-based filter entirely — rely only on numRecentEvents or afterTimestamp from GetSessionConfig when explicitly provided
  2. Use server-side timestamps for filtering — if Vertex AI returns a server-assigned timestamp for each event, use that instead of the client-side Event.timestamp()
  3. Add a tolerance/buffer — e.g. filter events where event.timestamp > updateTime + bufferMs, though this is fragile
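To illustrate approach 3 and why it is fragile, here is a rough sketch of a tolerance-based filter. It operates on raw epoch-millisecond values instead of Event objects to stay self-contained; TolerantEventFilter, filter and the tolerance values are hypothetical and not part of the ADK API.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.List;
import java.util.stream.Collectors;

class TolerantEventFilter {
    /**
     * Keeps event timestamps that are not later than updateTime + tolerance.
     * A null updateTime keeps everything, matching the existing null check.
     */
    static List<Long> filter(List<Long> eventMillis, Instant updateTime, Duration tolerance) {
        if (updateTime == null) {
            return List.copyOf(eventMillis);
        }
        Instant cutoff = updateTime.plus(tolerance);
        return eventMillis.stream()
            .filter(ts -> !Instant.ofEpochMilli(ts).isAfter(cutoff))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Instant updateTime = Instant.parse("2026-04-29T11:00:05.940523Z");
        // One earlier event plus the event observed 163ms after updateTime.
        List<Long> events = List.of(1777460405000L, 1777460406103L);

        System.out.println(filter(events, updateTime, Duration.ZERO).size());          // 1
        System.out.println(filter(events, updateTime, Duration.ofMillis(100)).size()); // 1
        System.out.println(filter(events, updateTime, Duration.ofMillis(500)).size()); // 2
    }
}
```

A 100ms buffer would still lose the observed event, and any fixed value is a guess about the worst-case skew, which is why approaches 1 and 2 look more robust.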

Current problematic code in VertexAiSessionService.filterEvents():

// Compares client-side Instant.now() timestamps against server-side updateTime
// — these are from different clocks with unpredictable skew
events.stream()
    .filter(event ->
        updateTimestamp == null
            || Instant.ofEpochMilli(event.timestamp()).isBefore(updateTimestamp))

Workaround:

We currently work around this by calling listEvents() separately and rebuilding the Session with unfiltered events for read-only endpoints:

Session session = sessionService.getSession(appName, userId, sessionId, Optional.empty()).blockingGet();
// Copy into a mutable list before sorting, in case listEvents() returns an immutable list
List<Event> allEvents = new ArrayList<>(sessionService.listEvents(appName, userId, sessionId).blockingGet().events());
allEvents.sort(Comparator.comparingLong(Event::timestamp));

Session fullSession = Session.builder(session.id())
    .appName(session.appName())
    .userId(session.userId())
    .lastUpdateTime(session.lastUpdateTime())
    .state(session.state())
    .events(allEvents)
    .build();

How often has this issue occurred?:

  • Often (50%+), depending on clock skew between the client JVM and the Vertex AI backend. In our local development environment with HITL flows, we observed the client-side event timestamp being 163ms ahead of the server-side updateTime, causing the last event to be consistently dropped after a tool approval resume. The issue may not reproduce in cloud-hosted environments where client and server clocks are better synchronized, but that is not guaranteed.
