Skip to content

Add temporal-spring-ai module#2829

Draft
donald-pinckney wants to merge 8 commits intomasterfrom
d/20260406-164203
Draft

Add temporal-spring-ai module#2829
donald-pinckney wants to merge 8 commits intomasterfrom
d/20260406-164203

Conversation

@donald-pinckney
Copy link
Copy Markdown
Contributor

@donald-pinckney donald-pinckney commented Apr 6, 2026

Summary

  • Adds temporal-spring-ai module that integrates Spring AI with Temporal workflows
  • AI model calls, vector store operations, embeddings, and MCP tool execution run as durable Temporal activities
  • Three-tier tool system: @DeterministicTool (pure functions), @SideEffectTool (cheap non-determinism via Workflow.sideEffect()), and activity-backed tools (durable I/O)
  • Auto-configuration via SpringAiPlugin that registers activities with all workers
  • Requires Java 17+ and Spring Boot 3.x / Spring AI 1.1.0

Related

Known issues to address

  • UUID.randomUUID() used in workflow context (LocalActivityToolCallbackWrapper)
  • TemporalStubUtil uses fragile string matching on internal handler class names
  • compileOnly deps (VectorStore, EmbeddingModel, MCP) cause ClassNotFoundException at runtime when not present — need to split plugin class or change dep scope
  • No depth limit on recursive tool call loop in ActivityChatModel.call()
  • No streaming support (expected, but should throw UnsupportedOperationException)
  • No tests yet

Test plan

  • Run chat sample end-to-end with Temporal dev server
  • Run RAG sample with vector store
  • Run MCP sample with filesystem MCP server
  • Add unit tests for type conversion (ChatModelTypes ↔ Spring AI types)
  • Add replay test to verify determinism
  • Test with multiple chat models (multi-model sample)

🤖 Generated with Claude Code

Adds a new module that integrates Spring AI with Temporal workflows,
enabling durable AI model calls, vector store operations, embeddings,
and MCP tool execution as Temporal activities.

Key components:
- ActivityChatModel: ChatModel implementation backed by activities
- TemporalChatClient: Temporal-aware ChatClient with tool detection
- SpringAiPlugin: Auto-registers Spring AI activities with workers
- Tool system: @DeterministicTool, @SideEffectTool, activity-backed tools
- MCP integration: ActivityMcpClient for durable MCP tool calls

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor Author

@donald-pinckney donald-pinckney left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Self-review: temporal-spring-ai plugin

What's done well

Determinism architecture is sound. The core design — routing all non-deterministic operations (LLM calls, vector store ops, embeddings, MCP tools) through Temporal activities — is exactly right. The three-tier tool system (@DeterministicTool for pure functions, @SideEffectTool for cheap non-determinism, activity stubs for durable I/O) maps cleanly onto Temporal's primitives.

Tool execution stays in the workflow. ChatModelActivityImpl sets internalToolExecutionEnabled(false) and only passes tool definitions to the model. The actual tool dispatch happens back in the workflow via ToolCallingManager, which means tool calls respect their Temporal wrapping (activity, sideEffect, etc.).

SideEffectToolCallback correctly wraps each call in Workflow.sideEffect(), recording the result in history on first execution and replaying the stored value.

ActivityChatModel.call() handles the tool loop correctly — it recursively calls itself when the model requests tools that don't returnDirect, maintaining proper conversation history.

Issues flagged inline below


@Override
public String call(String toolInput) {
String callbackId = UUID.randomUUID().toString();
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

High severity — non-deterministic call in workflow context.

UUID.randomUUID() is called from the call() method, which runs on the workflow thread. This violates Temporal's replay constraints — on replay a different UUID is generated.

In practice this may work because the local activity replays from markers rather than re-executing, but the intent is unclear and it depends on local activity replay semantics. Use Workflow.randomUUID() instead.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test coverage added (WorkflowDeterminismTest). Ready to fix — next up.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done — replaced with Workflow.randomUUID().

*/
public class LocalActivityToolCallbackWrapper implements ToolCallback {

private static final Map<String, ToolCallback> CALLBACK_REGISTRY = new ConcurrentHashMap<>();
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Medium severity — static registry lifecycle risk.

This static ConcurrentHashMap holds live ToolCallback references. Callbacks are removed in a finally block after the local activity completes, but if the workflow is evicted from the worker cache mid-execution (before finally runs), callbacks leak. Worth documenting or adding periodic cleanup.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done — added javadoc documenting the eviction leak risk and pointing to getRegisteredCallbackCount() for monitoring.

&& Proxy.getInvocationHandler(object)
.getClass()
.getName()
.contains("ActivityInvocationHandler")
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Medium severity — fragile string matching.

Checking getInvocationHandler().getClass().getName().contains("ActivityInvocationHandler") is brittle. If the SDK renames or refactors these internal handler classes, detection silently breaks and tools fall through to the "unknown type" error. Since this plugin now lives in the SDK repo, consider using internal SDK APIs directly, or at minimum add a comment acknowledging the coupling.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test coverage added (TemporalToolUtilTest, 22 tests). Ready to refactor — next up.

.build();
} else {
// Send tool results back to the model (recursive call)
return call(new Prompt(toolExecutionResult.conversationHistory(), prompt.getOptions()));
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Medium severity — unbounded recursive tool loop.

This recursively calls call() when the model keeps requesting tools. A misbehaving model could cause infinite recursion / stack overflow. Consider adding a max iteration count. Spring AI's default ToolCallingManager has one internally, but since the loop is driven manually here, that limit is bypassed.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test coverage added (ChatModelActivityImplTest, 10 tests). Ready to add loop limit — next up.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done — converted recursive call to iterative loop with MAX_TOOL_CALL_ITERATIONS = 10 limit.

Worker worker, List<?> clients, List<String> registeredActivities) {
try {
// Use reflection to avoid loading MCP classes when not on classpath
Class<?> mcpActivityClass = Class.forName("io.temporal.springai.mcp.McpClientActivityImpl");
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Low severity — unnecessary reflection.

Class.forName("io.temporal.springai.mcp.McpClientActivityImpl") uses reflection to avoid a compile-time dependency, but McpClientActivityImpl is in the same module. This reflection is avoiding a dependency that already exists. If MCP is truly optional, consider moving it to a separate module. If not, just use direct references.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test coverage added (SpringAiPluginTest, 10 tests). Ready for the conditional auto-config split (T6) — next up.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done — resolved by T6. McpPlugin directly references McpClientActivityImpl, no more reflection.

compileOnly project(':temporal-spring-boot-autoconfigure')

api 'org.springframework.boot:spring-boot-autoconfigure'
api 'org.springframework.ai:spring-ai-client-chat'
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

High severity — compileOnly deps needed at runtime.

VectorStore, EmbeddingModel, and MCP are compileOnly here, but SpringAiPlugin directly references these classes in field declarations and method signatures. When they're not on the runtime classpath, Spring fails to introspect the class at all (ClassNotFoundException: org.springframework.ai.vectorstore.VectorStore).

Either:

  1. Change to implementation (simplest, but makes them transitive)
  2. Split SpringAiPlugin so VectorStore/Embedding/MCP handling is in separate classes loaded conditionally
  3. Use runtimeOnly in consumer builds (what we had to do in the samples)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO: go with option 2 (split into conditional auto-configuration classes).

Confirmed that Spring AI itself does NOT transitively pull in RAG, vector store, embeddings, or MCP — those are separate opt-in starters. We shouldn't be heavier than the framework we're integrating with.

Plan: keep SpringAiPlugin with only ChatModel (required). Add separate @ConditionalOnClass-guarded config classes for VectorStore, EmbeddingModel, and MCP that register their own activities. The compileOnly scope then stays correct — Spring skips the conditional classes entirely when the dep isn't present.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done — split into 4 auto-config classes: SpringAiTemporalAutoConfiguration (core), SpringAiVectorStoreAutoConfiguration, SpringAiEmbeddingAutoConfiguration, SpringAiMcpAutoConfiguration. Each guarded by @ConditionalOnClass. compileOnly scopes now work correctly.

* @see #forDefault()
* @see #forModel(String)
*/
public class ActivityChatModel implements ChatModel {
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Low severity — no streaming support.

ActivityChatModel only implements synchronous call(). Streaming through activities doesn't make sense, but the class should override stream() to throw a clear UnsupportedOperationException rather than silently inheriting a default that may behave unexpectedly.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done — stream(Prompt) now throws UnsupportedOperationException with a clear message explaining that Temporal activities are request/response based.

donald-pinckney and others added 3 commits April 6, 2026 17:09
T9: Add javadoc to LocalActivityToolCallbackWrapper explaining the leak
risk when workflows are evicted from worker cache mid-execution.

T11: Override stream() in ActivityChatModel to throw
UnsupportedOperationException with a clear message, since streaming
through Temporal activities is not supported.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
T1: ChatModelActivityImplTest (10 tests) - type conversion between
    ChatModelTypes and Spring AI types, multi-model resolution,
    tool definition passthrough, model options mapping.

T2: TemporalToolUtilTest (22 tests) - tool detection and conversion
    for @DeterministicTool, @SideEffectTool, stub type detection,
    error cases for unknown/null types.

T3: WorkflowDeterminismTest (2 tests) - verifies workflows using
    ActivityChatModel with tools complete without non-determinism
    errors in the Temporal test environment.

T4: SpringAiPluginTest (10 tests) - plugin registration with various
    bean combinations, multi-model support, default model resolution.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor Author

@donald-pinckney donald-pinckney left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found an additional bug during testing.

metadata = ChatResponseMetadata.builder().model(output.metadata().model()).build();
}

return ChatResponse.builder().generations(generations).metadata(metadata).build();
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: NPE when metadata is null.

When output.metadata() is null, metadata stays null and .metadata(null) is passed to ChatResponse.builder(). Spring AI's builder throws an NPE on null metadata.

Fix: only call .metadata() on the builder when metadata is non-null, or pass an empty ChatResponseMetadata.builder().build().

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done — builder now only calls .metadata() when metadata is non-null.

donald-pinckney and others added 4 commits April 6, 2026 17:27
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
T5: Replace UUID.randomUUID() with Workflow.randomUUID() in
LocalActivityToolCallbackWrapper to ensure deterministic replay.

T7: Convert recursive tool call loop in ActivityChatModel.call() to
iterative loop with MAX_TOOL_CALL_ITERATIONS (10) limit to prevent
infinite recursion from misbehaving models.

T14: Fix NPE when ChatResponse metadata is null by only calling
.metadata() on the builder when metadata is non-null.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Split the monolithic SpringAiPlugin into one core plugin + three
optional plugins, each with its own @ConditionalOnClass-guarded
auto-configuration:

- SpringAiPlugin: core chat + ExecuteToolLocalActivity (always)
- VectorStorePlugin: VectorStore activity (when spring-ai-rag present)
- EmbeddingModelPlugin: EmbeddingModel activity (when spring-ai-rag present)
- McpPlugin: MCP activity (when spring-ai-mcp present)

This fixes ClassNotFoundException when optional deps aren't on the
runtime classpath. compileOnly scopes now work correctly because
Spring skips loading the conditional classes entirely when the
@ConditionalOnClass check fails.

Also resolves T10 (unnecessary MCP reflection) — McpPlugin directly
references McpClientActivityImpl instead of using Class.forName().

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant