[refactor] Semantic Function Clustering: Refactoring Opportunities in Go Source Files #3211
Description
Overview
Automated semantic function clustering analysis of 97 non-test Go files across 24 packages (683 function definitions). Since the previous analysis (issue closed today), two items have been resolved: generateRandomAPIKey was correctly moved to internal/auth/apikey.go as the exported GenerateRandomAPIKey(), and CLI flags were extracted from root.go into five dedicated flags_*.go files. Several new packages were added (strutil, syncutil, httputil, oidc, tracing) with appropriate organization. Core oversized-file and misplaced-function findings remain open.
Progress Since Last Analysis
| Item | Status |
|---|---|
| Move generateRandomAPIKey to internal/auth/ | ✅ Done (now auth.GenerateRandomAPIKey() in auth/apikey.go) |
| Extract CLI flags from cmd/root.go | ✅ Done (split into flags_core.go, flags_difc.go, flags_launch.go, flags_logging.go, flags_tracing.go) |
| New utility packages added (strutil, syncutil, httputil, oidc, tracing) | ✅ Done (well-organized) |
| Move ExpandEnvArgs out of config/docker_helpers.go | ⬜ Open |
| Extract generic JSON helpers from server/difc_log.go | ⬜ Open |
| Move non-lifecycle helpers out of cmd/root.go | ⬜ Open |
| Split oversized files (9 critical, 7 moderate) | ⬜ Open |
1. Misplaced Functions
1a. ExpandEnvArgs in config/docker_helpers.go
internal/config/docker_helpers.go:135 exports ExpandEnvArgs — a function that expands environment variables in command argument slices. Its only non-test caller is internal/mcp/connection.go:107:

    // internal/mcp/connection.go
    expandedArgs := config.ExpandEnvArgs(args)

This creates an mcp → config dependency for a non-config concern. All other functions in docker_helpers.go are Docker-specific validators (checkDockerAccessible, validateContainerID, runDockerInspect, checkPortMapping, checkStdinInteractive, checkLogDirMounted). ExpandEnvArgs is a generic env-util, not Docker-specific and not config-specific.
Recommendation: Move ExpandEnvArgs to internal/envutil/ (already exists for env-var utilities) so that internal/mcp no longer needs to import config for this call.
Effort: 30–60 min
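As a rough illustration of what the relocated helper could look like, here is a minimal sketch. It assumes ExpandEnvArgs takes and returns a []string and performs plain $VAR / ${VAR} substitution; the real expansion rules are not shown in this report and may differ. The expandArgsWith helper is hypothetical and exists only to make the lookup injectable for tests.

```go
package main

import (
	"fmt"
	"os"
)

// expandArgsWith returns a copy of args in which $VAR and ${VAR} references
// are replaced by lookup(VAR). The input slice is left untouched.
// Hypothetical helper: not part of the existing codebase.
func expandArgsWith(args []string, lookup func(string) string) []string {
	expanded := make([]string, len(args))
	for i, arg := range args {
		expanded[i] = os.Expand(arg, lookup)
	}
	return expanded
}

// ExpandEnvArgs is the assumed shape of the helper once it lives in
// internal/envutil: same name and signature, backed by the process environment.
func ExpandEnvArgs(args []string) []string {
	return expandArgsWith(args, os.Getenv)
}

func main() {
	// DEMO_TOKEN is an illustrative variable name only.
	os.Setenv("DEMO_TOKEN", "abc123")
	fmt.Println(ExpandEnvArgs([]string{"run", "-e", "TOKEN=${DEMO_TOKEN}"}))
}
```

With the function in envutil, the call site in internal/mcp/connection.go would presumably become envutil.ExpandEnvArgs(args), and the existing tests would move with the function.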
1b. resolveGuardPolicyOverride, writeGatewayConfig, loadEnvFile in cmd/root.go
internal/cmd/root.go (669 lines) still contains three groups of substantial helper functions (four functions in total) unrelated to CLI lifecycle wiring:
| Function | Lines | Nature |
|---|---|---|
| resolveGuardPolicyOverride | ~50 | Guard policy resolution logic |
| writeGatewayConfig / writeGatewayConfigToStdout | ~50 | Config serialization |
| loadEnvFile | ~40 | .env file parsing |
These sit alongside the core CLI init(), preRun(), run(), postRun(), Execute(), SetVersion() functions. While flags were correctly extracted, these helpers remain.
Recommendation: Move resolveGuardPolicyOverride, writeGatewayConfig, writeGatewayConfigToStdout, and loadEnvFile to a new internal/cmd/run_helpers.go, keeping root.go focused on CLI lifecycle only.
Effort: < 1 h
2. Scattered Generic Helpers
2a. Generic JSON map helpers in server/difc_log.go
internal/server/difc_log.go:64–103 contains three domain-agnostic JSON traversal helpers:

    func getStringField(m map[string]interface{}, fields ...string) string { ... }
    func extractAuthorLogin(m map[string]interface{}) string { ... }
    func extractNumberField(m map[string]interface{}) string { ... }

These are general-purpose helpers, not specific to DIFC logging. internal/proxy/handler.go independently operates on map[string]interface{} values, confirming the risk of further reinvention. The new internal/httputil and internal/strutil packages demonstrate the project's growing pattern of extracting reusable utilities.
Recommendation (minimal): Add a comment marking them as local helpers to deter copy-paste.
Recommendation (better): Extract to a new internal/maputil/ package so any package can import them.
Effort: 1–2 h (if extracting to maputil)
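For the "better" option, an exported helper in the proposed internal/maputil/ package might look like the sketch below. Only the signature comes from the existing getStringField; the body is an assumption (first non-empty string value among the given keys wins), since the current implementation is elided in this report.

```go
package main

import "fmt"

// GetStringField returns the first non-empty string stored under any of the
// given keys, or "" when no key holds one. Non-string values are skipped.
// Assumed semantics: the real getStringField body is not shown in the report.
func GetStringField(m map[string]interface{}, fields ...string) string {
	for _, field := range fields {
		if s, ok := m[field].(string); ok && s != "" {
			return s
		}
	}
	return ""
}

func main() {
	// Illustrative payload; the keys are made up for this example.
	item := map[string]interface{}{"title": "", "name": "fix-bug", "number": 42}
	fmt.Println(GetStringField(item, "title", "name"))
}
```

Accepting several candidate keys keeps call sites short when different payloads name the same concept differently, which is presumably why the existing helper is variadic.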
3. Oversized Files — Candidates for Decomposition
Files combining multiple distinct responsibilities. All suggested splits stay within the same package; no exported API changes since all are under internal/.
3a. internal/guard/wasm.go — 1,168 lines ⚠️ CRITICAL
Five distinct responsibilities in one file:
| Responsibility | Key Functions | Suggested File |
|---|---|---|
| Guard lifecycle (constructor, Close) | NewWasmGuard*, Close | keep in wasm.go |
| WASM runtime / memory management | callWasmFunction, tryCallWasmFunction, wasmAlloc, wasmDealloc, isWasmTrap | wasm_runtime.go |
| Host function bindings | instantiateHostFunctions, hostCallBackend, hostLog | wasm_host.go |
| Payload building | BuildLabelAgentPayload, buildStrictLabelAgentPayload, normalizePolicyPayload, isValidAllowOnlyRepos | wasm_payload.go |
| Response parsing | parseLabelAgentResponse, parseResourceResponse, parseCollectionLabeledData, parsePathLabeledResponse, checkBoolFailure | wasm_parse.go |
Recommendation: Split into wasm.go + wasm_runtime.go + wasm_host.go + wasm_payload.go + wasm_parse.go.
Effort: 3–4 h
3b. internal/config/guard_policy.go — 721 lines
Four distinct responsibilities:
| Responsibility | Key Functions | Suggested File |
|---|---|---|
| Core types + (un)marshaling | UnmarshalJSON ×2, MarshalJSON ×2, IsWriteSinkPolicy | keep in guard_policy.go |
| Validation | ValidateGuardPolicy, ValidateWriteSinkPolicy, validateAcceptEntry, isValidRepo*, validateGuardPolicies | guard_policy_validate.go |
| Parsing | ParseGuardPolicyJSON, ParsePolicyMap, ParseServerGuardPolicy, BuildAllowOnlyPolicy | guard_policy_parse.go |
| Normalization | NormalizeGuardPolicy, normalizeAndValidateScopeArray, NormalizeScopeKind | guard_policy_normalize.go |
Effort: 2–3 h
3c. internal/server/unified.go — 758 lines (grew from 714)
Combines core server setup, tool execution, DIFC integration, backend calling, lifecycle/shutdown, and enrichment token lookup. Notably this file has grown since the last analysis cycle.
| Responsibility | Key Functions | Suggested File |
|---|---|---|
| Core server + lifecycle | NewUnified, Run, Close, IsShutdown, InitiateShutdown, ShouldExit, SetHTTPShutdown, GetHTTPShutdown | keep in unified.go |
| Tool execution + backend calling | callBackendTool, executeBackendToolCall, newErrorCallToolResult, guardBackendCaller | unified_tools.go |
| Enrichment / env lookups | lookupEnrichmentToken, lookupGitHubAPIBaseURL | unified_env.go |
| Status / introspection | GetServerIDs, GetServerStatus, GetToolsForBackend, GetToolHandler, GetPayloadSizeThreshold, IsDIFCEnabled, RegisterTestTool, SetTestMode | unified_status.go |
Effort: 3–4 h
3d. internal/mcp/connection.go — 680 lines
Mixes connection construction, reconnection, session management, and MCP method wrappers:
| Responsibility | Suggested File |
|---|---|
| Core connection lifecycle | keep in connection.go |
| Send / reconnect logic | connection_send.go |
| MCP method wrappers (listTools, callTool, listResources, getPrompt, etc.) | connection_methods.go |
Effort: 2–3 h
3e. internal/mcp/http_transport.go — 643 lines
Mixes client construction, transport probing (streamable/SSE/plain-JSON), and request/response handling:
| Responsibility | Suggested File |
|---|---|
| Client construction + RoundTrip | keep in http_transport.go |
| Transport probing (trySDKTransport, tryStreamableHTTPTransport, trySSETransport, tryPlainJSONTransport) | http_transport_probe.go |
| Request/response execution | http_transport_request.go |
Effort: 2–3 h
3f. internal/cmd/root.go — 669 lines (grew from 614 despite flag extraction)
Despite extracting flags to dedicated files, root.go has grown. Helper functions remain embedded:
| Responsibility | Key Functions | Suggested File |
|---|---|---|
| Core CLI lifecycle | init, preRun, run, postRun, Execute, SetVersion | keep in root.go |
| Helper utilities | resolveGuardPolicyOverride, writeGatewayConfigToStdout, writeGatewayConfig, loadEnvFile | cmd/run_helpers.go |
Effort: 1–2 h
3g. internal/config/validation.go — 604 lines (grew from 465)
Mixes variable expansion, mount validation, server config validation, auth validation, gateway validation, and trusted bots validation:
| Responsibility | Suggested File |
|---|---|
| Variable expansion + core dispatch | keep in validation.go |
| Server mount validation | validation_mounts.go |
| Auth + gateway + trusted-bots validation | validation_auth.go |
Effort: 2–3 h
3h. internal/config/validation_schema.go — 550 lines
Mixes HTTP schema fetching, JSON-Schema compilation/validation, and multi-level error formatting:
| Responsibility | Suggested File |
|---|---|
| Schema fetching + HTTP retry (isTransientHTTPError, fetchAndFixSchema, fixSchemaBytes) | validation_schema_fetch.go |
| Schema compilation + validation (getOrCompileSchema, validateJSONSchema) | keep in validation_schema.go |
| Error formatting (formatSchemaError, formatValidationErrorRecursive, formatErrorContext) | validation_schema_format.go |
Effort: 1–2 h
3i. internal/config/config_stdin.go — 536 lines (grew from 517)
Mixes JSON parsing, type conversion/normalization, field stripping, and variable expansion:
| Responsibility | Suggested File |
|---|---|
| JSON parsing + top-level loading (LoadFromStdin, UnmarshalJSON, stripExtensionFieldsForValidation) | keep in config_stdin.go |
| Type conversion + normalization (convertStdinConfig, convertStdinServerConfig, buildStdioServerConfig, normalizeLocalType) | config_stdin_convert.go |
Effort: 1–2 h
3j. Other large files (moderate priority)
| File | Lines | Suggested Action |
|---|---|---|
| internal/proxy/handler.go | 562 | Split HTTP handler from response restructuring helpers (rewrapSearchResponse, rebuildGraphQLResponse, replaceNodesArray, deepCloneJSON) |
| internal/difc/evaluator.go | 449 | Split flow evaluation from label propagation logic |
| internal/proxy/router.go | 444 | The 444 lines consist mostly of route table data (var routes = []route{...}); only 3 actual functions. Consider separating route data from logic. |
| internal/middleware/jqschema.go | 456 | Split schema transform from file I/O for payloads |
| internal/server/guard_init.go | 408 | Split guard registration / policy resolution / WASM discovery |
| internal/server/tool_registry.go | 406 | Split tool registration from parallel/sequential launch strategies |
| internal/proxy/proxy.go | 419 | Split proxy core from middleware/transport configuration |
4. Intentional Patterns (No Action Needed)
- withLock on each logger type: identical body per type; correct because each is on a different receiver.
- setup*Logger / handle*LoggerError: different fallback strategies per logger type; differentiation is intentional.
- Log{Info,Warn,Error,Debug}[WithServer] families: three public APIs with distinct signatures; one-liner wrappers are idiomatic Go.
- Session ID extraction split (extractAndValidateSession vs SessionIDFromContext): different extraction points (header vs. context); not a duplicate.
- paginateAll[T](): generic pagination helper, correctly placed in connection.go.
- randomSerial() vs GenerateRandomAPIKey() vs generateRandomSpanID(): similar crypto-random patterns but serving distinct purposes (TLS cert serial, API key generation, tracing span ID); not true duplicates.
- getDefault* functions in flags_*.go: 13 functions following the same pattern are intentional: each reads a specific environment variable with a type-appropriate fallback.
- resolve* functions in tracing/provider.go: 5 functions resolving different fields of TracingConfig; each handles distinct types and fallback logic.
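To make the getDefault* entry above concrete, the following sketch shows the general shape of "read one environment variable, fall back to a typed default". Every name here (parseIntOr, parseBoolOr, getDefaultPort, getDefaultVerbose, SKETCH_PORT, SKETCH_VERBOSE) is hypothetical; the actual functions in flags_*.go are not reproduced in this report.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// parseIntOr converts raw to an int, returning fallback when raw is empty
// or not a valid integer.
func parseIntOr(raw string, fallback int) int {
	if n, err := strconv.Atoi(raw); err == nil {
		return n
	}
	return fallback
}

// parseBoolOr converts raw to a bool, returning fallback when raw is empty
// or not one of the forms strconv.ParseBool accepts.
func parseBoolOr(raw string, fallback bool) bool {
	if b, err := strconv.ParseBool(raw); err == nil {
		return b
	}
	return fallback
}

// Each getDefault* function is tied to one variable, one type, one default.
func getDefaultPort() int     { return parseIntOr(os.Getenv("SKETCH_PORT"), 8080) }
func getDefaultVerbose() bool { return parseBoolOr(os.Getenv("SKETCH_VERBOSE"), false) }

func main() {
	fmt.Println(getDefaultPort(), getDefaultVerbose())
}
```

The per-flag functions look repetitive, but each one pins a distinct variable name, type, and default, so collapsing them into a single generic helper would mostly move that information into call-site arguments rather than remove it.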
Implementation Checklist
Quick wins (< 1 hour each)
- Move config/docker_helpers.go:ExpandEnvArgs to internal/envutil/
- Move cmd/root.go helpers (resolveGuardPolicyOverride, writeGatewayConfig, loadEnvFile) to internal/cmd/run_helpers.go
Medium effort — split large files (no API breakage, all internal)
- Split guard/wasm.go → wasm.go + wasm_runtime.go + wasm_host.go + wasm_payload.go + wasm_parse.go
- Split config/guard_policy.go → core + _validate.go + _parse.go + _normalize.go
- Split server/unified.go → core + unified_tools.go + unified_env.go + unified_status.go
- Split mcp/connection.go → core + connection_send.go + connection_methods.go
- Split mcp/http_transport.go → core + http_transport_probe.go + http_transport_request.go
- Split config/validation_schema.go → core + _fetch.go + _format.go
- Split config/config_stdin.go → parser + _convert.go
- Split config/validation.go → core + validation_mounts.go + validation_auth.go
Optional / longer term
- Extract getStringField / extractAuthorLogin / extractNumberField from server/difc_log.go to internal/maputil/ to prevent pattern reinvention
- Review and split remaining >400-line files in proxy/, middleware/, difc/, server/
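For the last item, a small standalone scanner can re-check which non-test Go files still exceed a line threshold after each split. This is a sketch, not the tool that produced this report: the names (countLines, isCandidate, oversizedGoFiles) are hypothetical, and its line counts may differ slightly from the figures above depending on how trailing newlines are treated.

```go
package main

import (
	"bytes"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"sort"
	"strings"
)

// fileSize pairs a file path with its line count.
type fileSize struct {
	Path  string
	Lines int
}

// countLines counts newline-terminated lines, plus a final unterminated line.
func countLines(data []byte) int {
	n := bytes.Count(data, []byte{'\n'})
	if len(data) > 0 && data[len(data)-1] != '\n' {
		n++
	}
	return n
}

// isCandidate reports whether path is a Go source file that is not a test.
func isCandidate(path string) bool {
	return strings.HasSuffix(path, ".go") && !strings.HasSuffix(path, "_test.go")
}

// oversizedGoFiles walks root and returns candidate files with more than
// threshold lines, largest first.
func oversizedGoFiles(root string, threshold int) ([]fileSize, error) {
	var out []fileSize
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, walkErr error) error {
		if walkErr != nil {
			return walkErr
		}
		if d.IsDir() || !isCandidate(path) {
			return nil
		}
		data, err := os.ReadFile(path)
		if err != nil {
			return err
		}
		if n := countLines(data); n > threshold {
			out = append(out, fileSize{Path: path, Lines: n})
		}
		return nil
	})
	sort.Slice(out, func(i, j int) bool { return out[i].Lines > out[j].Lines })
	return out, err
}

func main() {
	root := "." // pass a directory such as "internal" as the first argument
	if len(os.Args) > 1 {
		root = os.Args[1]
	}
	files, err := oversizedGoFiles(root, 400)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	for _, f := range files {
		fmt.Printf("%5d  %s\n", f.Lines, f.Path)
	}
}
```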
Analysis Metadata
| Metric | Value |
|---|---|
| Go files analyzed (non-test) | 97 |
| Function definitions cataloged | 683 |
| Packages covered | 24 |
| Resolved since last analysis | 2 (generateRandomAPIKey → auth.GenerateRandomAPIKey(); flags extracted from root.go) |
| New well-organized packages added | 5 (strutil, syncutil, httputil, oidc, tracing) |
| Misplaced functions | 1 (ExpandEnvArgs) + 3 helpers embedded in root.go |
| Scattered helpers | 3 (getStringField, extractAuthorLogin, extractNumberField) |
| Files recommended for split (critical) | 9 |
| Files recommended for split (moderate) | 7 |
| Analysis date | 2026-04-05 |
References: §23999562027
Generated by Semantic Function Refactoring