Skip to content

Add QuerySpaceNormalizingMiddleware server middleware to handle clients using plus-encoding#194

Open
rbranche wants to merge 1 commit into
apple:mainfrom
rbranche:eng/PR-176480710
Open

Add QuerySpaceNormalizingMiddleware server middleware to handle clients using plus-encoding#194
rbranche wants to merge 1 commit into
apple:mainfrom
rbranche:eng/PR-176480710

Conversation

@rbranche
Copy link
Copy Markdown

@rbranche rbranche commented May 11, 2026

Motivation

The OpenAPI specification follows RFC 3986, which requires spaces in query strings to be percent-encoded as %20. However, many widely-used HTTP client libraries — including Python requests, Java URLEncoder, Scala sttp, and JavaScript URLSearchParams — encode query parameter spaces as +, following the application/x-www-form-urlencoded convention. Without any handling for this, query parameters sent from such clients arrive at generated server handlers with literal + characters instead of spaces.

Per maintainer feedback, rather than loosening the library's default decoder (which would impose the cost on every user), this behavior should be provided as opt-in middleware — following the same pattern as ErrorHandlingMiddleware. Servers that only need to cater to spec-compliant clients don't pay the cost, and servers that want lenience can configure it in a single line.

Modifications

  • Added QuerySpaceNormalizingMiddleware, an opt-in ServerMiddleware that rewrites unencoded + characters in the request's query string to %20 before the request reaches the decoder. Only the query portion of the request path is modified; the path, fragment, and request body are untouched. Literal + characters that were correctly percent-encoded as %2B by the client are unaffected.
  • Added an ### Additional middlewares topic in the DocC documentation that lists ErrorHandlingMiddleware and the new QuerySpaceNormalizingMiddleware, so built-in middlewares are discoverable from the top-level docs.
  • Added HTTPResponseConvertible into the ### Errors topic (it's a protocol used by ErrorHandlingMiddleware).

Result

Server authors who need to accept +-encoded spaces can opt in with a single line:

try handler.registerHandlers(on: transport, middlewares: [QuerySpaceNormalizingMiddleware()])

Servers that don't register the middleware continue to behave exactly as before (spec-compliant %20 only).

Test Plan

  • 10 new unit tests in Test_QuerySpaceNormalizingMiddleware cover: +-in-query → %20, multiple +s, pre-existing %20 untouched, %2B literal-plus untouched, + in path not replaced, + in fragment not replaced, path without a query, query without +, empty query, and response pass-through.
  • Planning to also exercise this in our service that depends on this library.

@rbranche
Copy link
Copy Markdown
Author

Discussed offline, and the team would like this to be a shared Middleware instead of built into the runtime itself. Working on that now.

@rbranche rbranche force-pushed the eng/PR-176480710 branch 2 times, most recently from 4892c75 to c5f1d13 Compare May 13, 2026 19:49
@rbranche rbranche changed the title Properly support both types ("+" / "%20") of URI encodings for " " (space) Support both types ("+" / "%20") of URI encodings for " " (space) in the library May 13, 2026
@simonjbeaumont
Copy link
Copy Markdown
Collaborator

Discussed offline, and the team would like this to be a shared Middleware instead of built into the runtime itself. Working on that now.

👍 Thanks @rbranche!

For others following this thread, this approach is preferred because we don't typically add workarounds for things that aren't expressly part of the spec on the default codepath.

This includes codegen preferences for off-piste doc idioms, and external clients and servers that send requests and responses that do not conform to the spec. We have the middleware as an escape hatch for these sorts of compatibility issues, which presents an opt-in approach for only those that need it.

For middleware that are generalisable enough to have wide appeal, we've also been happy to add these to the runtime package itself (e.g. the ErrorHandlingMiddleware). This lenient encoding middleware probably meets the bar for that.

Copy link
Copy Markdown
Collaborator

@simonjbeaumont simonjbeaumont left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the rework. This patch is looking in really good shape. A couple of minor notes.

Comment on lines +59 to +68
if let path = request.path, let queryStart = path.firstIndex(of: "?") {
let queryEnd = path.firstIndex(of: "#") ?? path.endIndex
let query = path[path.index(after: queryStart)..<queryEnd]
if query.contains("+") {
let normalizedQuery = query.replacingOccurrences(of: "+", with: "%20")
let prefix = path[..<path.index(after: queryStart)]
let suffix = path[queryEnd...]
request.path = String(prefix) + normalizedQuery + String(suffix)
}
}
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If a malicious client sends a malformed request with the ? inside the fragment, e.g. /a#b?c+d, I think this would crash in the range operator.

Can we add a test to confirm?

We'll probably need to first find the start of the fragment, if any, to determine where the query starts and ends. Something like:

        if let path = request.path {
            let fragmentStart = path.firstIndex(of: "#") ?? path.endIndex
            let pathAndQuery = path[..<fragmentStart]
            if let queryStart = pathAndQuery.firstIndex(of: "?") {
                let query = path[path.index(after: queryStart)..<fragmentStart]
                if query.contains("+") {
                    // ... the stuff you did here
                }
            }
        }

Comment on lines +42 to +43
/// - Note: Only the query portion of the request path is modified. The path
/// itself and the request body are not touched.
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit.

Suggested change
/// - Note: Only the query portion of the request path is modified. The path
/// itself and the request body are not touched.
/// - Note: Only the query of the request is modified. The path, fragment, and
/// body are not touched.

let normalizedQuery = query.replacingOccurrences(of: "+", with: "%20")
let prefix = path[..<path.index(after: queryStart)]
let suffix = path[queryEnd...]
request.path = String(prefix) + normalizedQuery + String(suffix)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given this will be on a pretty hot path, we should probably reduce the allocations made by the creation of the intermediate strings by using reserveCapacity and append when building up the final string.

If we really wanted to eek out performance from this, we'd probably also want to drop down to String.UTF8View for the processing but, given this is opt-in, I don't mind us deferring that.

@simonjbeaumont simonjbeaumont added the 🆕 semver/minor Adds new public API. label May 14, 2026
@rbranche rbranche force-pushed the eng/PR-176480710 branch 2 times, most recently from aefce2c to bd2a2a6 Compare May 14, 2026 18:32
Add an opt-in ServerMiddleware that rewrites unencoded "+"
characters in the request's query string to "%20" before the
request reaches the decoder. This lets servers interoperate
with clients that encode query parameter spaces using the
application/x-www-form-urlencoded convention (as used by
Python requests, Java URLEncoder, Scala sttp, and JavaScript
URLSearchParams) without changing the default RFC 3986
behavior of the generated server code.

Only the query of the request is modified; the path, fragment,
and body are untouched. Literal "+" characters that were
correctly percent-encoded as "%2B" by the client are unaffected.
A malformed input where "?" appears inside the fragment (e.g.
"/a#b?c+d") is left untouched rather than crashing.

The rewrite builds the new path with reserveCapacity and
character-by-character append to avoid the intermediate string
allocations of replacingOccurrences and concatenation.

Also add an "Additional middlewares" section to the DocC
documentation that lists ErrorHandlingMiddleware alongside the
new middleware, so built-in middlewares are discoverable from
the top-level docs. Move HTTPResponseConvertible into the
existing "Errors" section, where it belongs.
@rbranche rbranche force-pushed the eng/PR-176480710 branch from bd2a2a6 to 510dba9 Compare May 14, 2026 19:24
Copy link
Copy Markdown
Collaborator

@simonjbeaumont simonjbeaumont left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great addition—thank you for the contribution! 🙏

@simonjbeaumont simonjbeaumont changed the title Support both types ("+" / "%20") of URI encodings for " " (space) in the library Add QuerySpaceNormalizingMiddleware server middleware to handle clients using plus-encoding May 15, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

🆕 semver/minor Adds new public API.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants