fix(deploy): remove API_NUM_WORKERS footgun, scale via Ray Serve by Ahmath-Gadji · Pull Request #501 · linagora/openrag

Ahmath-Gadji · 2026-06-17T11:48:00Z

What & why

API_NUM_WORKERS is a footgun. In the uvicorn deployment path it fed uvicorn --workers N, but the app calls ray.init() at import time (openrag/api.py), so every uvicorn worker is a separate process that starts its own isolated Ray cluster with duplicate named actors (Indexer, Vectordb, TaskStateManager, the loader pools). Anything > 1 silently breaks the shared-actor architecture: task state fragments across clusters, multiple Indexer/Vectordb actors contend on the same Milvus collection / Postgres DB, and resources multiply N×.
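The failure mode can be illustrated with a toy model that does not use Ray at all. `ToyCluster`, `start_worker`, and `simulate` are hypothetical stand-ins, not OpenRAG or Ray APIs: each simulated worker builds its own registry of named actors, the way each uvicorn worker would run the import-time `ray.init()` independently.

```python
ACTOR_NAMES = ("Indexer", "Vectordb", "TaskStateManager")


class ToyCluster:
    """Hypothetical stand-in for one isolated Ray cluster."""

    def __init__(self):
        self.actors = {}  # actor name -> state local to this cluster

    def get_or_create(self, name):
        # A named actor is unique only within its own cluster.
        return self.actors.setdefault(name, {"tasks": {}})


def start_worker():
    """Model one uvicorn worker importing the app: a fresh cluster each time."""
    cluster = ToyCluster()
    for name in ACTOR_NAMES:
        cluster.get_or_create(name)
    return cluster


def simulate(num_workers):
    """Return the clusters created by `num_workers` independent workers."""
    return [start_worker() for _ in range(num_workers)]


if __name__ == "__main__":
    clusters = simulate(3)
    # A task recorded through worker 0 is invisible to workers 1 and 2.
    clusters[0].get_or_create("TaskStateManager")["tasks"]["job-1"] = "running"
    for index, cluster in enumerate(clusters):
        print(index, cluster.actors["TaskStateManager"]["tasks"])
```

In this model, three workers yield three independent `TaskStateManager` instances, so a request routed to a different worker than the one that accepted the task sees no record of it.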

Why it surfaced now

It was a no-op until v1.1.12. The entrypoint used to always pass --reload, and uvicorn dispatches should_reload before workers > 1, so --reload forced a single worker regardless of the value. Gating --reload behind UVICORN_RELOAD=true (#478, "N8") unmasked the setting — now all N workers actually start.
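The masking effect follows from the order of those two checks. The sketch below is a simplified model of that dispatch, not uvicorn's actual source; `effective_worker_count` is a hypothetical helper name used only for illustration.

```python
def effective_worker_count(reload: bool, workers: int) -> int:
    """Simplified model: the reload check is evaluated before the workers check."""
    if reload:
        # Reload mode serves from a single process; `workers` is never consulted.
        return 1
    if workers > 1:
        return workers
    return 1


if __name__ == "__main__":
    print(effective_worker_count(reload=True, workers=8))
    print(effective_worker_count(reload=False, workers=8))
```

With `--reload` always passed, the first branch was always taken, so the value of API_NUM_WORKERS could not matter. Once `--reload` became optional, the second branch became reachable.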

Full write-up in #500.

Changes

- entrypoint.sh: the uvicorn path always runs a single worker (--workers 1). If API_NUM_WORKERS is set to a non-1 value, it logs a warning pointing operators to Ray Serve instead of silently misbehaving.
- charts/openrag-stack/values.yaml: removed the dead API_NUM_WORKERS: "8". The chart sets ENABLE_RAY_SERVE: "true", which takes the api.py branch in the entrypoint and never reads API_NUM_WORKERS, so the value was misleading, not live.
- .env.example / docs/.../env_vars.md: dropped the knob and documented the correct scaling path.
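The single-worker rule from the entrypoint.sh change above can be modelled as a small decision function. This is a Python sketch for reasoning and testing only: the real logic lives in the shell entrypoint, `plan_uvicorn_args` is a hypothetical name, and the warning wording is illustrative rather than the message the script actually prints.

```python
import os
import sys


def plan_uvicorn_args(env):
    """Return (uvicorn_args, warning) for the non-Ray-Serve startup path.

    Hypothetical helper modelling the behaviour described in this PR.
    """
    warning = None
    requested = env.get("API_NUM_WORKERS")
    if requested is not None and requested != "1":
        # Illustrative wording; the actual message in entrypoint.sh may differ.
        warning = (
            f"API_NUM_WORKERS={requested} is ignored; "
            "scale with ENABLE_RAY_SERVE=true and RAY_SERVE_NUM_REPLICAS"
        )
    # The worker count is fixed regardless of what the operator requested.
    return ["--workers", "1"], warning


if __name__ == "__main__":
    args, warning = plan_uvicorn_args(dict(os.environ))
    if warning:
        print(f"WARNING: {warning}", file=sys.stderr)
    print(" ".join(args))
```

Returning the warning instead of printing it keeps the decision easy to assert on; the caller chooses where to emit it (stderr in the PR's case).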

The correct way to scale

Scale the HTTP layer with Ray Serve — N replicas inside one shared Ray cluster, so the named actors stay singletons:

ENABLE_RAY_SERVE=true
RAY_SERVE_NUM_REPLICAS=4

This is already what the Helm chart does. For multi-node, see the Ray cluster deployment guide.

Notes

- No behavior change for existing single-worker deployments (the common case). Deployments that set API_NUM_WORKERS > 1 will now correctly run one worker and print a warning.
- No migration needed.

Closes #500

Summary by CodeRabbit

Documentation
- Clarified that HTTP scaling is controlled via Ray Serve replicas (with sample Ray Serve configuration) rather than Uvicorn worker counts.
- Updated environment variable documentation and example env files to reflect single Uvicorn worker behavior and removed guidance for API_NUM_WORKERS.
- Added chart and .env.example comments directing users to scale using ENABLE_RAY_SERVE=true and RAY_SERVE_NUM_REPLICAS.
Chores
- Enforced running exactly one Uvicorn worker when not using Ray Serve.
- If API_NUM_WORKERS is set to a value other than "1", the app now warns that it’s ignored.

The uvicorn deployment path fed API_NUM_WORKERS into `uvicorn --workers N`, but the app calls ray.init() at import time, so each extra worker starts its own isolated Ray cluster with duplicate named actors (Indexer, Vectordb, TaskStateManager), fragmenting task state and vector-DB access. The flag was silently ignored until v1.1.12 because the entrypoint always passed --reload (which forces a single uvicorn worker); gating --reload behind UVICORN_RELOAD=true (PR #478, N8) unmasked it.

- entrypoint.sh: always run a single uvicorn worker; warn if API_NUM_WORKERS is set to a non-1 value, pointing operators to Ray Serve.
- charts: drop the dead API_NUM_WORKERS: "8" (the chart runs Ray Serve, which takes the api.py branch and never reads it).
- .env.example / docs: remove the knob and document Ray Serve (ENABLE_RAY_SERVE + RAY_SERVE_NUM_REPLICAS) as the HTTP scaling path.

Closes #500

coderabbitai · 2026-06-17T11:48:48Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: b2d193aa-924b-479a-879b-b6e427792979

📥 Commits

Reviewing files that changed from the base of the PR and between 2ca0c6f and 319bc5c.

📒 Files selected for processing (2)

docs/assets/env_example.env
docs/assets/env_linux_gpu.env

✅ Files skipped from review due to trivial changes (2)

docs/assets/env_example.env
docs/assets/env_linux_gpu.env

📝 Walkthrough

Walkthrough

Removes API_NUM_WORKERS multi-worker support from the uvicorn startup path. entrypoint.sh now unconditionally passes --workers 1 to uvicorn and emits a stderr warning when API_NUM_WORKERS is set to a non-1 value. Related comments are added to .env.example and charts/openrag-stack/values.yaml, and env_vars.md is updated to document Ray Serve as the correct HTTP scaling path and remove the API_NUM_WORKERS entry.

Changes

Single Worker Enforcement and Documentation

| Layer / File(s) | Summary |
|---|---|
| Enforce single uvicorn worker and warn on API_NUM_WORKERS: `entrypoint.sh` | The non-Ray-Serve startup path always runs uvicorn with `--workers 1`. If `API_NUM_WORKERS` is set to any value other than `"1"`, a warning is printed to stderr that it is ignored. The previous conditional `--workers ${API_NUM_WORKERS}` mapping is removed. |
| Update env examples, Helm values, and env-vars docs: `.env.example`, `charts/openrag-stack/values.yaml`, `docs/content/docs/documentation/env_vars.md`, `docs/assets/env_example.env`, `docs/assets/env_linux_gpu.env` | Adds inline comments to environment example files and `values.yaml` clarifying the single-worker design and pointing to `ENABLE_RAY_SERVE`/`RAY_SERVE_NUM_REPLICAS`. Expands the Ray Serve section in `env_vars.md` with actor-initialization rationale, a sample config snippet, and a link to distributed Ray cluster docs. Removes the `API_NUM_WORKERS` row from the FastAPI table. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

🐇 One worker, no more, no less—
The Ray actors share a single nest.
API_NUM_WORKERS? A warning now rings,
"Use Ray Serve replicas for scaling things!"
Hoppity-fix, the footgun is gone! 🎉

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The PR title accurately summarizes the main change: removing API_NUM_WORKERS and establishing Ray Serve as the scaling mechanism. |
| Linked Issues check | ✅ Passed | All changes directly address issue `#500`'s objectives: single-worker enforcement in entrypoint.sh, warning on API_NUM_WORKERS mismatch, removal from Helm chart and documentation, and Ray Serve scaling guidance. |
| Out of Scope Changes check | ✅ Passed | All changes are in-scope: entrypoint.sh, environment files, Helm values, and documentation updates align with the stated PR objectives and issue requirements. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@entrypoint.sh`:
- Line 31: The uvicorn port binding uses an incorrect environment variable name
APP_iPORT (with a lowercase 'i') instead of APP_PORT (with a capital 'P').
Change the environment variable reference in the uv run command from APP_iPORT
to APP_PORT so that the correct application port environment variable is
recognized and used instead of always falling back to the default port 8080.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 1f02c847-0de0-4cc6-9acb-510604375ffd

📥 Commits

Reviewing files that changed from the base of the PR and between 7c2a35b and 0e5687b.

📒 Files selected for processing (4)

.env.example
charts/openrag-stack/values.yaml
docs/content/docs/documentation/env_vars.md
entrypoint.sh

.env.example removed the API_NUM_WORKERS knob, but its two hand-maintained mirrors under docs/assets/ (env_example.env, env_linux_gpu.env) still advertised it with the old, now-incorrect description. These files are embedded in the quickstart docs, so users following them would still copy the retired knob. Apply the same comment as .env.example pointing to the Ray Serve scaling path.

coderabbitai Bot reviewed Jun 17, 2026

Ahmath-Gadji force-pushed the fix/api-num-workers-footgun branch from 2ca0c6f to 319bc5c June 17, 2026 15:42

Ahmath-Gadji merged commit 1b189a2 into main Jun 18, 2026
4 checks passed

Ahmath-Gadji deleted the fix/api-num-workers-footgun branch June 18, 2026 07:41

Ahmath-Gadji added the fix Fix issue label Jun 18, 2026

Ahmath-Gadji mentioned this pull request Jun 18, 2026

chore(release): bump version to 1.1.13 #506

Closed

Ahmath-Gadji merged 2 commits into main from fix/api-num-workers-footgun
