Skip to content

test(ci): worktree-multi-concurrent-e2e — add non-worktree base stack (3 → 4 concurrent stacks)#843

Merged
bradymiller merged 4 commits into
openemr:masterfrom
bradymiller:test/multi-concurrent-add-nonworktree
Jun 26, 2026
Merged

test(ci): worktree-multi-concurrent-e2e — add non-worktree base stack (3 → 4 concurrent stacks)#843
bradymiller merged 4 commits into
openemr:masterfrom
bradymiller:test/multi-concurrent-add-nonworktree

Conversation

@bradymiller

@bradymiller bradymiller commented Jun 26, 2026

Copy link
Copy Markdown
Member

Summary

Extends worktree-multi-concurrent-e2e from 3 stacks to 4 by adding the plain non-worktree base-repo stack (openemr-cmd up from openemr/docker/development-easy/) alongside the three worktree stacks.

Stack Mode Port Compose project
multi-easy worktree 8301 openemr-multi-easy
multi-light worktree 8302 openemr-multi-light
multi-redis worktree 8303 openemr-multi-redis
(none) non-worktree 8300 development-easy

No port or project-name conflict by design.

What this newly verifies

  • Coexistence: worktree + non-worktree stacks run side-by-side without daemon contention deadlock, port collision, or compose-project name conflict
  • HOST_UID adoption in both modes simultaneously: new "Verify bind mount is host-owned in BOTH modes" step stats sites/default/sqlconf.php across all 4 bind mounts and asserts ownership == runner uid. If adoption regresses only under concurrency (e.g., HOST_UID export race), this job catches it
  • Hygiene parity: extended teardown verification to confirm the development-easy project's containers + volumes are also gone (label-based filter for non-worktree volumes since they use default names without the openemr-multi-*_ prefix)

Lifecycle delta vs prior 3-stack version

  • Added: openemr-cmd up step from docker/development-easy/ after the 3 worktree adds
  • Expanded: healthy-poll waits for all 4 containers; HTTP smoke covers 4 ports; cross-talk check requires 4 distinct container IDs
  • Added: bind-mount-ownership assertion across all 4 stacks (the key new permission coverage)
  • Added: openemr-cmd down (with volumes) for the non-worktree stack first (frees daemon bandwidth for the subsequent worktree removes)
  • Extended: hygiene block to cover the development-easy compose project too

Cost

Timeout bumped 40 → 50 min for the fourth parallel boot. Public-repo runner so no $ cost; the only real cost is reviewer attention if a regression specifically surfaces under 4-way concurrency (which is the signal we want).

The standalone nonworktree-lifecycle-e2e job still exists and covers the non-worktree path in isolation — this PR adds the concurrency dimension, not a replacement.

Test plan

  • All 4 containers reach healthy under concurrent load
  • All 4 HTTP smoke checks pass
  • Bind-mount-ownership assertion passes for all 4 stacks
  • Teardown leaves no orphans (containers, volumes, state, dirs, git registrations) across all 4 projects

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Expanded concurrent end-to-end coverage to run four OpenEMR stacks in parallel, adding a default-port instance alongside the existing worktree-based instances.
    • Added comprehensive smoke/health checks across all four exposed ports.
  • Bug Fixes

    • Improved readiness gating to wait for all containers to be healthy before proceeding, with more reliable handling of transient health states.
    • Strengthened isolation, filesystem/ownership verification, teardown ordering, and failure diagnostics/cleanup to include the new default-port stack.

… (3 → 4 concurrent stacks)

Extends the multi-concurrent job to run a fourth stack alongside the
three worktree ones: the plain non-worktree base-repo stack from
openemr/docker/development-easy/ via `openemr-cmd up`. Uses the
default port 8300 (no offset) — no conflict with worktree offsets
8301/02/03.

What it newly verifies:
- Worktree (3 stacks) + non-worktree (1 stack) coexist under
  concurrent load. No port collision, no compose-project name
  collision (`development-easy` vs `openemr-multi-*`), no daemon
  serialization deadlock.
- HOST_UID adoption holds for BOTH modes simultaneously — the new
  "Verify bind mount is host-owned in BOTH modes" step stats
  sites/default/sqlconf.php across all four bind mounts and asserts
  ownership == runner uid. If adoption regresses in either mode,
  this catches it specifically; if it regresses ONLY under
  concurrency (e.g., race with HOST_UID export), this job is the
  only place we'd see it.
- The non-worktree-only e2e job (nonworktree-lifecycle-e2e) covers
  the same ownership assertion in isolation; this job adds the
  concurrency dimension.

Lifecycle:
- Add 4th stack via `openemr-cmd up` from docker/development-easy/
  after the three worktree adds
- Expand healthy-poll to wait for all 4 containers
- Expand HTTP smoke to include port 8300
- Expand cross-talk check to confirm 4 distinct container IDs
- Tear down non-worktree first (via `openemr-cmd down -v`) to free
  daemon bandwidth for the worktree removes
- Extend hygiene check to also verify `development-easy` project
  containers + volumes are gone (label-based filter for the volumes
  since the non-worktree project uses default volume names without
  the `openemr-multi-*_` prefix)

Job name: "three concurrent worktrees" → "four concurrent stacks
(3 worktrees + 1 non-worktree base)".

Timeout bumped 40 → 50 min for the fourth parallel boot.

Assisted-by: Claude Code
@coderabbitai

coderabbitai Bot commented Jun 26, 2026

Copy link
Copy Markdown

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 6148dcd3-f48b-4211-a2d0-49d04e0af84d

📥 Commits

Reviewing files that changed from the base of the PR and between 50c6eac and 2cb2ee0.

📒 Files selected for processing (1)
  • .github/workflows/test-bats-openemr-cmd-real-docker.yml
🚧 Files skipped from review as they are similar to previous changes (1)
  • .github/workflows/test-bats-openemr-cmd-real-docker.yml

📝 Walkthrough

Walkthrough

The workflow job now runs four concurrent OpenEMR stacks: three worktree stacks and one development-easy stack. It waits for health, checks HTTP access and container identity across all four, verifies bind-mount ownership, and updates teardown, cleanup, and diagnostics.

Changes

Concurrent OpenEMR stack validation

Layer / File(s) Summary
Job scope and startup
.github/workflows/test-bats-openemr-cmd-real-docker.yml
Updates the job description and timeout, then starts three worktree stacks and one development-easy stack.
Health and smoke checks
.github/workflows/test-bats-openemr-cmd-real-docker.yml
Polls all four containers for health, runs HTTP smoke checks on ports 8300-8303, and expects four distinct OpenEMR container IDs.
Teardown and cleanup
.github/workflows/test-bats-openemr-cmd-real-docker.yml
Verifies sites/default/sqlconf.php ownership for all four stacks, tears down development-easy before the worktrees, checks container and volume removal, and includes development-easy in failure logs.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Poem

🐇 Four stacks hopped into the Docker night,
Ports 8300 to 8303 all shining bright.
Soft ears perked at healthy pings in a row,
No cross-talk, no tangles, just clean-running flow.
Hop, hop, hooray — the containers behave just right!

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly describes the main change: adding a non-worktree base stack and increasing concurrency from 3 to 4 stacks.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands.

…poser-install load

The previous push had easy stack briefly flap to unhealthy at 260s
(~80s past start_period) while still in the composer-install phase.
The 3-stack version of this job had the same "bail on first unhealthy"
logic and worked because 3 stacks always finished install inside
start_period (3m); 4 stacks share CPU/disk more heavily and the
install drags past start_period, allowing one transient healthcheck
failure to trip the bail.

Fix: require SUSTAINED unhealthy (≥6 consecutive polls = ~60s)
before bailing. Single flaps recover on the next poll without
killing the whole job. The 15-min poll timeout is still the final
guard against genuinely-broken containers.

Assisted-by: Claude Code
…eout to 30m + outer 60m

Previous fix (sustained-unhealthy bail at 60s) wasn't enough — all
four containers cascade into unhealthy state simultaneously during
composer install under 4-way contention, all going past start_period
together. The 60s threshold caught the cascade as soon as the streak
hit it.

Reality check from CI: each container's composer install takes
~15-20 min when 4 share CPU/disk/image-pull bandwidth (vs ~5 min
single-stack baseline). During that long install the
`/meta/health/readyz` probes fail because apache isn't fully serving
yet. So "unhealthy for >60s" is the EXPECTED state, not a regression
signal — bailing on it killed perfectly healthy progress.

Final fix:
- Remove the unhealthy-bail entirely. Just poll until all-four-healthy
  or the inner loop times out at 30 min (180 polls × 10s).
- Outer job timeout bumped 50 → 60 min for the longer healthy-poll
  budget + downstream steps (smoke, exec, ownership, teardown,
  hygiene ~ 15 min total).
- A genuinely-broken container is still caught: the 30-min inner
  timeout fires with a clear message + all four container logs.

The 3-stack version still uses the original bail-on-first-unhealthy
because 3 stacks DO finish install inside start_period (verified by
its passing CI for months); only 4-stack contention breaks that
assumption.

Assisted-by: Claude Code

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
.github/workflows/test-bats-openemr-cmd-real-docker.yml (1)

704-706: 🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win

Validate worktree offsets before adding the nonwt:0 smoke target.

If any jq -r offset is null or 0, Bash arithmetic maps that worktree smoke probe to port 8300, so it can hit the non-worktree stack and hide an offset regression.

Proposed fix
           oe=$(jq -r '."multi-easy".offset'  .worktrees.json)
           ol=$(jq -r '."multi-light".offset' .worktrees.json)
           or=$(jq -r '."multi-redis".offset' .worktrees.json)
+          for branch_offset in "multi-easy:${oe}" "multi-light:${ol}" "multi-redis:${or}"; do
+            branch="${branch_offset%%:*}"
+            offset="${branch_offset#*:}"
+            [[ "${offset}" =~ ^[1-9][0-9]*$ ]] || { echo "FAIL: ${branch} offset is not a positive integer: '${offset}'"; exit 1; }
+          done
+          offset_set=$(printf '%s\n%s\n%s\n' "${oe}" "${ol}" "${or}" | sort -n | paste -sd, -)
+          [[ "${offset_set}" = "1,2,3" ]] || { echo "FAIL: expected worktree offsets 1,2,3; got ${offset_set}"; exit 1; }
           for entry in "easy:${oe}" "light:${ol}" "redis:${or}" "nonwt:0"; do
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/test-bats-openemr-cmd-real-docker.yml around lines 704 -
706, The smoke-target loop is using raw offsets in the port calculation, so a
null or zero worktree offset can collapse to the same port as the non-worktree
stack. Update the workflow logic around the worktree offset extraction and the
entry loop so offsets are validated before use, and ensure the `nonwt` probe is
kept separate from any worktree-based ports. Use the `jq -r` offset handling and
the `for entry in ...` block to guard against null/0 values before computing
`port`.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/test-bats-openemr-cmd-real-docker.yml:
- Around line 674-684: The health-check loop in the workflow currently keeps
retrying even when docker inspect returns missing for any of the containers,
which should be treated as an immediate failure. Update the loop around the
container status checks for cname_e, cname_l, cname_r, and cname_n so that
missing container names fail fast with a clear error, while still continuing to
retry only when the status is unhealthy. Keep the existing success path in the
same health-wait section and preserve the current container status reporting.

---

Outside diff comments:
In @.github/workflows/test-bats-openemr-cmd-real-docker.yml:
- Around line 704-706: The smoke-target loop is using raw offsets in the port
calculation, so a null or zero worktree offset can collapse to the same port as
the non-worktree stack. Update the workflow logic around the worktree offset
extraction and the entry loop so offsets are validated before use, and ensure
the `nonwt` probe is kept separate from any worktree-based ports. Use the `jq
-r` offset handling and the `for entry in ...` block to guard against null/0
values before computing `port`.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: d7e8dc53-dc58-4a5d-bda0-7a44506f0578

📥 Commits

Reviewing files that changed from the base of the PR and between beaa6ef and 50c6eac.

📒 Files selected for processing (1)
  • .github/workflows/test-bats-openemr-cmd-real-docker.yml

Comment thread .github/workflows/test-bats-openemr-cmd-real-docker.yml
…erally absent)

CodeRabbit catch: the no-bail rule for `unhealthy` was right (slow
composer install under 4-way contention transiently flaps probes),
but it ALSO swallowed `missing` — the container literally not
existing — which is a hard failure not worth waiting 30 min on.

Cause analysis matrix:
- `missing` = compose failed to create the container, or the name/
  project naming contract changed. No amount of waiting fixes it.
  Bail immediately with `docker ps -a` dump.
- `starting` / `unhealthy` = slow probe results during install.
  Keep polling — this is the expected transient state under load.
- `healthy` (all 4) = ready, exit 0.

Compose up -d returns after creating containers, so by iteration 1
of this poll all four should exist; missing at iter 1+ means
something broke at the orchestration layer, not the application
layer.

Assisted-by: Claude Code
@bradymiller bradymiller merged commit 8168856 into openemr:master Jun 26, 2026
9 checks passed
@bradymiller bradymiller deleted the test/multi-concurrent-add-nonworktree branch June 26, 2026 08:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant