ci(openemr-cmd): add macos-smoke-e2e on macos-14 via colima (daily-cron only) #844

Closed
bradymiller wants to merge 7 commits into openemr:master from bradymiller:ci/macos-smoke-daily

bradymiller (Member) commented Jun 26, 2026

⚠️ Pre-merge cleanup required

The if: guard temporarily includes pull_request so this PR's CI can validate the job actually works end-to-end on a macos-14 runner (the alternative would be merging schedule-only blind). Before merge, drop the pull_request clause so the guard reverts to:

if: ${{ github.event_name == 'schedule' || github.event_name == 'workflow_dispatch' }}

Summary

New e2e job exercising one nonworktree-style stack (`openemr-cmd up` from `docker/development-easy/`) on a `macos-14` runner via colima. Pins two contracts on macOS Docker:

  1. openemr-cmd happy path works on macOS (not just Linux Docker Engine)
  2. HOST_UID adoption holds through colima's Lima VM boundary — `sites/default/sqlconf.php` should end up owned by the runner uid on the host (same probe as Linux `nonworktree-lifecycle-e2e`)
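
The ownership contract in item 2 can be sketched as a small shell probe. This is a minimal sketch, not the job's actual step: the helper names `file_uid` and `assert_owned_by_current_user` are hypothetical, and it assumes either GNU `stat -c` (Linux) or BSD `stat -f` (macOS) is available.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the numeric owner uid of a file.
# GNU stat (Linux) takes -c '%u'; BSD stat (macOS) takes -f '%u'.
file_uid() {
  stat -c '%u' "$1" 2>/dev/null || stat -f '%u' "$1"
}

# Hypothetical helper: succeed only if the file is owned by the invoking user.
assert_owned_by_current_user() {
  local path="$1" owner expected
  owner="$(file_uid "$path")" || return 1
  expected="$(id -u)"
  if [ "$owner" = "$expected" ]; then
    echo "OK: bind-mount file owned by uid=${owner}"
  else
    echo "FAIL: ${path} owned by uid=${owner}, expected uid=${expected}" >&2
    return 1
  fi
}
```

In the job this would presumably be pointed at `sites/default/sqlconf.php` inside the checked-out tree once the stack is healthy.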

Triggers

`schedule` + `workflow_dispatch` only. Skipped on `push`/`pull_request` via:
```yaml
if: ${{ github.event_name == 'schedule' || github.event_name == 'workflow_dispatch' }}
```

Rationale: keeps PR cycle fast. macOS runners are limited in the GH-hosted pool and a stack startup on Apple Silicon + qemu amd64 emulation takes 15-25 min vs ~5 min on Linux. Catching macOS-specific divergence within 24h via the daily cron is the right tradeoff.

Scope (intentionally narrow vs the Linux jobs)

| Job | Linux | macOS smoke |
|---|---|---|
| worktree lifecycle | ✓ | — |
| multi-concurrent (4 stacks) | ✓ | — |
| nonworktree lifecycle | ✓ | ✓ (this job) |
| functional round-trip | ✓ | — |
| prek install + real commit | ✓ | — |

Just one stack, one health check, one HTTP smoke, one ownership assertion, one teardown. Heavier macOS-specific coverage can be filed later if this proves stable.

Docker setup

```yaml
- run: brew install colima docker docker-compose
- run: colima start --cpu 4 --memory 8 --disk 30
```

macOS GH-hosted runners have brew preinstalled but no Docker. Docker Desktop is kept off hosted runners for licensing reasons; colima is unrestricted.

Validation path

This PR can't validate the job end-to-end (it's gated to schedule/dispatch). After merge:

  1. GitHub Actions UI → openemr-cmd e2e workflow → "Run workflow" → branch=master to verify the job works
  2. Tomorrow's daily cron at 06:17 UTC will fire it automatically

If something breaks at runtime, the diagnostics step dumps docker ps, volume ls, colima status, and the last 500 lines of both openemr + mysql container logs.
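
A failure-diagnostics step along those lines might look like the sketch below. It is an illustration, not the workflow's actual step: the function names are hypothetical, the container names are passed in by the caller rather than taken from the real job, and each tool is skipped when it is not installed so the dump never masks the original failure.

```shell
#!/usr/bin/env bash
# Run one diagnostic command under a banner; never fail the caller.
run_diag() {
  echo "=== $* ==="
  if command -v "$1" >/dev/null 2>&1; then
    "$@" 2>&1 || echo "(exit status $? from: $*)"
  else
    echo "(skipped: $1 not installed)"
  fi
}

# Dump the state listed above. Arguments are container names
# (hypothetical; the real step may hard-code them).
dump_diagnostics() {
  run_diag docker ps
  run_diag docker volume ls
  run_diag colima status
  local c
  for c in "$@"; do
    run_diag docker logs --tail 500 "$c"
  done
  return 0
}
```

A step guarded with `if: failure()` could then call `dump_diagnostics` with the openemr and mysql container names.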

Test plan

  • YAML validates (verified locally via Python yaml.safe_load)
  • Job's `if:` guard correctly skips on this PR (visible in the Actions tab — the job should NOT run on this PR)
  • Other 5 e2e jobs unaffected — additive change
  • After merge: manual `workflow_dispatch` run reaches "OK: bind-mount file owned by uid=…" on macos-14

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Tests
    • Added a new macOS smoke-test (tier-2) GitHub Actions job that runs a minimal openemr-cmd up flow on macOS 14 using a colima-provided Docker VM.
    • Provisions the VM with increased CPU/RAM/disk, waits for container health (longer timeout), and performs an HTTP smoke check on localhost:8300.
    • Verifies macOS bind-mount ownership via host UID checks, improves failure diagnostics, and ensures reliable teardown with openemr-cmd down.
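
The health wait and HTTP smoke check summarized above can be sketched in shell as follows. This is a hedged sketch rather than the job's real code: `wait_until`, `container_healthy`, and `http_ok` are hypothetical names, and the timeouts in the wiring comments are illustrative (only the 30-minute health poll, the container name, and port 8300 come from this PR).

```shell
#!/usr/bin/env bash
# Retry a command until it succeeds or the timeout (seconds) elapses.
# Usage: wait_until <timeout_s> <interval_s> <command> [args...]
wait_until() {
  local timeout="$1" interval="$2"
  shift 2
  local deadline=$(( $(date +%s) + timeout ))
  until "$@"; do
    if [ "$(date +%s)" -ge "$deadline" ]; then
      echo "timed out after ${timeout}s waiting for: $*" >&2
      return 1
    fi
    sleep "$interval"
  done
}

# Health probe: true once Docker reports the container as healthy.
container_healthy() {
  [ "$(docker inspect --format '{{.State.Health.Status}}' "$1" 2>/dev/null)" = "healthy" ]
}

# HTTP probe: true on any response below HTTP 400 (curl -f fails otherwise).
http_ok() {
  curl -fsS -o /dev/null --max-time 10 "$1"
}

# Example wiring (not executed here):
#   wait_until 1800 10 container_healthy development-easy-openemr-1
#   wait_until 120 5 http_ok http://localhost:8300/
```

Keeping the retry loop separate from the probes means the same loop serves both the health poll and the HTTP check.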

…on only)

New job exercises one nonworktree-style stack (`openemr-cmd up` from
docker/development-easy/) on a macos-14 runner via colima. Pins:
  - openemr-cmd happy path works on macOS Docker (not just Linux)
  - HOST_UID adoption holds through colima's Lima VM boundary —
    sites/default/sqlconf.php should end up owned by the runner uid
    on the host (same probe as Linux nonworktree-lifecycle-e2e)

Triggers: schedule + workflow_dispatch only. Skipped on push/pull_request
via `if: github.event_name == 'schedule' || github.event_name ==
'workflow_dispatch'`. Keeps PR cycle fast — macOS runners are limited
in the GH-hosted pool and a stack startup on Apple Silicon + qemu
amd64 emulation takes 15-25 min vs ~5 min on Linux.

Scope intentionally narrow vs the Linux jobs: no worktree lifecycle,
no multi-concurrent, no prek, no functional round-trip. Just one
stack, one health check, one HTTP smoke, one ownership assertion,
one teardown. Heavier macOS-specific coverage can be filed if this
proves stable.

Docker setup: `brew install colima docker docker-compose` then
`colima start --cpu 4 --memory 8 --disk 30`. macOS GH-hosted runners
have brew preinstalled but no Docker (Docker Desktop is kept off
hosted runners for licensing reasons; colima is unrestricted).

Manual trigger after this lands: GitHub Actions UI → openemr-cmd
e2e workflow → "Run workflow" → select branch=master. Daily run
fires automatically at 06:17 UTC per the existing schedule trigger.

Assisted-by: Claude Code

coderabbitai Bot commented Jun 26, 2026

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

📝 Walkthrough

The workflow adds a macOS-14 colima-backed smoke job for openemr-cmd, with stack startup, health polling, HTTP verification, bind-mount ownership checks, teardown, and failure diagnostics.

Changes

macOS smoke workflow

| Layer / File(s) | Summary |
|---|---|
| macOS smoke job (`.github/workflows/test-bats-openemr-cmd-real-docker.yml`) | The workflow note names macos-smoke-e2e, and the job runs on macos-14, gated to schedule/manual triggers plus (for now) pull_request runs. |
| macOS runtime setup (`.github/workflows/test-bats-openemr-cmd-real-docker.yml`) | The job installs openemr-cmd, checks out openemr, reports runner resources, installs Docker, colima, and QEMU, and starts colima with QEMU and sized CPU, memory, and disk settings. |
| Smoke checks (`.github/workflows/test-bats-openemr-cmd-real-docker.yml`) | The job brings up openemr/docker/development-easy, waits for development-easy-openemr-1 to become healthy, performs repeated HTTP checks on port 8300, verifies sqlconf.php ownership against the runner UID, and emits container and colima diagnostics on failure. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • openemr/openemr-devops#843: Updates the same workflow with a similar openemr-cmd up → health wait → HTTP smoke → ownership verification flow.
  • openemr/openemr-devops#831: Adds another non-worktree openemr-cmd lifecycle job in the same workflow, with overlapping smoke and teardown steps.
  • openemr/openemr-devops#818: Extends the same real-docker workflow with a macOS smoke job for openemr-cmd E2E coverage.

Poem

🐇 I hopped on macOS under starlit skies,
openemr-cmd woke up with sleepy eyes.
I sniffed port 8300, warm and bright,
Then checked the files were owned just right.
A tidy burrow, clean and new—
Hooray for smoke tests, and carrots too! 🥕

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly describes the main change: adding a macOS smoke E2E job for openemr-cmd via colima. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

…efore merge

Adds pull_request to the if-guard so this PR's CI actually runs the
new job and we can see end-to-end whether colima + the openemr-cmd
happy path + HOST_UID adoption all work on a macos-14 runner. Without
this, the schedule-only guard would prevent the job from running on
the PR and we'd be merging blind.

Revert this commit (or just drop the pull_request clause from the
if-guard) in the final pre-merge commit on this branch. The PR
description should mention this temporary state.

Assisted-by: Claude Code

coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/test-bats-openemr-cmd-real-docker.yml:
- Around line 1717-1721: Remove the temporary pull_request trigger from the job
condition so the macOS smoke job remains schedule/manual-only before merge.
Update the workflow guard in test-bats-openemr-cmd-real-docker so the if
expression only allows schedule and workflow_dispatch, and make sure the inline
TEMPORARY comment matches that final behavior.

📥 Commits

Reviewing files that changed from the base of the PR and between 8168856 and ba31e24.

📒 Files selected for processing (1)
  • .github/workflows/test-bats-openemr-cmd-real-docker.yml

Comment on lines +1717 to +1721
# TEMPORARY: include pull_request so this PR can validate the job
# actually works on a macOS runner before merging. Revert to
# schedule-only (drop the pull_request clause) in the final
# pre-merge commit on this branch.
if: ${{ github.event_name == 'schedule' || github.event_name == 'workflow_dispatch' || github.event_name == 'pull_request' }}

🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Drop the temporary pull_request clause before merge.

Line 1721 still enables this job on pull_request, which contradicts both the PR objective and the inline contract that macOS smoke should stay schedule/manual-only so PR cycles remain fast.

Suggested fix
-    if: ${{ github.event_name == 'schedule' || github.event_name == 'workflow_dispatch' || github.event_name == 'pull_request' }}
+    if: ${{ github.event_name == 'schedule' || github.event_name == 'workflow_dispatch' }}

…failure

First run failed at colima start (30s in) with a vague "error starting vm".
The VZ driver logged "Expanding to 20GiB" then "Converting datadisk to a raw
disk", so it likely ran out of host disk space. macos-14 runners ship with
~14GB free; the default colima disk is 60GB and my initial --disk 30 was
also too large.

Conservative resources:
- --cpu 4 (unchanged)
- --memory 6 (was 8) — leaves headroom on the runner
- --disk 12 (was 30) — fits the ~14GB free macos-14 budget

Also added a pre-colima diagnostic step showing free disk + memory,
and a fallback that dumps colima's hidden VM logs (ha.stderr.log,
serial.log) if start fails. The original error was opaque because
the hint pointed at a log file we never read; this fixes that.
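
A pre-start resource report and a log dump on failure could be sketched like this. It is a hedged illustration, not the step from this commit: the function names are hypothetical, and the log directory `~/.colima/_lima/colima` is an assumption about where colima keeps its Lima instance files; only the log file names and the `colima start` flags come from the text above.

```shell
#!/usr/bin/env bash
# Print free disk and memory before starting the VM.
report_host_resources() {
  echo "--- disk ---"
  df -h /
  echo "--- memory ---"
  if command -v vm_stat >/dev/null 2>&1; then
    vm_stat            # macOS
  elif command -v free >/dev/null 2>&1; then
    free -h            # Linux fallback, for running the sketch elsewhere
  fi
}

# Dump the tail of the VM logs, if present.
# ASSUMPTION: colima keeps them under ~/.colima/_lima/colima/.
dump_vm_logs() {
  local dir="${1:-$HOME/.colima/_lima/colima}" f
  for f in ha.stderr.log serial.log; do
    if [ -f "$dir/$f" ]; then
      echo "=== $dir/$f ==="
      tail -n 200 "$dir/$f"
    else
      echo "(no $f in $dir)"
    fi
  done
}

# Example wiring (not executed here); flags are the ones quoted above:
#   report_host_resources
#   colima start --cpu 4 --memory 6 --disk 12 || { dump_vm_logs; exit 1; }
```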

If 12GB disk turns out too tight for image pulls + composer + npm,
next iteration moves to crazy-max/ghaction-setup-docker action
which handles macOS quirks more robustly.

Assisted-by: Claude Code

…ation on GH macos)

Real error from the previous diagnostic dump:
  VZErrorDomain Code=2 "Virtualization is not available on this hardware"

GitHub-hosted macOS runners don't support nested virtualization, so
Apple's Virtualization framework (default `--vm-type vz` on macOS
arm64) can't start a Lima VM. Switch to `--vm-type qemu` which uses
software emulation and works without nested virt.

Tradeoff: QEMU is slower than VZ. Combined with already-needed amd64
container emulation (openemr images are amd64; runner is Apple
Silicon arm64), expect stack startup to push 20-30 min. The 30-min
inner healthy-poll + 60-min outer job timeout absorb that.

Disk was a red herring — the diagnostic showed 39GB free, well
above my 12GB cap. Keeping the cap since it's also a good safety
net on smaller runner variants.

Assisted-by: Claude Code

Previous run got past the VZ-not-available error but hit the next
one: `qemu-img not found, run 'brew install qemu' to install`. The
colima brew package doesn't pull qemu in as a dependency. Add it
explicitly.

Assisted-by: Claude Code

bradymiller marked this pull request as draft June 26, 2026 09:07

Direct `colima` invocation on GH-hosted macos-14 hit two distinct
brick walls:

1. Default `vz` driver (Apple Virtualization framework) failed with
   "Virtualization is not available on this hardware" — GH runners
   don't support nested virtualization.

2. `--vm-type qemu` panicked in lima's host agent with "send on
   closed channel" (lima v2.1.2 goroutine race at
   qemu_driver.go:412 between SSH probe failure and goroutine
   cleanup). Reproducible across multiple runs.

crazy-max/ghaction-setup-docker is a purpose-built action for
setting up Docker on macOS (and Windows) GH runners. It uses Lima
under the hood with battle-tested config that avoids both of the
above. Used by many projects including buildx itself for macOS CI.

Removes the manual brew install of colima/docker/qemu — the action
handles everything.

Assisted-by: Claude Code

… can't run Linux VMs)

Iteration log on macos-14 (Apple Silicon, free GH-hosted runner):
1. colima default vz: VZErrorDomain "Virtualization is not available
   on this hardware" — no nested virt
2. colima --vm-type qemu (no qemu installed): "qemu-img not found"
3. colima --vm-type qemu (with qemu): Lima v2.1.2 goroutine race
   panic "send on closed channel" in qemu_driver.go:412
4. crazy-max/ghaction-setup-docker action: same QEMU "signal: abort
   trap" — Lima underneath, same fundamental issue

The free macos-14 pool fundamentally can't run Linux VMs reliably.
macos-13 (Intel) uses HVF directly on Intel hardware without nested-
virt requirements, and openemr images are amd64 so no qemu-user
translation needed either.

macos-13 is being phased out by GitHub, but should work for now.
When it's deprecated, options are: pay for macos-14-large (has
nested virt), self-hosted macOS runner, or drop the test.

Assisted-by: Claude Code

bradymiller (Member, Author) commented

Pivoting to arm64 Linux runners (ubuntu-24.04-arm) instead — see follow-up PR. The macOS approach hit a fundamental wall (free GH macos-14 can't run Linux VMs reliably; macos-13 Intel pool is being phased out + queues for 10+ min). arm64 Linux gives us the actual signal (arm-architecture image verification) without the macOS-VM brittleness, since Docker on Linux uses native containerization regardless of CPU arch.

bradymiller deleted the ci/macos-smoke-daily branch June 26, 2026 09:26

bradymiller added a commit that referenced this pull request Jun 26, 2026

…+ workflow_dispatch) (#845)

## Summary

Replaces the abandoned macOS smoke attempt (#844, closed) with an arm64
Linux runner. Adds one nonworktree-style stack on `ubuntu-24.04-arm` to
validate the openemr image's arm64 manifest works end-to-end.

## Why arm64 Linux instead of macOS

Free GH-hosted `macos-14` (Apple Silicon arm64) **fundamentally can't
run Linux VMs reliably** on the free pool — VZ driver needs nested
virtualization (not available), QEMU panics in Lima's host agent with
abort traps. After 4 iterations trying colima/Lima, abandoned the macOS
path.

`ubuntu-24.04-arm` gives us the same signal we actually wanted:
**arm64-specific image regressions**. The openemr image is multi-arch
(amd64 + arm64). Docker auto-selects the arm64 manifest on arm64 hosts.
Apple Silicon Mac developers pull the arm64 variant via Docker Desktop's
Linux VM, so catching arm64-only image bugs here within 24h saves them
from finding them first.

Linux arm64 runners use native cgroups/namespaces on an arm64 host
kernel — same fast/supported path as `ubuntu-22.04` for amd64, just on
arm64. **Zero virtualization, no colima brittleness**.
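
The manifest selection described above can be checked with a small shell sketch. This is illustrative only: `docker_arch` is a hypothetical helper that maps a `uname -m` machine name to the architecture name Docker uses in platform strings, and the commented wiring assumes the image has already been pulled.

```shell
#!/usr/bin/env bash
# Map a `uname -m` machine name to the architecture name used in
# Docker platform strings (e.g. linux/arm64). Defaults to the current host.
docker_arch() {
  case "${1:-$(uname -m)}" in
    x86_64|amd64)   echo amd64 ;;
    aarch64|arm64)  echo arm64 ;;
    *)              echo "unknown" ; return 1 ;;
  esac
}

# Example (not executed here): confirm the pulled image matches the host.
#   want="$(docker_arch)"
#   got="$(docker image inspect --format '{{.Architecture}}' openemr/openemr:flex)"
#   [ "$got" = "$want" ] || echo "arch mismatch: image=$got host=$want" >&2
```

On `ubuntu-24.04-arm` the helper should print `arm64`, so a mismatch would suggest the wrong manifest was pulled.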

## Scope (intentionally narrow vs the amd64 jobs)

| Job | amd64 (existing) | arm64 (new) |
|---|---|---|
| worktree lifecycle | ✓ | — |
| multi-concurrent (4 stacks) | ✓ | — |
| nonworktree lifecycle | ✓ | **✓ (this job)** |
| functional round-trip | ✓ | — |
| prek install + real commit | ✓ | — |

One stack, one health check, one HTTP smoke, one bind-mount ownership
assertion, one teardown. Heavier arm64 coverage can be filed if this
smoke ever catches a real regression.

## What this catches that amd64-only doesn't

- arm64-specific image regressions in `openemr/openemr:flex` (missing
arm64 deps, arm64-only segfaults, etc.)
- HOST_UID entrypoint adoption working in the arm64 image variant
(verified by the bind-mount ownership assertion — same probe as the
amd64 nonworktree job, but exercises the arm64 entrypoint specifically)

## What this does NOT catch (out of scope)

- macOS-specific concerns: HFS+/APFS case-insensitivity, BSD coreutils
edges, Docker Desktop bind-mount semantics. Most are covered at the
script level by the bats portability memory note; the rest are "would be
nice if a contributor reports a bug" rather than CI requirements.

## Cost

~25 min outer timeout. Public-repo arm64 runners have been free since
early 2025.

## Test plan

- [ ] Workflow YAML validates (verified locally)
- [ ] arm64 runner picks up the job within reasonable time (free arm64
Linux pool is much larger than macOS)
- [ ] Image pull succeeds (arm64 manifest selected automatically)
- [ ] Container reaches healthy
- [ ] HTTP smoke passes
- [ ] Bind-mount-ownership assertion passes (HOST_UID adoption working
on arm64 entrypoint)
- [ ] Teardown leaves no orphans

🤖 Generated with [Claude Code](https://claude.com/claude-code)

## Summary by CodeRabbit

* **Tests**
* Added a new arm64 smoke test job to improve validation of the app on
ARM Linux.
* The job checks service readiness, basic HTTP availability, and
confirms file ownership behaves correctly in the environment.
* Enhanced failure diagnostics to make setup issues easier to identify.
