a3s-box is daemonless: a detached box (run -d) is supervised by a separate
a3s-box monitor process that polls boxes.json, restarts dead boxes per their
--restart policy, and acts on health checks (with exponential backoff).
On its own the monitor is just a foreground process. If it isn't running,
detached boxes silently lose restart + health supervision — nothing restarts a
crashed box, and status can stay starting. Running it by hand
(a3s-box monitor &) works until that shell or process dies.
a3s-box monitor --install # default 5s poll
a3s-box monitor --install --interval 10This installs and enables a per-user service — no root required:
| OS | Mechanism | Unit |
|---|---|---|
| Linux | systemd --user |
~/.config/systemd/user/a3s-box-monitor.service |
| macOS | launchd LaunchAgent | ~/Library/LaunchAgents/com.a3s-box.monitor.plist |
The OS keeps the monitor running and restarts it if it crashes
(Restart=always / KeepAlive). The unit's ExecStart points at the absolute
path of the a3s-box binary you ran --install with.
systemctl --user services stop when your login session ends. For a server that
should supervise boxes without an active login, enable lingering once:
loginctl enable-linger "$USER"a3s-box monitor --uninstallIf automatic enable/load fails (e.g. systemd/launchctl unavailable in your
environment), --install still writes the unit file and prints the manual
systemctl --user enable --now … / launchctl load -w … command to finish.
The monitor is the one always-on process and already polls every box's state,
so it can also export operator-scrapable observability. Pass --metrics-addr:
a3s-box monitor --metrics-addr 127.0.0.1:9100It then serves (plain HTTP, off by default, bind loopback — there is no auth):
| Path | Response |
|---|---|
GET /healthz |
200 ok while the poll loop is alive; 503 if it has not completed a poll within the staleness threshold (3 × interval, min 30s) — so a hung supervision loop fails the probe instead of lying |
GET /metrics |
Prometheus text of box-state + monitor-liveness metrics |
Exported metrics (read fresh from state per scrape, no side effects):
a3s_box_total # boxes tracked
a3s_box_state{status="running"|"paused"|"dead"|"created"|"other"}
a3s_box_restarts_total # sum of per-box restart counts
a3s_box_health{status="healthy"|"unhealthy"}
a3s_box_monitor_up # 1 if the poll loop polled within the threshold
a3s_box_monitor_seconds_since_last_poll
To run the metrics endpoint under the installed service, append the flag to the
unit's ExecStart/ProgramArguments (or re-run --install after we wire a
--metrics-addr passthrough). A Prometheus scrape config simply targets
127.0.0.1:9100.