diff --git a/plugins/orchestrator/skills/backup-plugin-db/SKILL.md b/plugins/orchestrator/skills/backup-plugin-db/SKILL.md new file mode 100644 index 0000000..c439564 --- /dev/null +++ b/plugins/orchestrator/skills/backup-plugin-db/SKILL.md @@ -0,0 +1,360 @@ +--- +name: backup-plugin-db +description: Use when setting up the orchestrator plugin on a new machine (or refreshing an existing install) and the user wants nightly point-in-time snapshots of the plugin SQLite DB. Installs a WAL-safe snapshot script and a daily scheduler (systemd-user, cron, or Windows Scheduled Task) that writes to a user-chosen destination - typically a cloud-sync folder or a path on backed-up local storage. +--- + +# Install nightly snapshots of the orchestrator plugin DB(s) + +## Overview + +The orchestrator plugin uses **two** local SQLite databases: + +| DB | Path | Holds | +|---|---|---| +| Global | `~/.claude/orchestrator/global.db` | Cross-project notes, user patterns, global state | +| Project | `/.orchestrator/project.db` | Per-project notes, decisions, work items, ADRs, embeddings | + +In a typical install most of the volume lives in the project DB(s) — one +per project where Claude is invoked. If either file is corrupted, +accidentally deleted, or the machine dies, the knowledge base goes with +it. **Both files should be backed up.** Many users discover only after +losing data that they were snapshotting the global DB and ignoring the +project DBs that hold the bulk of their knowledge. + +This skill installs a daily snapshot per DB. Run the helper once for the +global DB and once per project DB you care about. Each install gets its +own scheduler entry, its own destination subpath, and its own retention +policy. + +The snapshot pipeline is: + +- **WAL-safe**: uses SQLite's online backup API, so concurrent writes from + the running MCP server don't tear the snapshot. +- **Atomic**: writes to a tempfile in the destination dir and renames into + place, so cloud-sync clients never see a partial file. +- **User-chosen destination**: the script never invents a default. You pick + where snapshots go. Typical choices: + - A cloud-sync folder (OneDrive / Dropbox / Google Drive / iCloud / Syncthing) + - A path on a backed-up local disk + - A network share / NAS mount +- **Per-platform scheduler**: systemd-user timer on Linux/WSL/macOS-with-systemd, + cron fallback otherwise, Windows Scheduled Task on Windows. + +The snapshots themselves are safe to cloud-sync (they're frozen point-in-time +files). The **live DB at `~/.claude/orchestrator/global.db` is NOT safe to +cloud-sync** — SQLite file locking does not cross cloud-sync boundaries and +the WAL sidecar desyncs from the main file. Snapshot, don't mirror. + +## When to use + +- After installing the orchestrator plugin on a new machine +- After moving the plugin install to a new project / new home directory +- When the user wants to recover from a corrupted or deleted DB and there's + no existing snapshot strategy +- As routine hygiene if the existing schedule needs to be re-pointed at a + new destination + +## Prerequisites + +- Python 3.10+ on PATH. Verify: `python3 --version` (Linux/macOS) or `python --version` (Windows). +- **Windows note:** if `python --version` prints `Python was not found...`, + you're hitting the Microsoft Store App Execution Alias stub, not a real + Python. Install via `winget install --id Python.Python.3.12`, the + Microsoft Store *Python 3.x* app, or python.org — then open a new + PowerShell so PATH refreshes. 
+- A writable destination directory (cloud-sync folder, backed-up local + disk, NAS mount, etc.). +- Linux/macOS: either `systemctl --user` working OR `crontab` available + (helper auto-selects). + +## Steps + +### 1. Pick a destination + +Ask the user where they want snapshots written. The destination must be: + +- An existing directory (the helper will not create it) +- Writable by the current user +- NOT inside `~/.claude/orchestrator/` (the script refuses to write a + snapshot next to the live DB it's snapshotting) + +Cloud-drive examples to suggest (adapt to the user's setup): + +| Platform | Typical cloud-drive paths | +|---|---| +| Windows | `C:\Users\\OneDrive\` · `C:\Users\\Dropbox\` | +| macOS | `~/Library/CloudStorage/OneDrive-Personal/` · `~/Dropbox/` · `~/Library/Mobile Documents/com~apple~CloudDocs/` | +| Linux | `~/Dropbox/` · `~/OneDrive/` (if rclone/onedrive client mounted) · `~/Syncthing/` | +| WSL2 | `/mnt/c/Users//OneDrive/` · any path on the Windows side | + +These are suggestions only. The user is in charge of where snapshots land. + +### 2. Locate the source scripts directory + +This skill's helper scripts live in the `scripts/` subdir next to this +SKILL.md. The system message that loaded this skill displayed a +`Base directory for this skill:` header with the absolute path. Use that +path's `scripts/` subdirectory. + +If the base dir isn't surfaced, the plugin cache layout is: + +```bash +SCRIPTS_DIR=$(find ~/.claude/plugins/cache -path "*/orchestrator/*/skills/backup-plugin-db/scripts" -type d 2>/dev/null | sort | tail -1) +echo "$SCRIPTS_DIR" +``` + +### 3. Install one timer for the global DB + +**Linux / WSL / macOS (bash):** + +```bash +"$SCRIPTS_DIR/install-snapshot-timer.sh" \ + --cloud-root /path/to/destination \ + --retain-days 30 +``` + +(`--retain-days` is optional; omit to keep snapshots forever.) + +**Windows (PowerShell):** + +```powershell +& "$SCRIPTS_DIR\install-snapshot-task.ps1" ` + -CloudRoot 'C:\Users\me\OneDrive\plugin-backups' ` + -RetainDays 30 +``` + +This snapshots `~/.claude/orchestrator/global.db` daily and writes to +`//global-YYYY-MM-DD.db`. + +### 4. Install one timer per project DB you want backed up + +For each project whose `.orchestrator/project.db` you want preserved, +run the helper again with `--source` (or `-Source`) and a distinct +`--name` (or `-TaskName`). Examples: + +**Linux / WSL / macOS:** + +```bash +"$SCRIPTS_DIR/install-snapshot-timer.sh" \ + --cloud-root /path/to/destination \ + --source /home/me/repos/myproject/.orchestrator/project.db \ + --name claude-db-snapshot-myproject \ + --retain-days 90 +``` + +**Windows:** + +```powershell +& "$SCRIPTS_DIR\install-snapshot-task.ps1" ` + -CloudRoot 'C:\Users\me\OneDrive\plugin-backups' ` + -Source 'D:\repos\myproject\.orchestrator\project.db' ` + -TaskName 'Claude DB nightly (myproject)' ` + -RetainDays 90 +``` + +Each install lands at `//project-YYYY-MM-DD.db` +(the filename uses the source's stem, so multiple project DBs from +different projects WILL collide if you point them at the same +``). For multi-project setups, give each install a distinct +`--cloud-root` subdir (e.g. `/myproject/`) so the filenames +don't fight. + +**Retention** (`--retain-days N` / `-RetainDays N`): after each +snapshot, the script deletes only files matching this same source's +`-YYYY-MM-DD.db` pattern that are older than N days. Files that +don't match the pattern are never touched. Omit the flag entirely to +keep all snapshots forever. Different DBs can have different retention +windows (e.g. 
30 days for global, 90 days for a project you're actively
+working on).
+
+### 4a. Scheduler details
+
+**bash helper:** auto-detects systemd-user. If present and the user bus
+is alive, it installs `~/.config/systemd/user/<name>.{service,timer}`
+and runs `systemctl --user enable --now <name>.timer`. Otherwise it
+falls back to a user crontab entry tagged with the chosen `--name`
+(idempotent — re-running replaces the prior entry).
+
+**WSL2 / headless Linux note:** systemd-user timers stop firing when
+the user logs out unless linger is enabled. The helper prints a warning
+if linger is off. To enable:
+
+```bash
+sudo loginctl enable-linger $(id -un)
+```
+
+**Windows helper:** registers a Scheduled Task with the given
+`-TaskName`. It runs as the current user when logged in. Snapshots are
+skipped on days when the user never logs in (the helper output shows
+how to switch to `LogonType S4U` if always-run is required).
+
+### 5. Verify the install
+
+**One-off snapshot (sanity check):**
+
+```bash
+# bash
+python3 "$SCRIPTS_DIR/snapshot-plugin-db.py" --cloud-root /path/to/destination
+```
+
+```powershell
+# PowerShell (plain python, not pyw, so the output below is visible)
+& python "$SCRIPTS_DIR\snapshot-plugin-db.py" --cloud-root 'C:\path\to\destination'
+```
+
+Expected output (paths will differ):
+
+```
+[snapshot-plugin-db] source: /home/<you>/.claude/orchestrator/global.db
+[snapshot-plugin-db] dest:   /path/to/destination/<hostname>/global-YYYY-MM-DD.db
+[snapshot-plugin-db] wrote 475,136 bytes
+```
+
+**Confirm the scheduler is armed:**
+
+```bash
+# systemd-user
+systemctl --user list-timers claude-orchestrator-db-snapshot.timer
+
+# cron
+crontab -l | grep claude-orchestrator-db-snapshot
+```
+
+```powershell
+# Windows
+Get-ScheduledTask -TaskName 'Claude orchestrator DB snapshot' | Get-ScheduledTaskInfo
+```
+
+### 6. (Optional) Run a read-back drill
+
+A snapshot is only useful if it's restorable. Open the snapshot read-only
+and confirm it has the expected shape:
+
+```bash
+sqlite3 -readonly /path/to/destination/<hostname>/global-YYYY-MM-DD.db \
+  "PRAGMA integrity_check; SELECT count(*) FROM notes;"
+```
+
+`integrity_check` should print `ok`. The note count should be non-zero and
+close to what `lookup` reports against the live DB.
+
+For a full destructive restore drill, see "Destructive restore drill" below.
+
+### 7. (Optional) Wire up failure alerting
+
+By default, a failing snapshot is visible but not loud:
+
+- **systemd-user:** `systemctl --user --failed` lists it; `journalctl --user -u <name>.service` has the traceback.
+- **cron:** stderr lands wherever your MTA delivers the crontab user's mail (typically the local per-user spool); with no MTA installed it is silently dropped.
+- **Windows Scheduled Task:** the task's `LastTaskResult` is non-zero; Event Viewer → Task Scheduler logs the run.
+
+The shipped units stay narrow on purpose — alerting is opinionated and per-environment. If you already run a failure handler, attach it without editing the shipped unit:
+
+```bash
+# systemd-user: drop-in override (replace <name> with your --name slug,
+# and <alert>@.service with your own OnFailure template unit).
+mkdir -p ~/.config/systemd/user/<name>.service.d
+cat > ~/.config/systemd/user/<name>.service.d/onfailure.conf <<'EOF'
+[Unit]
+OnFailure=<alert>@%n.service
+EOF
+systemctl --user daemon-reload
+```
+
+```powershell
+# Windows: attach a follow-up action to the task without re-registering it.
+$task = Get-ScheduledTask -TaskName '<TaskName>'
+$task.Settings.RestartCount = 1
+$task.Settings.RestartInterval = 'PT1M'
+Set-ScheduledTask -InputObject $task
+# For richer alerting, layer a second Scheduled Task triggered by the
+# "Task failed to start" or "Task completed with non-zero result" event.
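+# One possible sketch of that second, event-triggered task using schtasks
+# (the task name and alert command are illustrative; event ID 101 is "Task
+# Start Failed" in the Task Scheduler operational log, which may need to be
+# enabled first, and this trigger fires for ANY failed task start unless you
+# tighten the XPath filter):
+#
+#   wevtutil set-log Microsoft-Windows-TaskScheduler/Operational /enabled:true
+#   schtasks /Create /TN "Claude DB snapshot failure alert" `
+#     /SC ONEVENT /EC Microsoft-Windows-TaskScheduler/Operational `
+#     /MO "*[System[(EventID=101)]]" `
+#     /TR "msg * 'Claude orchestrator DB snapshot failed to start'"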
+```
+
+## Quick reference
+
+| Step | What | Where |
+|---|---|---|
+| 1 | Pick a destination directory (user's choice) | `<dest>` |
+| 2 | Locate scripts dir | `<skill base dir>/scripts/` |
+| 3 | Install timer for global DB (`--retain-days` optional) | systemd-user / cron / Task Scheduler |
+| 4 | Install one timer per project DB (`--source` + distinct `--name`) | same |
+| 5 | One-off `snapshot-plugin-db.py --cloud-root <dest> [--source ...]` | Confirms script runs |
+| 6 | `sqlite3 "PRAGMA integrity_check"` (or Python) | Confirms snapshot is valid |
+
+## Common mistakes
+
+- **Backing up only the global DB.** This is the most common omission.
+  Most volume lives in `<project>/.orchestrator/project.db`. Run the
+  helper a second time with `--source <project DB path>` and a distinct
+  `--name`.
+- **Pointing `--cloud-root` at `~/.claude/orchestrator/`** (or anywhere
+  under it). The script refuses; pick a destination outside the source
+  DB's directory.
+- **Collisions between project DBs in the same destination.** Two
+  different projects both named `project.db` will both produce
+  `project-YYYY-MM-DD.db` filenames. Either use distinct `--cloud-root`
+  subdirs per project, or rename one source DB before snapshotting.
+- **Cloud-syncing the live DB itself.** Don't. SQLite locking does not
+  cross sync boundaries; the WAL sidecar desyncs from the main file; you'll
+  end up with `*-conflict-*.db` files and corrupted state.
+  Snapshot to the cloud; never mirror the live file.
+- **Forgetting `loginctl enable-linger` on WSL/server.** Without linger,
+  systemd-user timers stop when you log out. The helper warns but doesn't
+  enable it (requires sudo).
+- **Re-running with a different `--name` and expecting the old timer to be
+  cleaned up.** The helper's idempotency is keyed on the name. To replace
+  the destination of an existing schedule, re-run with the same `--name`
+  and the new `--cloud-root`. To run multiple schedules in parallel
+  (e.g. two destinations), use distinct names.
+- **Skipping verification.** Run step 5 immediately after install. A
+  scheduled task that fires every night and silently writes to nowhere is
+  the worst outcome.
+
+## Destructive restore drill (optional, recommended quarterly)
+
+The read-back check in step 6 confirms the snapshot is valid SQLite. To
+prove it's a working restoration, swap it in:
+
+```bash
+# 1. Stop the running orchestrator MCP (close all Claude Code sessions
+#    that have the plugin loaded, or kill the MCP process directly).
+# 2. Move the live DB aside.
+mv ~/.claude/orchestrator/global.db{,.predrill-backup}
+# 3. Restore from the latest snapshot.
+cp /path/to/destination/<hostname>/global-YYYY-MM-DD.db ~/.claude/orchestrator/global.db
+# 4. Start a new Claude Code session; it will spin up the MCP fresh.
+# 5. Confirm via `lookup` / briefing that the expected notes are present.
+# 6. If happy:  rm ~/.claude/orchestrator/global.db.predrill-backup
+#    If not:    mv ~/.claude/orchestrator/global.db{.predrill-backup,}
+```
+
+The orchestrator MCP server runs schema migrations on startup, so a
+snapshot from an older plugin version restores cleanly under a newer
+plugin version.
+
+## Notes
+
+- **Per-machine by design.** Each machine maintains its own snapshot
+  schedule writing to its own `<hostname>/` subdir. SQLite is not a
+  multi-machine sync target.
+- **Retention is per-install.** Pass `--retain-days N` to delete only
+  this source's `<stem>-YYYY-MM-DD.db` snapshots older than N days. Files
+  that don't match the pattern are never touched. Omit the flag to keep
+  forever.
Different DBs can have different retention windows by passing + different values at install time. Same-day re-runs overwrite the + current day's file atomically (the snapshot writes to a tempfile and + atomic-renames into place). +- **Source override.** `--source ` works if you've relocated the + plugin DB or want to snapshot a non-default install. +- **Same-day catch-up.** `Persistent=true` on the systemd timer (and + `-StartWhenAvailable` on the Windows task) means a missed run fires on + next boot. The atomic write keeps any pre-existing same-day snapshot + safe until the new one is fully written. +- **`--flat`** drops the `/` subdir. Useful only if you really + intend to mingle snapshots from multiple machines in one folder; not + recommended. +- **Cleanup.** To uninstall, disable the timer (`systemctl --user disable + --now .timer`) and remove the unit files, drop the crontab entry, + or `Unregister-ScheduledTask -TaskName '' -Confirm:$false` on + Windows. diff --git a/plugins/orchestrator/skills/backup-plugin-db/scripts/install-snapshot-task.ps1 b/plugins/orchestrator/skills/backup-plugin-db/scripts/install-snapshot-task.ps1 new file mode 100644 index 0000000..fb65430 --- /dev/null +++ b/plugins/orchestrator/skills/backup-plugin-db/scripts/install-snapshot-task.ps1 @@ -0,0 +1,161 @@ +<# +.SYNOPSIS +Register a Windows Scheduled Task that snapshots an orchestrator plugin DB nightly. + +.DESCRIPTION +Idempotent on -TaskName: re-running with the same name replaces the prior task. +The task runs the bundled snapshot-plugin-db.py via pyw.exe (windowless) when +present, falling back to python.exe. Runs as the current user when logged in; +snapshots are skipped on days when the user never logs in. + +Run this helper once per DB you want backed up. The plugin uses two DBs: + 1. ~/.claude/orchestrator/global.db (default -Source) + 2. /.orchestrator/project.db (pass -Source explicitly) +Give each install a distinct -TaskName so they don't replace each other. + +.PARAMETER CloudRoot +Destination directory for snapshots. Must exist. Required. + +.PARAMETER Source +Source DB file. Default: '~/.claude/orchestrator/global.db' (resolved at runtime +on Windows via $env:USERPROFILE). Override to back up a project DB. + +.PARAMETER Time +Daily snapshot time, HH:mm, local TZ. Default: 04:07. + +.PARAMETER TaskName +Scheduled Task display name. Default: "Claude orchestrator DB snapshot". + +.PARAMETER RetainDays +Optional. After each run, delete snapshots for this same source older than N +days. Only files matching the source's -YYYY-MM-DD.db pattern in the +destination directory are eligible. Omit to keep all snapshots forever. 
+ +.EXAMPLE +.\install-snapshot-task.ps1 -CloudRoot 'C:\Users\me\OneDrive\plugin-backups' + +.EXAMPLE +.\install-snapshot-task.ps1 -CloudRoot 'D:\backups\claude' ` + -Source 'D:\repos\myproj\.orchestrator\project.db' ` + -TaskName 'Claude DB nightly (myproj)' -RetainDays 30 +#> +[CmdletBinding()] +param( + [Parameter(Mandatory = $true)] + [string] $CloudRoot, + + [Parameter(Mandatory = $false)] + [string] $Source = '', + + [Parameter(Mandatory = $false)] + [ValidatePattern('^([01][0-9]|2[0-3]):[0-5][0-9]$')] + [string] $Time = '04:07', + + [Parameter(Mandatory = $false)] + [string] $TaskName = 'Claude orchestrator DB snapshot', + + [Parameter(Mandatory = $false)] + [ValidateRange(1, 36500)] + [int] $RetainDays = 0 +) + +$ErrorActionPreference = 'Stop' + +$scriptDir = $PSScriptRoot +$pyScript = Join-Path $scriptDir 'snapshot-plugin-db.py' + +if (-not (Test-Path -LiteralPath $pyScript -PathType Leaf)) { + throw "snapshot script not found next to this helper: $pyScript" +} + +$resolvedCloudRoot = Resolve-Path -LiteralPath $CloudRoot -ErrorAction SilentlyContinue +if (-not $resolvedCloudRoot -or -not (Test-Path -LiteralPath $resolvedCloudRoot -PathType Container)) { + throw "-CloudRoot does not exist or is not a directory: $CloudRoot" +} +$resolvedCloudRoot = $resolvedCloudRoot.Path + +# Prefer pyw.exe (no console window). Fall back to python.exe. +$pyExe = (Get-Command pyw.exe -ErrorAction SilentlyContinue).Source +if (-not $pyExe) { + $pyExe = (Get-Command python.exe -ErrorAction SilentlyContinue).Source +} +if (-not $pyExe) { + throw "Neither pyw.exe nor python.exe found in PATH. Install Python 3.8+ and retry." +} + +$resolvedSource = '' +if ($Source) { + $sourcePath = Resolve-Path -LiteralPath $Source -ErrorAction SilentlyContinue + if (-not $sourcePath -or -not (Test-Path -LiteralPath $sourcePath -PathType Leaf)) { + throw "-Source does not exist or is not a file: $Source" + } + $resolvedSource = $sourcePath.Path +} + +Write-Host "[install-snapshot-task] python: $pyExe" +Write-Host "[install-snapshot-task] script: $pyScript" +Write-Host "[install-snapshot-task] cloud-root: $resolvedCloudRoot" +Write-Host "[install-snapshot-task] source: $(if ($resolvedSource) { $resolvedSource } else { '' })" +Write-Host "[install-snapshot-task] time: $Time" +Write-Host "[install-snapshot-task] task name: $TaskName" +Write-Host "[install-snapshot-task] retain-days: $(if ($RetainDays -gt 0) { $RetainDays } else { '' })" +Write-Host "" + +# Quote each argument that may contain spaces. +$pyArgs = @( + '"{0}"' -f $pyScript + '--cloud-root' + '"{0}"' -f $resolvedCloudRoot +) +if ($resolvedSource) { + $pyArgs += '--source' + $pyArgs += '"{0}"' -f $resolvedSource +} +if ($RetainDays -gt 0) { + $pyArgs += '--retain-days' + $pyArgs += $RetainDays.ToString() +} +$argList = $pyArgs -join ' ' + +$action = New-ScheduledTaskAction -Execute $pyExe -Argument $argList +$trigger = New-ScheduledTaskTrigger -Daily -At $Time +$settings = New-ScheduledTaskSettingsSet ` + -StartWhenAvailable ` + -AllowStartIfOnBatteries ` + -DontStopIfGoingOnBatteries ` + -MultipleInstances IgnoreNew +$principal = New-ScheduledTaskPrincipal ` + -UserId ([System.Security.Principal.WindowsIdentity]::GetCurrent().Name) ` + -LogonType Interactive ` + -RunLevel Limited + +# Replace if already registered. 
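+# (Unregister-then-Register, rather than Set-ScheduledTask, so a prior install's
+# trigger, settings, and principal are replaced wholesale instead of merged.)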
+$existing = Get-ScheduledTask -TaskName $TaskName -ErrorAction SilentlyContinue +if ($existing) { + Unregister-ScheduledTask -TaskName $TaskName -Confirm:$false +} + +Register-ScheduledTask ` + -TaskName $TaskName ` + -Action $action ` + -Trigger $trigger ` + -Settings $settings ` + -Principal $principal ` + -Description 'Nightly snapshot of the Claude orchestrator plugin DB.' | Out-Null + +Write-Host "Registered scheduled task: $TaskName" +Write-Host "" +Write-Host "Verify with:" +Write-Host " Get-ScheduledTask -TaskName '$TaskName'" +Write-Host " Get-ScheduledTaskInfo -TaskName '$TaskName'" +Write-Host "" +Write-Host "One-off verification (run the snapshot now):" +$verifyArgs = @("'$pyScript'", "--cloud-root", "'$resolvedCloudRoot'") +if ($resolvedSource) { $verifyArgs += @('--source', "'$resolvedSource'") } +if ($RetainDays -gt 0) { $verifyArgs += @('--retain-days', $RetainDays.ToString()) } +Write-Host " & '$pyExe' $($verifyArgs -join ' ')" +Write-Host "" +Write-Host "Note: this task runs only while you are logged in. Snapshots are skipped on" +Write-Host " days when you never log in. For run-while-logged-out behavior, switch" +Write-Host " the principal to -LogonType S4U (requires no stored password but only" +Write-Host " works on domain-joined or appropriately-configured machines)." diff --git a/plugins/orchestrator/skills/backup-plugin-db/scripts/install-snapshot-timer.sh b/plugins/orchestrator/skills/backup-plugin-db/scripts/install-snapshot-timer.sh new file mode 100755 index 0000000..4f7bd8a --- /dev/null +++ b/plugins/orchestrator/skills/backup-plugin-db/scripts/install-snapshot-timer.sh @@ -0,0 +1,239 @@ +#!/usr/bin/env bash +# +# Install a nightly snapshot timer for an orchestrator plugin SQLite DB. +# +# Detects systemd-user; falls back to cron if unavailable. Idempotent on +# --name: re-running with the same name replaces the prior install. +# +# Run this helper once per DB you want backed up. The plugin uses two DBs: +# 1. ~/.claude/orchestrator/global.db (default --source) +# 2. /.orchestrator/project.db (pass --source explicitly) +# Give each install a distinct --name so they don't replace each other. +# +# Usage: +# ./install-snapshot-timer.sh --cloud-root /path/to/destination +# [--source /path/to/db] +# [--time HH:MM] +# [--name ] +# [--retain-days N] +# +# Required: +# --cloud-root PATH Destination directory for snapshots. +# +# Optional: +# --source PATH Source DB. Default: ~/.claude/orchestrator/global.db. +# --time HH:MM Daily snapshot time, local TZ. Default: 04:07. +# --name SLUG Unit / cron tag name. Default: claude-orchestrator-db-snapshot. +# --retain-days N Delete snapshots older than N days (matching the same +# source's filename pattern) after each run. Default: +# keep forever. +# --help Show this help and exit. + +set -euo pipefail + +SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" +PYSCRIPT="$SCRIPT_DIR/snapshot-plugin-db.py" + +CLOUD_ROOT="" +SOURCE="" +TIME="04:07" +NAME="claude-orchestrator-db-snapshot" +RETAIN_DAYS="" + +usage() { + # Print the leading comment block of this file as the help text. 
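+  # (Skips the shebang, strips the leading "# " from each comment line, and
+  # stops at the first line that is not a comment.)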
+  awk '
+    NR == 1 { next }
+    /^#/    { sub(/^# ?/, ""); print; next }
+            { exit }
+  ' "${BASH_SOURCE[0]}"
+}
+
+while [[ $# -gt 0 ]]; do
+  case "$1" in
+    --cloud-root)  CLOUD_ROOT="$2"; shift 2 ;;
+    --source)      SOURCE="$2"; shift 2 ;;
+    --time)        TIME="$2"; shift 2 ;;
+    --name)        NAME="$2"; shift 2 ;;
+    --retain-days) RETAIN_DAYS="$2"; shift 2 ;;
+    --help|-h)     usage; exit 0 ;;
+    *) echo "unknown arg: $1" >&2; usage; exit 2 ;;
+  esac
+done
+
+if [[ -n "$RETAIN_DAYS" ]] && ! [[ "$RETAIN_DAYS" =~ ^[1-9][0-9]*$ ]]; then
+  echo "error: --retain-days must be a positive integer. Got: $RETAIN_DAYS" >&2
+  exit 2
+fi
+
+if [[ -z "$CLOUD_ROOT" ]]; then
+  echo "error: --cloud-root is required." >&2
+  usage
+  exit 2
+fi
+
+if [[ ! -d "$CLOUD_ROOT" ]]; then
+  echo "error: --cloud-root does not exist or is not a directory: $CLOUD_ROOT" >&2
+  exit 2
+fi
+
+if [[ ! "$TIME" =~ ^([01][0-9]|2[0-3]):[0-5][0-9]$ ]]; then
+  echo "error: --time must be HH:MM (24h, zero-padded). Got: $TIME" >&2
+  exit 2
+fi
+
+if [[ ! -f "$PYSCRIPT" ]]; then
+  echo "error: snapshot script not found next to this helper: $PYSCRIPT" >&2
+  exit 1
+fi
+chmod 0755 "$PYSCRIPT"
+
+PYTHON_BIN="$(command -v python3 || true)"
+if [[ -z "$PYTHON_BIN" ]]; then
+  echo "error: python3 not found in PATH. Install Python 3.8+ and retry." >&2
+  exit 1
+fi
+
+CLOUD_ROOT_ABS="$(cd "$CLOUD_ROOT" && pwd)"
+
+EXEC_ARGS=("$PYSCRIPT" "--cloud-root" "$CLOUD_ROOT_ABS")
+if [[ -n "$SOURCE" ]]; then
+  if [[ ! -f "$SOURCE" ]]; then
+    echo "error: --source does not exist or is not a file: $SOURCE" >&2
+    exit 2
+  fi
+  SOURCE_ABS="$(cd "$(dirname "$SOURCE")" && pwd)/$(basename "$SOURCE")"
+  EXEC_ARGS+=("--source" "$SOURCE_ABS")
+fi
+if [[ -n "$RETAIN_DAYS" ]]; then
+  EXEC_ARGS+=("--retain-days" "$RETAIN_DAYS")
+fi
+
+have_systemd_user() {
+  command -v systemctl >/dev/null 2>&1 || return 1
+  systemctl --user --version >/dev/null 2>&1 || return 1
+  # systemd-user can be installed but disabled (no user-bus); check that too.
+  [[ -S "${XDG_RUNTIME_DIR:-/run/user/$(id -u)}/bus" ]]
+}
+
+shell_quote() {
+  # Single-quote-wrap a single argument for safe use in a systemd ExecStart
+  # or a shell command line.
+  printf "'%s'" "${1//\'/\'\\\'\'}"
+}
+
+install_systemd() {
+  local unit_dir="${XDG_CONFIG_HOME:-$HOME/.config}/systemd/user"
+  local service="$unit_dir/${NAME}.service"
+  local timer="$unit_dir/${NAME}.timer"
+
+  mkdir -p "$unit_dir"
+
+  local exec_line="$PYTHON_BIN"
+  for arg in "${EXEC_ARGS[@]}"; do
+    exec_line="$exec_line $(shell_quote "$arg")"
+  done
+
+  # Minimal oneshot service + daily timer. Persistent=true lets a run that
+  # was missed (machine off, user logged out) fire on the next activation.
+  cat > "$service" <<EOF
+[Unit]
+Description=Claude orchestrator plugin DB snapshot (${NAME})
+
+[Service]
+Type=oneshot
+ExecStart=${exec_line}
+EOF
+
+  cat > "$timer" <<EOF
+[Unit]
+Description=Daily timer for ${NAME}
+
+[Timer]
+OnCalendar=*-*-* ${TIME}:00
+Persistent=true
+
+[Install]
+WantedBy=timers.target
+EOF
+
+  systemctl --user daemon-reload
+  systemctl --user enable --now "${NAME}.timer"
+
+  echo
+  echo "Installed systemd-user units:"
+  echo "  $service"
+  echo "  $timer"
+  echo
+  echo "Verify with:"
+  echo "  systemctl --user list-timers ${NAME}.timer"
+
+  if command -v loginctl >/dev/null 2>&1; then
+    if ! loginctl show-user "$(id -un)" 2>/dev/null | grep -q '^Linger=yes'; then
+      echo "Note: linger is OFF for this user. Timers stop when you log out."
+      echo "      Enable with: sudo loginctl enable-linger $(id -un)"
+      echo "      (Especially relevant on WSL2 and headless servers.)"
+    fi
+  fi
+}
+
+install_cron() {
+  local marker="# managed by install-snapshot-timer.sh: ${NAME}"
+  local minute="${TIME#*:}"
+  local hour="${TIME%:*}"
+  # Strip leading zeros for cron (it accepts them but some implementations
+  # have warned historically; safer to feed plain decimals).
+  minute=$((10#$minute))
+  hour=$((10#$hour))
+  local cmd="$PYTHON_BIN"
+  for arg in "${EXEC_ARGS[@]}"; do
+    cmd="$cmd $(shell_quote "$arg")"
+  done
+  local entry="${minute} ${hour} * * * ${cmd} ${marker}"
+
+  local current new
+  current="$(crontab -l 2>/dev/null || true)"
+  # Drop any prior entry with our marker; append the new one.
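+  # (grep -v -F matches the marker literally, and the trailing "|| true" keeps
+  # the substitution from aborting under `set -euo pipefail` when grep selects
+  # nothing, e.g. an empty crontab or one whose only entry is ours.)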
+ new="$(printf '%s\n' "$current" | grep -v -F "$marker" || true)" + new="${new%$'\n'}" + if [[ -n "$new" ]]; then + new="${new}"$'\n'"${entry}" + else + new="${entry}" + fi + printf '%s\n' "$new" | crontab - + + echo + echo "Installed cron entry (marker: ${NAME}):" + echo " ${entry}" + echo + echo "Verify with:" + echo " crontab -l | grep ${NAME}" +} + +echo "[install-snapshot-timer] python: $PYTHON_BIN" +echo "[install-snapshot-timer] script: $PYSCRIPT" +echo "[install-snapshot-timer] cloud-root: $CLOUD_ROOT_ABS" +echo "[install-snapshot-timer] source: ${SOURCE:-}" +echo "[install-snapshot-timer] time: $TIME" +echo "[install-snapshot-timer] name: $NAME" +echo "[install-snapshot-timer] retain-days: ${RETAIN_DAYS:-}" +echo + +if have_systemd_user; then + echo "[install-snapshot-timer] scheduler: systemd-user" + install_systemd +else + echo "[install-snapshot-timer] scheduler: cron (no systemd-user available)" + install_cron +fi + +verify_cmd="$PYTHON_BIN" +for arg in "${EXEC_ARGS[@]}"; do + verify_cmd="$verify_cmd $(shell_quote "$arg")" +done +echo +echo "One-off verification (run the snapshot now):" +echo " $verify_cmd" diff --git a/plugins/orchestrator/skills/backup-plugin-db/scripts/snapshot-plugin-db.py b/plugins/orchestrator/skills/backup-plugin-db/scripts/snapshot-plugin-db.py new file mode 100755 index 0000000..640a53a --- /dev/null +++ b/plugins/orchestrator/skills/backup-plugin-db/scripts/snapshot-plugin-db.py @@ -0,0 +1,235 @@ +#!/usr/bin/env python3 +"""Take a WAL-safe, point-in-time snapshot of an orchestrator plugin SQLite DB. + +The orchestrator plugin uses TWO local SQLite databases: + + ~/.claude/orchestrator/global.db (cross-project) + /.orchestrator/project.db (per-project) + +Run this script once per DB you want backed up - one invocation snapshots +one source file via SQLite's online backup API. Default --source is the +global DB. Override with --source to snapshot a project DB. + +The snapshot lands at: + + //-YYYY-MM-DD.db + +where is the source filename without extension (e.g. +"global" or "project"). This keeps separate-DB snapshots from colliding +when they share a destination directory. + +The destination is entirely the user's choice. No defaults are inferred. +The script hard-fails unless --cloud-root is given or +$CLAUDE_ORCHESTRATOR_BACKUP_ROOT is set in the environment. Typical +destinations are a cloud-sync folder (OneDrive, Dropbox, Google Drive, +iCloud, Syncthing) or a path on backed-up local storage. See SKILL.md. + +WAL-safety: a naive file copy of the live DB risks copying the main file +while transactions still live in the -wal sidecar (the plugin keeps a +multi-megabyte WAL during normal operation). The online backup API +iterates pages under SQLite's own locking, so concurrent writes from the +running orchestrator MCP server are safe. + +The snapshot is written to a tempfile in the destination directory, then +atomically renamed into place. This prevents cloud-sync clients from +picking up a partial write, and prevents same-day catch-up runs from +corrupting an existing valid snapshot mid-write. + +Retention: --retain-days N (optional) deletes snapshots in the same +destination directory matching the same source-stem pattern that are +older than N days. Files that do not match the exact pattern are never +touched. Omit --retain-days to keep all snapshots forever. + +Exit non-zero on any failure (useful for systemd OnFailure= and Windows +Task Scheduler's "if the task fails" handlers). 
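+
+Typical invocations (paths are illustrative, not defaults):
+
+    python3 snapshot-plugin-db.py --cloud-root ~/Dropbox/claude-backups
+    python3 snapshot-plugin-db.py --cloud-root ~/Dropbox/claude-backups \
+        --source ~/repos/myproject/.orchestrator/project.db --retain-days 90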
+""" + +from __future__ import annotations + +import argparse +import datetime +import os +import re +import socket +import sqlite3 +import sys +from pathlib import Path + + +DEFAULT_SOURCE = Path.home() / ".claude" / "orchestrator" / "global.db" +ENV_CLOUD_ROOT = "CLAUDE_ORCHESTRATOR_BACKUP_ROOT" + + +def resolve_cloud_root(override: Path | None) -> Path: + """Pick the destination root from --cloud-root, env, or fail explicitly.""" + if override is not None: + root = override + else: + env_val = os.environ.get(ENV_CLOUD_ROOT) + if not env_val: + raise SystemExit( + f"no backup destination configured. " + f"Pass --cloud-root or set ${ENV_CLOUD_ROOT}." + ) + root = Path(env_val) + if not root.is_dir(): + raise SystemExit( + f"cloud root not found or not a directory: {root}. " + f"Create it first (or fix ${ENV_CLOUD_ROOT})." + ) + return root + + +def build_dest_path( + cloud_root: Path, + hostname: str | None, + source_stem: str, + date: datetime.date, +) -> Path: + """Compose the snapshot destination path under cloud_root.""" + filename = f"{source_stem}-{date.isoformat()}.db" + if hostname: + return cloud_root / hostname / filename + return cloud_root / filename + + +def _assert_dest_outside_source(source: Path, dest: Path) -> None: + """Refuse to write the snapshot inside the source DB's own directory. + + Catches the footgun of pointing --cloud-root at ~/.claude/orchestrator (or + a parent of it) - the snapshot would land next to the live DB and could + be picked up as a competing SQLite file by the running MCP server. + """ + src_dir = source.resolve().parent + dest_dir = dest.resolve().parent + try: + dest_dir.relative_to(src_dir) + except ValueError: + return # dest_dir is NOT under src_dir, which is what we want + raise SystemExit( + f"refusing to write snapshot inside the source DB's directory: " + f"{dest_dir} is under {src_dir}. " + f"Pick a destination outside the source DB's directory." + ) + + +def snapshot(source: Path, dest: Path) -> None: + """Write a WAL-safe, atomic point-in-time copy of source to dest.""" + if not source.is_file(): + raise SystemExit(f"source DB not found: {source}") + _assert_dest_outside_source(source, dest) + dest.parent.mkdir(parents=True, exist_ok=True) + tmp = dest.with_name(f"{dest.name}.tmp.{os.getpid()}") + src = sqlite3.connect(f"file:{source}?mode=ro", uri=True) + try: + dst = sqlite3.connect(tmp) + try: + src.backup(dst) + finally: + dst.close() + finally: + src.close() + os.replace(tmp, dest) + + +def prune_old(dest_dir: Path, source_stem: str, retain_days: int, today: datetime.date) -> list[Path]: + """Delete snapshots in dest_dir matching -YYYY-MM-DD.db older than retain_days. + + Returns the list of files deleted. Files that do not match the exact + pattern are never touched. retain_days must be >= 1. 
+ """ + if retain_days < 1: + raise SystemExit(f"--retain-days must be >= 1, got {retain_days}") + pattern = re.compile(rf"^{re.escape(source_stem)}-(\d{{4}}-\d{{2}}-\d{{2}})\.db$") + cutoff = today - datetime.timedelta(days=retain_days) + deleted: list[Path] = [] + if not dest_dir.is_dir(): + return deleted + for entry in dest_dir.iterdir(): + if not entry.is_file(): + continue + m = pattern.match(entry.name) + if not m: + continue + try: + file_date = datetime.date.fromisoformat(m.group(1)) + except ValueError: + continue + if file_date < cutoff: + entry.unlink() + deleted.append(entry) + return deleted + + +def main(argv: list[str] | None = None) -> int: + """Parse args, take the snapshot, optionally prune older snapshots.""" + p = argparse.ArgumentParser(description=__doc__.splitlines()[0]) + p.add_argument( + "--source", + type=Path, + default=DEFAULT_SOURCE, + help=f"source DB path (default: {DEFAULT_SOURCE})", + ) + p.add_argument( + "--cloud-root", + type=Path, + default=None, + help=( + f"destination root directory. Required unless ${ENV_CLOUD_ROOT} is set. " + "No default; pick whatever path you want snapshots written to " + "(typically a cloud-sync folder or backed-up local path)." + ), + ) + p.add_argument( + "--hostname", + default=socket.gethostname(), + help="hostname segment for the destination subdir (default: this machine's hostname)", + ) + p.add_argument( + "--flat", + action="store_true", + help="omit the / subdir; write directly under --cloud-root", + ) + p.add_argument( + "--date", + type=datetime.date.fromisoformat, + default=None, + help="override date for the filename (default: today, local TZ)", + ) + p.add_argument( + "--retain-days", + type=int, + default=None, + help=( + "after writing the snapshot, delete snapshots for this same source " + "older than N days. Only files matching -YYYY-MM-DD.db in the " + "destination directory are eligible. Omit to keep all snapshots forever." + ), + ) + args = p.parse_args(argv) + + cloud_root = resolve_cloud_root(args.cloud_root) + date = args.date if args.date else datetime.date.today() + hostname = None if args.flat else args.hostname + source_stem = args.source.stem + dest = build_dest_path(cloud_root, hostname, source_stem, date) + + print(f"[snapshot-plugin-db] source: {args.source}") + print(f"[snapshot-plugin-db] dest: {dest}") + snapshot(args.source, dest) + size = dest.stat().st_size + print(f"[snapshot-plugin-db] wrote {size:,} bytes") + + if args.retain_days is not None: + deleted = prune_old(dest.parent, source_stem, args.retain_days, date) + if deleted: + print(f"[snapshot-plugin-db] pruned {len(deleted)} snapshot(s) older than {args.retain_days} days:") + for path in deleted: + print(f" - {path.name}") + else: + print(f"[snapshot-plugin-db] retention: nothing to prune (keep-days={args.retain_days})") + return 0 + + +if __name__ == "__main__": + sys.exit(main())