tangle-network · github-actions · Jun 23, 2026
diff --git a/.changeset/credential-aware-default-provider.md b/.changeset/credential-aware-default-provider.md
diff --git a/.changeset/design-audit-content-fidelity.md b/.changeset/design-audit-content-fidelity.md
diff --git a/.changeset/job-first-redesign-engine.md b/.changeset/job-first-redesign-engine.md
diff --git a/.changeset/refgen-reasoning-token-headroom.md b/.changeset/refgen-reasoning-token-headroom.md
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -1,5 +1,25 @@
 # @tangle-network/browser-agent-driver
 
+## 0.35.0
+
+### Minor Changes
+
+- [#122](https://github.com/tangle-network/browser-agent-driver/pull/122) [`b0f74a4`](https://github.com/tangle-network/browser-agent-driver/commit/b0f74a4f91a04517d988e95aa95ed0509bd2a26e) Thanks [@drewstone](https://github.com/drewstone)! - The default provider is now credential-aware instead of a hard-coded `openai`. A bare run (no `--provider`/`--model`, no config-file provider) uses OpenAI when `OPENAI_API_KEY` is set (unchanged for existing users and CI) and otherwise falls back to an available provider (claude-code, which needs no key) rather than failing on a missing OpenAI key. An explicit provider in CLI flags or a config file is always honored, and the default model maps per-provider as before (e.g. gpt-5.4 → sonnet for claude-code). This removes the last place the no-flag path assumed OpenAI; the engine already supported openai/anthropic/google/claude-code/zai for both text and vision.
+
+- [#124](https://github.com/tangle-network/browser-agent-driver/pull/124) [`a2055b2`](https://github.com/tangle-network/browser-agent-driver/commit/a2055b2e7a7e68726ceb8b8c5cdaf92ca3215b06) Thanks [@drewstone](https://github.com/drewstone)! - design-audit (reference-grounded): make the redesign engine job-first instead of aesthetic-first. The old engine grounded every page in a world-class exemplar's visual DNA and judged on visual craft, so it regressed functional pages into generic brochures — a docs page lost its table-of-contents and dense reference content for two marketing cards and a hero; an aggregator dropped from 30 items to 9; a status dashboard shed services into spacious cards. The fix:
+
+  - **Generator** (`reference/generate/prompt.ts`): persona reframed from art director to product designer. New hard rules in priority order — task-first (design for the page's users and the job in its intent) → preserve functional affordances (never delete navigation/ToC/search to look cleaner) → preserve density where it is the value (docs/dashboards/feeds keep their item count) → right-size the intervention (never turn one kind of page into another) → the exemplar is a source of visual craft only, never a structural template.
+  - **Functional contract**: a per-page preservation block derived from the page's own measured DNA (navigation-affordance count, layout density, archetype) so "keep what works" is concrete and data-driven, not exhortation — and density is required only when the page is actually measured dense, so a genuinely sparse page is never forced to stay dense.
+  - **Ranker/judge** (`reference/judge/prompt.ts`): scores task fitness and functional preservation BEFORE visual craft; a polished direction that removes navigation or reduces density loses. "Fit to the reference" counts only as visual craft.
+
+  Validated by re-running the regressed pages: docs now keeps its ToC + prev/next nav + dense code examples; HN keeps all 30 stories + nav; the status dashboard stays a dense service grid with real values. No provider coupling; flag-gated reference engine only.
+
+### Patch Changes
+
+- [#123](https://github.com/tangle-network/browser-agent-driver/pull/123) [`20942c2`](https://github.com/tangle-network/browser-agent-driver/commit/20942c2a4160d876537cbde3ec72f5f4559cb703) Thanks [@drewstone](https://github.com/drewstone)! - design-audit (reference-grounded): enforce content fidelity so a redesign never fabricates content the page lacks. On a content-sparse page grounded against a dense exemplar, the generator would invent factual content to fill the layout (e.g. a placeholder page gaining a fake "Recent Activity" feed with timestamps, invented status/RFC/registry data), and the pairwise direction-ranker rewarded that invented density as "richer" — so applied to a real app the audit could inject fabricated data into the UI. Now the generator may restyle/regroup/re-rank only the page's real content (the exemplar governs how it looks, never what content it has; a sparse page stays proportionally restrained), the ranker penalises invented content as unfaithful instead of rewarding it, and the apply prompt carries a defense-in-depth "do not invent content" guardrail. No provider coupling.
+
+- [#120](https://github.com/tangle-network/browser-agent-driver/pull/120) [`f11b899`](https://github.com/tangle-network/browser-agent-driver/commit/f11b89971cccfcc4c083c0fe958d918caf030568) Thanks [@drewstone](https://github.com/drewstone)! - design-audit (reference-grounded): make redesign generation work with reasoning models. The generator capped output at 2200 tokens, which a reasoning model (e.g. GLM-5.2, o-series) spends on its thinking before the answer, so the JSON direction came back empty or truncated and the audit fell back with a misleading "no JSON object found" error. Raise the per-direction budget to 8000 tokens (non-reasoning models stop at the closing brace and never use the extra headroom, so it's free for them), and report empty vs truncated vs non-JSON output distinctly so a budget/limit issue is diagnosable. No coupling to any one provider: the engine already runs on openai/anthropic/google/claude-code/zai.
+
 ## 0.34.0
 
 ### Minor Changes

diff --git a/package.json b/package.json
@@ -1,6 +1,6 @@
 {
   "name": "@tangle-network/browser-agent-driver",
-  "version": "0.34.0",
+  "version": "0.35.0",
   "description": "LLM-driven browser agent and bad CLI for UI automation, testing, and evaluation",
   "publishConfig": {
     "access": "public"