InternLM · YanhuiDua · Jun 16, 2026
diff --git a/docs/superpowers/plans/2026-06-09-producer-trace-validation.md b/docs/superpowers/plans/2026-06-09-producer-trace-validation.md
diff --git a/docs/superpowers/plans/2026-06-09-unified-trace-viewer.md b/docs/superpowers/plans/2026-06-09-unified-trace-viewer.md
@@ -0,0 +1,370 @@
+# Unified Trace Viewer Implementation Plan
+
+> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
+
+**Goal:** Merge the current online producer trace viewer and offline hotspot viewer into one unified viewer that defaults to all tasks, shows overview + stage stats + task list + task detail, and uses one shared payload model for live and offline modes.
+
+**Architecture:** Consolidate viewer analysis into a single shared payload builder in `producer_trace_analysis.py`, then make both `producer_trace_viewer.py` and `producer_trace_hotspots.py` render the same page model. Confine live/offline differences to data loading. Keep `latest-produce-batch` as a user-selectable scope filter, but make `all` (all tasks) the default scope.
+
+**Tech Stack:** Python dataclasses, existing `TraceEvent` JSONL shards, current producer trace analysis helpers, static HTML + inline JavaScript, `unittest`.
+
+---
+
+## File Structure
+
+- `xtuner/tools/producer_trace_analysis.py`
+  - Shared analysis layer for unified task rows, stage stats, per-task chart rows, and per-scope payloads.
+- `xtuner/tools/producer_trace_viewer.py`
+  - Live viewer server + offline snapshot entrypoint for the unified page.
+- `xtuner/tools/producer_trace_hotspots.py`
+  - Becomes a thin compatibility wrapper around the unified offline page builder.
+- `xtuner/v1/rl/trace.py`
+  - Change the `TraceConfig.viewer_scope` default from `latest-produce-batch` to `all` so the unified viewer defaults to all tasks.
+- `tests/rl/test_trace.py`
+  - Unified viewer payload tests, default scope tests, and compatibility tests.
+- `docs/superpowers/specs/2026-06-09-trace-next-phase-working-notes.md`
+  - Keep requirement decisions synchronized as implementation proceeds.
+
+## Task 1: Add Unified Analysis Payload
+
+**Files:**
+- Modify: `xtuner/tools/producer_trace_analysis.py`
+- Test: `tests/rl/test_trace.py`
+
+- [ ] **Step 1: Add the failing tests for unified summary semantics**
+
+Add tests that assert:
+
+- overview uses all tasks by default
+- stage summary exposes:
+  - `running_tasks`
+  - `visited_tasks`
+  - `avg_s`
+  - `p95_s`
+  - `max_s`
+- task detail data includes both text timeline events and graphical spans
+- failed tasks are counted separately
+
+Sketch:
+
+```python
+def test_unified_view_payload_reports_overview_stage_stats_and_task_detail(self):
+    # `events` is a test fixture: trace events for 3 tasks, 1 of which failed.
+    payload = build_unified_trace_payload_from_events(events, trace_source="/tmp/trace")
+    self.assertEqual(payload["default_scope"], "all")
+    self.assertEqual(payload["views"]["all"]["overview"]["total_tasks"], 3)
+    self.assertEqual(payload["views"]["all"]["overview"]["failed_tasks"], 1)
+    stage = payload["views"]["all"]["stage_stats"][0]
+    self.assertIn("running_tasks", stage)
+    self.assertIn("visited_tasks", stage)
+    detail = payload["views"]["all"]["task_details"]["gsm8k:1"]
+    self.assertTrue(detail["timeline_events"])
+    self.assertTrue(detail["timeline_spans"])
+```
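+
+The stage summary fields asserted above could be computed along these lines. This is a minimal sketch, not the planned implementation: `summarize_stage` is a hypothetical helper, it assumes each task contributes at most one span per stage, and it assumes `visited_tasks` counts both finished and still-running visits (the plan does not define that field).
+
+```python
+from statistics import fmean
+
+
+def summarize_stage(durations_s, running_tasks):
+    """Summarize one stage from the durations (seconds) of its closed spans.
+
+    Hypothetical helper: `durations_s` holds one entry per finished visit,
+    `running_tasks` is the number of tasks currently inside the stage.
+    """
+    ordered = sorted(durations_s)
+    visited = len(ordered) + running_tasks  # assumption: one visit per task
+    if not ordered:
+        return {"running_tasks": running_tasks, "visited_tasks": visited,
+                "avg_s": 0.0, "p95_s": 0.0, "max_s": 0.0}
+    # Nearest-rank p95: ceil(0.95 * n) in integer arithmetic to avoid float drift.
+    rank = (95 * len(ordered) + 99) // 100
+    return {
+        "running_tasks": running_tasks,
+        "visited_tasks": visited,
+        "avg_s": fmean(ordered),
+        "p95_s": ordered[rank - 1],
+        "max_s": ordered[-1],
+    }
+```
+
+Nearest-rank is only one of several percentile definitions; an interpolating one would work equally well as long as the tests pin the choice down.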
+
+- [ ] **Step 2: Run the targeted test to confirm it fails**
+
+Run:
+
+```bash
+python -m unittest tests.rl.test_trace.TraceStoreAndViewerTest.test_unified_view_payload_reports_overview_stage_stats_and_task_detail
+```
+
+Expected:
+
+- FAIL because `build_unified_trace_payload_from_events` (or the payload fields the test reads) does not exist yet.
+
+- [ ] **Step 3: Add shared dataclasses / payload builders in `producer_trace_analysis.py`**
+
+Implement shared analysis primitives instead of keeping viewer/hotspot summaries separate:
+
+- enhanced task row
+- per-stage stats
+- per-task detail payload
+- scope-aware top-level payload
+
+The new payload should conceptually look like:
+
+```python
+{
+    "default_scope": "all",
+    "available_scopes": ["all", "latest-produce-batch"],
+    "views": {
+        "all": {
+            "overview": {...},
+            "stage_stats": [...],
+            "task_rows": [...],
+            "task_details": {...},
+        },
+        "latest-produce-batch": {...},
+    },
+}
+```
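+
+One way to assemble that shape with dataclasses is sketched below. All names here (`ScopeView`, `build_view`, `build_payload`, and the `batch_id` key on task rows) are illustrative assumptions, not existing code in `producer_trace_analysis.py`; stage stats and task details are left empty to keep the sketch short.
+
+```python
+from dataclasses import asdict, dataclass, field
+
+
+@dataclass
+class ScopeView:
+    """Hypothetical per-scope view: the four sections of the unified page."""
+    overview: dict = field(default_factory=dict)
+    stage_stats: list = field(default_factory=list)
+    task_rows: list = field(default_factory=list)
+    task_details: dict = field(default_factory=dict)
+
+
+def build_view(task_rows):
+    """Build one scope's view from already-analyzed task rows."""
+    failed = sum(1 for row in task_rows if row["status"] == "failed")
+    return ScopeView(
+        overview={"total_tasks": len(task_rows), "failed_tasks": failed},
+        task_rows=list(task_rows),
+    )
+
+
+def build_payload(task_rows, latest_batch_id):
+    """Assemble the scope-aware top-level payload (plain dicts, JSON-ready)."""
+    # Assumption: each row carries a `batch_id` identifying its produce batch.
+    latest = [row for row in task_rows if row["batch_id"] == latest_batch_id]
+    views = {
+        "all": build_view(task_rows),
+        "latest-produce-batch": build_view(latest),
+    }
+    return {
+        "default_scope": "all",
+        "available_scopes": list(views),
+        "views": {scope: asdict(view) for scope, view in views.items()},
+    }
+```
+
+Building every scope through the same `build_view` keeps the two views structurally identical, which lets the page switch scopes without special cases.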
+
+- [ ] **Step 4: Re-run the targeted unified payload test**
+
+Run:
+
+```bash
+python -m unittest tests.rl.test_trace.TraceStoreAndViewerTest.test_unified_view_payload_reports_overview_stage_stats_and_task_detail
+```
+
+Expected:
+
+- PASS
+
+## Task 2: Replace Separate Viewer/Hotspot Pages With One Unified Page
+
+**Files:**
+- Modify: `xtuner/tools/producer_trace_viewer.py`
+- Modify: `xtuner/tools/producer_trace_hotspots.py`
+- Test: `tests/rl/test_trace.py`
+
+- [ ] **Step 1: Add failing tests for unified page structure**
+
+Add assertions that rendered HTML contains the new sections and no longer contains removed sections:
+
+```python
+def test_unified_viewer_html_contains_new_sections(self):
+    # `payload` is a unified payload built as in Task 1.
+    html = render_unified_trace_html(payload, live=False)
+    self.assertIn("Total tasks", html)
+    self.assertIn("Failed", html)
+    self.assertIn("Stage", html)
+    self.assertIn("Task Timeline", html)
+    self.assertNotIn("Suspect Open Spans", html)
+    self.assertNotIn("Latest Stage Distribution", html)
+```
+
+- [ ] **Step 2: Run the targeted HTML test to confirm it fails**
+
+Run:
+
+```bash
+python -m unittest tests.rl.test_trace.TraceStoreAndViewerTest.test_unified_viewer_html_contains_new_sections
+```
+
+Expected:
+
+- FAIL because the old HTML still renders the old layout.
+
+- [ ] **Step 3: Implement the unified HTML / JS page in `producer_trace_viewer.py`**
+
+Refactor page structure to:
+
+- header
+- overview cards
+- scope toggle
+- stage summary table
+- task list with filters
+- task detail:
+  - text timeline
+  - chart timeline below
+
+The JS should:
+
+- switch between `all` and `latest-produce-batch`
+- filter task rows by:
+  - state
+  - current stage
+  - search text
+- render task detail for the selected row
+
+- [ ] **Step 4: Convert `producer_trace_hotspots.py` into a compatibility entrypoint**
+
+Make the offline hotspots script reuse the unified offline page builder instead of maintaining a separate page model.
+
+Compatibility behavior:
+
+- existing CLI entry still works
+- output HTML is the unified viewer page
+- offline mode loads static payload only
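+
+A thin compatibility entrypoint could look roughly like the sketch below. `build_offline_page` is a hypothetical stand-in for the unified offline page builder, and the `trace_dir` / `--output` arguments are assumptions; the real script should keep whatever arguments its existing CLI already accepts.
+
+```python
+import argparse
+from pathlib import Path
+
+
+def build_offline_page(trace_dir, scope):
+    """Hypothetical stand-in for the unified offline page builder."""
+    return f"<html><body data-scope='{scope}'>trace: {trace_dir}</body></html>"
+
+
+def main(argv=None, page_builder=build_offline_page):
+    """Parse the CLI, delegate rendering, and write the HTML file."""
+    parser = argparse.ArgumentParser(description="Offline trace page (compat entry)")
+    parser.add_argument("trace_dir")
+    parser.add_argument("--output", default="trace.html")
+    args = parser.parse_args(argv)
+    # No page model lives here: the wrapper only forwards to the shared builder.
+    html = page_builder(args.trace_dir, "all")
+    Path(args.output).write_text(html, encoding="utf-8")
+    return args.output
+```
+
+Passing the builder in as a parameter is a convenience for tests, which can substitute a fake builder without touching real trace shards.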
+
+- [ ] **Step 5: Re-run the targeted HTML test**
+
+Run:
+
+```bash
+python -m unittest tests.rl.test_trace.TraceStoreAndViewerTest.test_unified_viewer_html_contains_new_sections
+```
+
+Expected:
+
+- PASS
+
+## Task 3: Flip Default Viewer Semantics to All Tasks
+
+**Files:**
+- Modify: `xtuner/v1/rl/trace.py`
+- Modify: `xtuner/tools/producer_trace_viewer.py`
+- Modify: `xtuner/tools/producer_trace_hotspots.py`
+- Test: `tests/rl/test_trace.py`
+
+- [ ] **Step 1: Add failing tests for default scope**
+
+Add tests that assert:
+
+- `TraceConfig.viewer_scope` defaults to `"all"`
+- CLI default scope for unified viewer is `"all"`
+- live payload chooses `all` as `default_scope`
+
+Sketch:
+
+```python
+def test_trace_config_defaults_viewer_scope_to_all(self):
+    self.assertEqual(TraceConfig().viewer_scope, "all")
+```
+
+- [ ] **Step 2: Run the targeted default-scope test to confirm it fails**
+
+Run:
+
+```bash
+python -m unittest tests.rl.test_trace.TraceStoreAndViewerTest.test_trace_config_defaults_viewer_scope_to_all
+```
+
+Expected:
+
+- FAIL because current default is `latest-produce-batch`.
+
+- [ ] **Step 3: Change default viewer scope to `all`**
+
+Update:
+
+- `TraceConfig.viewer_scope`
+- CLI defaults for unified viewer/offline page entrypoints
+- any tests or assumptions that still rely on `latest-produce-batch` as the default
+
+- [ ] **Step 4: Keep `latest-produce-batch` as an optional filter**
+
+Do not remove the capability. Keep it available in:
+
+- payload `available_scopes`
+- UI scope selector
+- offline CLI option
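+
+For the offline CLI, the option could be declared as in this sketch. The `--scope` flag name and the `build_parser` helper are assumptions for illustration; only the two scope values come from this plan.
+
+```python
+import argparse
+
+# Scope names taken from the payload's `available_scopes`.
+SCOPES = ("all", "latest-produce-batch")
+
+
+def build_parser():
+    """Hypothetical parser for the unified offline page entrypoint."""
+    parser = argparse.ArgumentParser(description="Unified trace viewer (offline)")
+    parser.add_argument(
+        "--scope",
+        choices=SCOPES,
+        default="all",  # new default; latest-produce-batch stays selectable
+        help="Which task scope the page shows first.",
+    )
+    return parser
+```
+
+Using `choices` makes argparse reject unknown scope names at parse time, so a typo fails fast instead of producing an empty page.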
+
+- [ ] **Step 5: Re-run the default-scope tests**
+
+Run:
+
+```bash
+python -m unittest tests.rl.test_trace.TraceStoreAndViewerTest.test_trace_config_defaults_viewer_scope_to_all
+```
+
+Expected:
+
+- PASS
+
+## Task 4: Add Viewer Tests for Failed Tasks and Task Detail Behavior
+
+**Files:**
+- Modify: `tests/rl/test_trace.py`
+
+- [ ] **Step 1: Add failing tests for failed-task accounting**
+
+Add tests that assert:
+
+- `failed_tasks` is counted in overview
+- failed tasks appear in `task_rows`
+- `error_msg` appears in task detail only
+
+Sketch:
+
+```python
+def test_unified_viewer_counts_failed_tasks_and_keeps_error_msg_in_task_detail(self):
+    # Needs `import json` at module level; `events` is a fixture containing failed task gsm8k:9.
+    payload = build_unified_trace_payload_from_events(events, trace_source="/tmp/trace")
+    overview = payload["views"]["all"]["overview"]
+    self.assertEqual(overview["failed_tasks"], 1)
+    row = next(row for row in payload["views"]["all"]["task_rows"] if row["trace_id"] == "gsm8k:9")
+    self.assertEqual(row["status"], "failed")
+    detail = payload["views"]["all"]["task_details"]["gsm8k:9"]
+    self.assertIn("trace smoke judger failure", json.dumps(detail, ensure_ascii=False))
+    self.assertNotIn("error_msg", row)
+```
+
+- [ ] **Step 2: Run the targeted failed-task test to confirm it fails**
+
+Run:
+
+```bash
+python -m unittest tests.rl.test_trace.TraceStoreAndViewerTest.test_unified_viewer_counts_failed_tasks_and_keeps_error_msg_in_task_detail
+```
+
+Expected:
+
+- FAIL until failed-task handling and detail structure are correct.
+
+- [ ] **Step 3: Finish the analysis/payload wiring for failed tasks**
+
+Make sure:
+
+- overview counts failed tasks
+- task rows expose status and current stage
+- task detail contains full event records including `error_msg`
+- task rows do not duplicate the full `error_msg`
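+
+The row/detail split described above might be sketched as follows. Events are modeled as plain dicts with `stage`, `status`, and `error_msg` keys, which is an assumption; the real `TraceEvent` fields may differ, and `split_row_and_detail` is a hypothetical helper.
+
+```python
+def split_row_and_detail(trace_id, events):
+    """Derive a compact task row and a full task detail from one task's events."""
+    failed = any(event.get("status") == "failed" for event in events)
+    last = events[-1] if events else {}
+    # The row keeps only what the task list needs; no error text is copied here.
+    row = {
+        "trace_id": trace_id,
+        "status": "failed" if failed else last.get("status", "unknown"),
+        "current_stage": last.get("stage"),
+    }
+    # The detail keeps the full event records, including any `error_msg`.
+    detail = {
+        "trace_id": trace_id,
+        "timeline_events": [dict(event) for event in events],
+    }
+    return row, detail
+```
+
+Keeping `error_msg` out of the row keeps the task list payload small even when failure messages are long.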
+
+- [ ] **Step 4: Re-run the targeted failed-task test**
+
+Run:
+
+```bash
+python -m unittest tests.rl.test_trace.TraceStoreAndViewerTest.test_unified_viewer_counts_failed_tasks_and_keeps_error_msg_in_task_detail
+```
+
+Expected:
+
+- PASS
+
+## Task 5: Full Verification
+
+**Files:**
+- Verify touched files only
+
+- [ ] **Step 1: Run unified trace tests**
+
+Run:
+
+```bash
+python -m unittest discover -s tests/rl -p test_trace.py
+```
+
+Expected:
+
+- PASS
+
+- [ ] **Step 2: Run compile checks**
+
+Run:
+
+```bash
+python -m compileall -q xtuner/tools/producer_trace_analysis.py xtuner/tools/producer_trace_viewer.py xtuner/tools/producer_trace_hotspots.py xtuner/v1/rl/trace.py tests/rl/test_trace.py
+```
+
+Expected:
+
+- PASS (exit status 0; with `-q`, nothing is printed unless a file fails to compile)
+
+- [ ] **Step 3: Run diff sanity**
+
+Run:
+
+```bash
+git diff --check
+```
+
+Expected:
+
+- no whitespace / merge-marker issues
+
+- [ ] **Step 4: Optional live smoke after unit verification**
+
+Run:
+
+```bash
+bash -x examples/v1/scripts/run_rl.sh examples/v1/config/testing/rl_trace_smoke_enabled.py lmdeploy "$MODEL_PATH" "$DATA_PATH" "$EVAL_DATA_PATH"
+```
+
+Expected:
+
+- unified live viewer starts
+- page defaults to all tasks
+- scope selector can switch to latest batch
+