
docs(changelog): Merge Queue Testing Duration chart #158

Draft
samgutentag wants to merge 1 commit into main from sam-gutentag/changelog-merge-queue-testing-duration-chart


@samgutentag
Member

What shipped: A Testing Duration chart on the Merge Queue Health tab that measures how long PRs spend in the testing phase (distinct from Time in Queue), with Outcome and Cycle-ended-in filters and statistical measures (Average/Min/Max/Sum/P50/P95/P99). Clicking a data point opens an inline table of the individual test runs behind that period.

Source eng PRs:

  • trunk-io/trunk2#3919 — Add Testing Duration Metrics Chart (base chart, v173)
  • trunk-io/trunk2#3936 — Allow Drilling Down into Testing Metrics (drill-down, v183)

Linear tickets (dedup cluster, one synthesized entry):

  • TRUNK-18284 (primary) — chart + drill-down, v175
  • TRUNK-18256 (absorbed) — base chart, v173
  • TRUNK-18239 (absorbed) — base chart in Health tab
  • TRUNK-18347 (absorbed) — drill-down, v183

Date basis: latest source PR mergedAt — trunk2#3936 merged 2026-05-13.

Four wired files:

  • changelog/2026-05-13-merge-queue-testing-duration-chart.mdx (entry)
  • docs.json (Changelog 2026 nav, slotted by date)
  • changelog/index.mdx (May 2026 <Update>)
  • merge-queue/changelog.mdx (product index May 2026 <Update>)

Distinct from the already-published 2026-04-21 "Drill Down Into Merge Metrics" entry, which covers the Conclusion count / Time in queue charts — this is the separate Testing Duration chart.

Docs link: https://docs.trunk.io/merge-queue/administration/metrics

🤖 Generated with Claude Code

New Testing Duration chart on the Merge Queue Health tab with Outcome /
Cycle-ended-in filters, statistical measures, and inline drill-down into
individual test runs.

Source eng PRs: trunk-io/trunk2#3919 (base chart), trunk-io/trunk2#3936 (drill-down)
Linear: TRUNK-18284 (primary), TRUNK-18256, TRUNK-18239, TRUNK-18347 (absorbed dups)
Date basis: latest source PR mergedAt (trunk2#3936, 2026-05-13)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@mintlify
Contributor

mintlify Bot commented May 29, 2026

Preview deployment for your docs. Learn more about Mintlify Previews.

| Project | Status | Preview | Updated (UTC) |
| --- | --- | --- | --- |
| trunk | 🟢 Ready | View Preview | May 29, 2026, 5:54 AM |


@samgutentag samgutentag added changelog PR touches the changelog (auto-generated drafts, hosting, formatting, indexing). pending Verify docs PR: eng merged but flag off in prod. Hold off. labels May 29, 2026
@samgutentag
Member Author

samgutentag commented May 29, 2026

Verification status (2026-05-29): staged

On in staging only. Re-run after prod rollout.

  • Flag state: read directly from LaunchDarkly (production: OFF; test: ON_100)
  • Eng PR: trunk-io/trunk2#3919, trunk-io/trunk2#3936
  • Flag: displayMergeHealthTestDuration (project frontend-web-2)
  • Signals: LaunchDarkly get-flag (primary). production on=false (off variation served). test (Staging) on=true, fallthrough serves the enabling true variation (index 0) at 100%.

Hold publishing until displayMergeHealthTestDuration is enabled in production. Re-run /verify-docs-pr 158 after the prod rollout.
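
The flag reading above can be sketched as a small classifier. This is a hypothetical illustration only: `FlagEnvState`, `classify`, `verdict`, and the `PARTIAL` value are assumed names, not the LaunchDarkly API or the actual `/verify-docs-pr` logic. Only the `OFF` / `ON_100` states and the pending / staged / live verdicts come from this thread.

```typescript
// Hypothetical, simplified view of one environment's flag configuration.
// This is NOT the LaunchDarkly response shape; it only models the fields
// discussed in the verification comment.
interface FlagEnvState {
  on: boolean; // targeting on/off for the environment
  fallthroughEnabledPercent: number; // 0-100: share served the enabling variation
  hasTargetingRules: boolean;
}

type Rollout = "OFF" | "ON_100" | "PARTIAL"; // PARTIAL is an assumed catch-all
type Verdict = "pending" | "staged" | "live";

function classify(env: FlagEnvState): Rollout {
  // When targeting is off, the off variation is served to everyone.
  if (!env.on) return "OFF";
  // Fully on: no rules, and fallthrough serves the enabling variation at 100%.
  if (!env.hasTargetingRules && env.fallthroughEnabledPercent === 100) {
    return "ON_100";
  }
  return "PARTIAL";
}

function verdict(production: FlagEnvState, staging: FlagEnvState): Verdict {
  if (classify(production) === "ON_100") return "live";
  if (classify(staging) === "ON_100") return "staged";
  return "pending";
}

// State reported in this sweep: production off, staging fully on.
const production: FlagEnvState = { on: false, fallthroughEnabledPercent: 100, hasTargetingRules: false };
const staging: FlagEnvState = { on: true, fallthroughEnabledPercent: 100, hasTargetingRules: false };
console.log(verdict(production, staging)); // "staged"
```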

@samgutentag samgutentag added the code-verified verify-docs-against-code: all factual claims confirmed in source. label May 29, 2026
@samgutentag
Member Author

Code verification (2026-05-28): 6 confirmed / 0 contradicted / 0 ambiguous / 0 unverifiable

| Claim | Verdict | Source |
| --- | --- | --- |
| Outcome filter values: Passed, Failed, Interrupted, Cancelled | confirmed | merge_metrics.proto:51-56 |
| Cycle ended in filter values: Merged, Failed, Cancelled, In Flight | confirmed | merge_metrics.proto:61-66 |
| Statistical measures: Average, Min, Max, Sum, P50, P95, P99 | confirmed | testing-duration-chart.tsx:72-78 |
| Data point = one testing-to-final-state transition; PR can appear more than once | confirmed | merge_metrics.proto:68-69 |
| Chart buckets/filters independently; no hover sync with other Health charts | confirmed | trunk2#3919 PR body |
| Clicking a data point opens an inline table of individual test runs below the chart | confirmed | prs/CLAUDE.md |

All factual claims in the entry match source. No contradictions. Filter labels render verbatim from the proto enums; measures and inline-drill-down behavior match the frontend components.


Source #1 — Outcome / Cycle ended in filter values (confirmed)

File: trunk-io/trunk2/proto/trunk/mergequeue/v1/merge_metrics.proto#L51-L66

enum TestingDurationOutcome {
  TESTING_DURATION_OUTCOME_UNSPECIFIED = 0; // no filter
  TESTING_DURATION_OUTCOME_PASSED = 1;
  TESTING_DURATION_OUTCOME_FAILED = 2;
  TESTING_DURATION_OUTCOME_INTERRUPTED = 3;
  TESTING_DURATION_OUTCOME_CANCELLED = 4;
}
enum TestingDurationCycleEndedIn {
  TESTING_DURATION_CYCLE_ENDED_IN_UNSPECIFIED = 0; // no filter
  TESTING_DURATION_CYCLE_ENDED_IN_MERGED = 1;
  TESTING_DURATION_CYCLE_ENDED_IN_FAILED = 2;
  TESTING_DURATION_CYCLE_ENDED_IN_CANCELLED = 3;
  TESTING_DURATION_CYCLE_ENDED_IN_IN_FLIGHT = 4; // cycle has not yet concluded
}

Reasoning: The proto enums are the contract for both filters. The frontend label maps in testing-duration-filters.tsx render these as "Passed/Failed/Interrupted/Cancelled" and "Merged/Failed/Cancelled/In Flight" verbatim, plus an "All Outcomes" / "All Cycle Ended In" default for the UNSPECIFIED case.
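
As a rough illustration of the enum-to-label mapping described above, a label map might look like the sketch below. The TypeScript enum and constant names here are assumptions made for the example; the real maps live in testing-duration-filters.tsx and may be shaped differently. The numeric values and label strings are the ones quoted in this comment.

```typescript
// Hypothetical TypeScript mirrors of the proto enums (numeric values as in the proto).
enum TestingDurationOutcome {
  UNSPECIFIED = 0, // no filter
  PASSED = 1,
  FAILED = 2,
  INTERRUPTED = 3,
  CANCELLED = 4,
}

enum TestingDurationCycleEndedIn {
  UNSPECIFIED = 0, // no filter
  MERGED = 1,
  FAILED = 2,
  CANCELLED = 3,
  IN_FLIGHT = 4, // cycle has not yet concluded
}

// Assumed label maps: UNSPECIFIED renders as the "All ..." default option.
const OUTCOME_LABELS: Record<TestingDurationOutcome, string> = {
  [TestingDurationOutcome.UNSPECIFIED]: "All Outcomes",
  [TestingDurationOutcome.PASSED]: "Passed",
  [TestingDurationOutcome.FAILED]: "Failed",
  [TestingDurationOutcome.INTERRUPTED]: "Interrupted",
  [TestingDurationOutcome.CANCELLED]: "Cancelled",
};

const CYCLE_ENDED_IN_LABELS: Record<TestingDurationCycleEndedIn, string> = {
  [TestingDurationCycleEndedIn.UNSPECIFIED]: "All Cycle Ended In",
  [TestingDurationCycleEndedIn.MERGED]: "Merged",
  [TestingDurationCycleEndedIn.FAILED]: "Failed",
  [TestingDurationCycleEndedIn.CANCELLED]: "Cancelled",
  [TestingDurationCycleEndedIn.IN_FLIGHT]: "In Flight",
};

console.log(OUTCOME_LABELS[TestingDurationOutcome.INTERRUPTED]);
console.log(CYCLE_ENDED_IN_LABELS[TestingDurationCycleEndedIn.IN_FLIGHT]);
```

Typing each map as a `Record` keyed by the enum means the compiler flags a missing label if a new enum value is added later.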

Source #2 — Statistical measures (confirmed)

File: testing-duration-chart.tsx#L72-L78

  timeSeriesDataAvg: "Average",
  timeSeriesDataMax: "Max",
  timeSeriesDataMin: "Min",
  timeSeriesDataSum: "Sum",
  timeSeriesDataP50: "p50",
  timeSeriesDataP95: "p95",
  timeSeriesDataP99: "p99",

Reasoning: The label map enumerates all seven measures the chart exposes, matching the entry. Percentile labels render lowercase ("p50") in the UI; the entry uses uppercase P50/P95/P99 in prose, which is a difference in style only, not in value.
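
To make the seven measures concrete, here is a minimal sketch of computing them for one bucket of testing durations. The key names match the label map above, but `summarize`, `percentile`, and the nearest-rank percentile method are assumptions for illustration; the source does not say how the backend computes percentiles.

```typescript
type MeasureKey =
  | "timeSeriesDataAvg"
  | "timeSeriesDataMax"
  | "timeSeriesDataMin"
  | "timeSeriesDataSum"
  | "timeSeriesDataP50"
  | "timeSeriesDataP95"
  | "timeSeriesDataP99";

// Nearest-rank percentile over an ascending-sorted, non-empty array
// (assumed method; other interpolation schemes give slightly different values).
function percentile(sorted: number[], p: number): number {
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

// Hypothetical helper: summarizes the durations (in seconds) of one time bucket.
function summarize(durationsSec: number[]): Record<MeasureKey, number> {
  if (durationsSec.length === 0) {
    throw new Error("cannot summarize an empty bucket");
  }
  const sorted = [...durationsSec].sort((a, b) => a - b);
  const sum = sorted.reduce((acc, d) => acc + d, 0);
  return {
    timeSeriesDataAvg: sum / sorted.length,
    timeSeriesDataMax: sorted[sorted.length - 1],
    timeSeriesDataMin: sorted[0],
    timeSeriesDataSum: sum,
    timeSeriesDataP50: percentile(sorted, 50),
    timeSeriesDataP95: percentile(sorted, 95),
    timeSeriesDataP99: percentile(sorted, 99),
  };
}

// Four made-up test runs in one bucket.
const stats = summarize([400, 100, 300, 200]);
console.log(stats.timeSeriesDataAvg, stats.timeSeriesDataP50); // 250 200
```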

Source #3 — Per-transition data points (confirmed)

File: merge_metrics.proto#L68-L69

// One row per TESTING -> * transition; the same PR can appear multiple
// times if its cycle was restarted or if there were multiple cycles.

Reasoning: Directly backs the entry's statement that each data point is one testing-to-final-state transition and a PR appears more than once if its cycle restarted.
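
A small sketch of what "one row per transition" implies for anyone counting PRs from this data. The `TestingTransition` shape, the `countByPr` helper, and the sample rows are invented for illustration and are not the real proto message or real data.

```typescript
// Hypothetical row shape: one entry per TESTING -> * transition.
interface TestingTransition {
  prNumber: number;
  outcome: "PASSED" | "FAILED" | "INTERRUPTED" | "CANCELLED";
  durationSec: number;
}

// Counts how many transitions each PR contributed.
function countByPr(rows: TestingTransition[]): Map<number, number> {
  const counts = new Map<number, number>();
  for (const row of rows) {
    counts.set(row.prNumber, (counts.get(row.prNumber) ?? 0) + 1);
  }
  return counts;
}

// Made-up sample: PR 101 was interrupted, restarted, and then passed.
const rows: TestingTransition[] = [
  { prNumber: 101, outcome: "INTERRUPTED", durationSec: 240 },
  { prNumber: 101, outcome: "PASSED", durationSec: 610 },
  { prNumber: 102, outcome: "PASSED", durationSec: 580 },
];

const perPr = countByPr(rows);
console.log(rows.length, perPr.size); // 3 data points, 2 distinct PRs
```

The number of data points in a period can therefore exceed the number of distinct PRs tested in that period.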

Source #4 — Independent bucketing, no hover sync (confirmed)

File: trunk-io/trunk2#3919 PR body

this chart is logically different from the queue metrics charts. They bucket metrics differently and care about different filters. Because of this, there's no linkage between this chart and the other ones (for example, you don't see the bar appear over all of them on hovering)

Reasoning: The author of the implementing PR states the chart buckets independently and does not share hover-sync with the Conclusion count / Time in queue charts, matching the entry.

Source #5 — Inline drill-down table (confirmed)

File: prs/CLAUDE.md

the Testing Duration chart on the Health page renders its drill-down (test run transitions) inline as a table directly below the chart, not as a separate route. See components/merge-metrics-test-runs-table.tsx.

Reasoning: Confirms clicking a data point opens an inline table of the individual test runs below the chart. The referenced merge-metrics-test-runs-table.tsx component exists in main.

@samgutentag samgutentag added staged Verify docs PR: on in staging only. Re-run after prod rollout. and removed pending Verify docs PR: eng merged but flag off in prod. Hold off. labels May 29, 2026
Member Author

Verification status (May 30, 2026): staged

On in staging only. Re-run after prod rollout.

  • Flag state: read directly from LaunchDarkly (production: OFF; test: ON_100). displayMergeHealthTestDuration in frontend-web-2: prod on=false (off variation served, fallthrough variation 0 unused), test on=true fallthrough variation 0 (true) = ON_100. No targeting rules. Unchanged from yesterday.
  • Eng PR: trunk-io/trunk2#3919, trunk-io/trunk2#3936 (both merged, intact on main).
  • Flag: displayMergeHealthTestDuration (project frontend-web-2, maintainer: Phil).
  • Signals: LaunchDarkly get-flag called directly this sweep. Prod OFF, staging ON_100. No change since last sweep.

Hold publishing until displayMergeHealthTestDuration is enabled in production. Re-run after the prod rollout.


Generated by Claude Code

Member Author

Verification status (May 31, 2026): staged

On in staging only. Re-run after prod rollout.

  • Flag state: read directly from LaunchDarkly (production: OFF; test: ON_100). displayMergeHealthTestDuration in frontend-web-2: prod on=false (off variation served), test on=true fallthrough variation 0 (true) = ON_100. Confirmed same state as May 30.
  • Eng PR: trunk-io/trunk2#3919, trunk-io/trunk2#3936 (both merged, intact on main)
  • Flag: displayMergeHealthTestDuration (project frontend-web-2, maintainer: Phil)
  • Signals: LaunchDarkly get-flag called directly this sweep. Prod OFF, staging ON_100. No change since last sweep.

Hold publishing until displayMergeHealthTestDuration is enabled in production. Re-run after the prod rollout.


Generated by Claude Code

Member Author

samgutentag commented Jun 1, 2026

Verification status (June 1, 2026): staged

On in staging only. Re-run after prod rollout.

  • Flag state: read directly from LaunchDarkly (production: OFF; test/staging: ON_100). For displayMergeHealthTestDuration in frontend-web-2, the enabling variation is index 0 (value=true). Production: on=false (OFF). Staging (test): on=true, fallthrough variation 0 = 100% enabled (ON_100). No targeting rules in either environment. Same state as prior sweeps.
  • Eng PRs: trunk-io/trunk2#3919 (base chart), #3936 (drill-down). Both merged and intact on main.
  • Flag: displayMergeHealthTestDuration (project frontend-web-2, maintainer Phil).
  • Signals: LaunchDarkly is the primary signal. Prod flag off, so the Testing Duration chart is not yet visible to customers.

Next: keep in draft. Re-run after the flag flips on in production. The PR has merge conflicts in the shared changelog nav files; those conflicts are handled separately. Unchanged from prior sweep.


Generated by Claude Code

@samgutentag samgutentag added ready to merge Verify docs PR: customers can use this. Ready to publish. and removed staged Verify docs PR: on in staging only. Re-run after prod rollout. labels Jun 2, 2026 — with Claude
Member Author

Verification status: live - June 2, 2026

Verified: customers can use this. Ready to publish.

  • Flag state: LaunchDarkly not consulted. Initial eng PR (trunk2#3919) mentioned a feature flag, but no flag guard exists in the current codebase (testing-duration-chart.tsx is imported unconditionally in merge-health-metrics.tsx; no testing-duration flag appears in flags.ts). Flag appears to have been removed after full prod rollout.
  • Eng PR links: trunk-io/trunk2#3919 (merged 2026-05-08), trunk-io/trunk2#3936 (merged 2026-05-13)
  • Flag: none found in current code
  • Signals checked: trunk2 PR merge status; code search for flag guard in merge-health-metrics.tsx and flags.ts
  • Suggested next action: Un-draft and merge. Verdict changed from staged to live.

Generated by Claude Code


Labels

changelog PR touches the changelog (auto-generated drafts, hosting, formatting, indexing). code-verified verify-docs-against-code: all factual claims confirmed in source. ready to merge Verify docs PR: customers can use this. Ready to publish.


1 participant