Skip to content

Fix US program statistics variable mappings#327

Open
anth-volk wants to merge 7 commits intomainfrom
fix-us-program-statistics
Open

Fix US program statistics variable mappings#327
anth-volk wants to merge 7 commits intomainfrom
fix-us-program-statistics

Conversation

@anth-volk
Copy link
Copy Markdown
Contributor

@anth-volk anth-volk commented Apr 30, 2026

Fixes #325

This draft PR addresses only the immediate causes of the US economic_impact_analysis program-statistics breakage observed while following up on Vahid's JOSS PR (#264). It is not intended to solve the broader design problem of country package program variables changing over time; that follow-up is tracked in #326.

Changes:

  • Keep ProgramStatistics.program_name as the direct model/output variable to aggregate; no separate alias field is introduced.
  • Replace the broken US program-statistics mappings with currently valid policyengine-us variables:
    • payroll_tax -> employee_payroll_tax
    • medicare -> medicare_cost
    • state_income_tax remains state_income_tax
  • Note: payroll_tax was configured as a person-level program, while employee_payroll_tax is a tax-unit-level variable. This means the payroll-tax program-statistics row now reports tax-unit recipient/winner/loser counts, using tax-unit weights, rather than person counts.
  • Add medicare_cost and state_income_tax to the US output variables materialized by .py.
  • Update US household snapshots for the newly exposed medicare_cost and state_income_tax output keys.
  • Validate US program-statistics configuration before expensive simulations run.
  • Add shared error helpers for constructing typed errors and conditional error detail lines, then use them in the US program-statistics validation error path.
  • Replace bare aggregate next(...) lookup failures with descriptive ValueErrors.
  • Add mocked US program-statistics, aggregate error, and error-helper tests that would have caught this issue.

Verification:

  • make format
  • uv run --python 3.13 --extra dev python -m pytest tests/test_household_calculator_snapshot.py::test_us_household_snapshot tests/test_errors.py tests/test_aggregate.py tests/test_change_aggregate.py tests/test_us_program_statistics.py
  • uv run --python 3.13 --extra dev pytest tests/test_change_aggregate.py tests/test_us_program_statistics.py
  • uv run --python 3.13 --extra dev ruff check tests/test_us_program_statistics.py tests/test_change_aggregate.py
  • git diff --check

@vahid-ahmadi
Copy link
Copy Markdown

Review

Core logic is correct — variable name swaps, validation, and StopIterationValueError upgrade all check out. Two things to fix before merge.

Blocking

  1. CI failures are caused by this PR. Adding medicare_cost to entity_variables["person"] makes the household calculator materialize it, and 4 snapshots in test_household_calculator_snapshot.py fail with new key: person[0].medicare_cost=14500.0. Need to regenerate the snapshot files.

  2. Negative test can silently pass on the wrong exception. test_us_program_statistics_config_fails_before_simulation_run uses try/except + manual raise AssertionError. Use pytest.raises(ValueError, match="US program statistics config is invalid") instead.

Worth addressing

  1. ChangeAggregate has no invalid-variable regression test. test_aggregate.py got one for Aggregate, but ChangeAggregate had the same bare next(...) pattern and isn't covered.

  2. Entity change not flagged in the PR description. payroll_tax was entity="person"; employee_payroll_tax is entity="tax_unit". Correct, but it changes recipient counts in the program-statistics table — worth noting.

Verified correct

Minor suggestions

  • context= strings like "Aggregate.variable" are internal field names; user-facing wording reads better.
  • Missing return annotations on the three new helpers (-> None, -> Variable).
  • Validator dedupes errors across baseline/reform — asymmetric configs harder to triage.

@anth-volk anth-volk marked this pull request as ready for review May 1, 2026 14:13
@anth-volk anth-volk requested a review from vahid-ahmadi May 1, 2026 14:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

US economic_impact_analysis uses fragile hard-coded program variable names

2 participants