Security: plosiewicz/feature-tracker

Security conventions

Slice 1.5 hardened the project against the failure mode that surfaced during Slice 1, where the operator's shell OPENAI_API_KEY was rendered verbatim in a pytest assertion-diff. The defenses below codify how we keep credentials out of test output, repr, logs, and tracked files. Read this before touching any code that handles secrets.

1. Secret typing convention

Every credential field on Settings is typed SecretStr | None, not str | None. Pydantic's SecretStr:

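  • Renders as '**********' in str and model_dump_json output, and as SecretStr('**********') in repr. model_dump in Python mode returns the SecretStr object itself, which stays masked whenever it is printed or logged.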
  • Requires an explicit .get_secret_value() call to recover the real value, which is grep-able and audit-able.

Currently typed as SecretStr:

  • arango_password
  • openai_api_key
  • notebooklm_bl

When adding a new credential field, type it SecretStr | None. The tests/test_config.py::test_secrets_are_secretstr_typed test pins this convention; reverting any of the above to plain str makes the suite fail loudly.

Call .get_secret_value() as late as possible, at the point of use:

openai_client = OpenAI(api_key=settings.openai_api_key.get_secret_value())

2. model_dump round-trip foot-gun

Settings.model_dump_json() produces {"openai_api_key": "**********"}. Feeding that JSON back into Settings.model_validate_json() will load SecretStr("**********") into the field — the mask string becomes the "real" value, and the model silently corrupts.

If a future feature serializes Settings (debug endpoint, snapshot tool, configuration export), it MUST exclude secret fields:

safe = settings.model_dump(exclude={"openai_api_key", "arango_password", "notebooklm_bl"})

Do not round-trip the masked form.

3. Test isolation

tests/conftest.py defines a session-scoped autouse fixture that:

  1. Asserts the get_settings() lru_cache was empty at session start. A non-zero pre-clear cache size means some module called get_settings() at import time — before this fixture had a chance to scrub the env — which would let unscrubbed values silently leak into tests. If this assertion fires, fix the caller (lazy access: call get_settings() inside a function, not at module scope), not the fixture.
  2. Scrubs every env var that maps to a SecretStr-typed Settings field. Discovery is programmatic — adding a new SecretStr field automatically extends coverage.
  3. Clears the get_settings() lru_cache.

Do not bypass this fixture. Real-credential integration tests (Slice 4+) must opt in via a separate marker (@pytest.mark.integration) and a dedicated fixture that restores values from a controlled source.

4. detect-secrets scanner

A detect-secrets pre-commit hook and a matching CI step guard the second vector (secrets pasted into tracked files).

  • Baseline: .secrets.baseline, generated against git ls-files only, so untracked scratchpads cannot be allowlisted by accident. Empty on first generation.
  • Pre-commit: runs on git commit. Blocks if any staged file contains a high-entropy or pattern-matched secret not in the baseline.
  • CI: same hook runs in .github/workflows/ci.yml against tracked files. Same blocking behaviour.
  • Local: make security-scan runs the same check on demand.

When a synthetic-but-realistic test fixture trips the entropy heuristic (e.g., "sk-test-NEVER-PRINT-ME"), use the inline marker:

canary = "sk-test-NEVER-PRINT-ME"  # pragma: allowlist secret

DO NOT "fix" the finding by regenerating the baseline; that hides real future secrets. Baseline edits must be PR-reviewed.

5. Known residuals (not defended)

Documented explicitly so we don't drift into over-confidence:

  • F-string interpolation foot-gun. f"Bearer {settings.openai_api_key}" produces "Bearer **********". This is a correctness bug (the auth call fails), not a leak, but the type system cannot prevent it. Always call .get_secret_value() when building auth headers or connection strings. Integration tests catch this when they actually run.
  • detect-secrets failure messages echo line context. If a real secret is accidentally added to a tracked file and the CI hook fires, the matched substring appears in the GitHub Actions log. Mitigation is upstream — Layers 1 and 2 prevent real secrets from reaching tracked files in the first place.
  • Cursor / LLM session boundary. This subplan defends file-on-disk, stdout, and logs. It does NOT defend against pasting a secret into a Cursor chat prompt; treat the LLM session as semi-trusted retention.
  • Sentry / error-reporting future-proofing. If Sentry is added in a later slice, its local-variable capture would include SecretStr objects whose __repr__ is already masked. A future Sentry config change must not silently break this assumption.
  • set -x in shell scripts. Forbidden in any script that touches gcloud secrets versions add or pipes credentials. The current scripts/bootstrap_gcp.sh is safe (set -euo pipefail, no -x).
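
The f-string residual from the first bullet is easy to reproduce. This is a minimal sketch assuming Pydantic v2, with a synthetic canary in place of a real key.

```python
from pydantic import SecretStr

key = SecretStr("sk-test-NEVER-PRINT-ME")  # pragma: allowlist secret

# Wrong: interpolation uses str(key), so the header carries the mask.
# Nothing leaks, but the request would be rejected as unauthenticated.
broken_header = f"Bearer {key}"

# Right: unwrap explicitly at the point where the header is built.
working_header = f"Bearer {key.get_secret_value()}"
```

Both lines pass a type checker, which is why only a test that actually sends the header can catch the first form.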

6. Adding a new secret

Checklist when introducing a new credential:

  1. Add the field to Settings typed SecretStr | None.
  2. Add the field name to SECRET_FIELDS in tests/test_config.py.
  3. Add a Secret Manager stub in scripts/bootstrap_gcp.sh and an env var entry in .env.example (placeholder value only).
  4. At every call site, access via .get_secret_value() — never via string interpolation.
  5. Re-run make check && make security-scan from a shell where the real secret IS exported; both must pass with no leakage.
