Clean up YAML files: lowercase booleans, trailing whitespace, and yamllint integration #3923

Open
igorts-git wants to merge 1 commit into main from igorts/clean_up_yml_files

@igorts-git
Collaborator

Description

Overview

This pull request establishes rigorous YAML validation by integrating yamllint into pre-commit and standardizing boolean syntax across all configuration files in the MaxText repository.
All changes are structural syntax and whitespace adjustments. No configuration parameters, runtime values, or comments were altered.

Key Changes

  • YAML 1.2 Boolean Standardization: Replaced capitalized True and False boolean values with standard lowercase true and false across more than 100 model configurations, inference profiles, and GitHub workflows.
  • Automated Whitespace Cleanup: Stripped trailing whitespace and ensured each .yml file ends with a single newline (including files under .github/workflows/).
  • Linter Integration: Added yamllint to .pre-commit-config.yaml alongside a customized .yamllint root configuration to enforce syntax correctness on future commits.
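
For illustration, a hook entry of roughly this shape would wire yamllint into pre-commit. This is a sketch, not the diff from this PR: the repository URL and hook id are the ones yamllint publishes, while the pinned `rev` and the explicit `--config-file` argument are assumptions.

```yaml
# .pre-commit-config.yaml (illustrative excerpt, not copied from this PR)
repos:
  - repo: https://github.com/adrienverge/yamllint.git
    rev: v1.35.1  # assumed pin; use whichever release the project settles on
    hooks:
      - id: yamllint
        # Assumption: point explicitly at the root config file. yamllint also
        # picks up a .yamllint file in the working directory on its own.
        args: [--config-file, .yamllint]
```

With an entry like this in place, `pre-commit run yamllint --all-files` would lint every YAML file in the repository rather than only the staged ones.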

In the .yamllint file we explicitly turned off several styling rules to ensure CI tests don't fail over minor formatting preferences:

  • line-length: Allows lines, comments, and URLs to exceed 80 characters.
  • comments: Disables the requirement of at least two spaces before inline comments (key: val # comment).
  • indentation: Permits flexible indentation depth (e.g., 2 vs 4 spaces in multiline dictionaries or lists).
  • commas: Allows compact inline lists without mandatory spaces after commas (e.g., ["silu","linear"]).
  • colons: Allows extra spaces after colons (which is common when aligning dictionary values in columns).
  • empty-lines: Allows multiple consecutive blank lines for visual separation.

What remains active:

  • syntax: Instantly catches invalid YAML structure, duplicate dictionary keys, and unclosed quotes.
  • truthy: Enforces modern YAML 1.2 booleans (requiring lowercase true and false, and flagging ambiguous values like True, False, yes, no).
  • trailing-spaces: Flags invisible trailing whitespace at the ends of lines.
  • new-line-at-end-of-file: Ensures files end cleanly with a standard single EOF newline.
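
Taken together, the two lists above suggest a configuration along these lines. This is a hedged sketch reconstructed from the description only: whether the real file uses `extends: default`, and the exact `truthy` options, are assumptions.

```yaml
# .yamllint (sketch inferred from the PR description, not the actual file)
extends: default  # assumption: start from yamllint's defaults, then relax

rules:
  # Styling rules turned off so CI does not fail on formatting preferences
  line-length: disable
  comments: disable
  indentation: disable
  commas: disable
  colons: disable
  empty-lines: disable

  # Rules kept active
  truthy:
    allowed-values: ["true", "false"]
    # Assumption: key checking is relaxed so the `on:` key in GitHub
    # workflow files is not reported as an ambiguous boolean.
    check-keys: false
  trailing-spaces: enable
  new-line-at-end-of-file: enable
```

Syntax errors need no entry of their own because yamllint always reports them. Under `extends: default`, duplicate keys are caught by the `key-duplicates` rule, which the default configuration enables.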

Tests

CI tests

Checklist

Before submitting this PR, please make sure (put X in square brackets):

  • I have performed a self-review of my code. For an optional AI review, add the gemini-review label.
  • I have necessary comments in my code, particularly in hard-to-understand areas.
  • I have run end-to-end tests and provided workload links above if applicable.
  • I have made or will make corresponding changes to the doc if needed, including adding new documentation pages to the relevant Table of Contents (toctree directive) as explained in our documentation.

@igorts-git igorts-git force-pushed the igorts/clean_up_yml_files branch from b5632d2 to 6fb3f85 on May 15, 2026 23:16
@codecov

codecov Bot commented May 15, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

@github-actions

🤖 Hi @igorts-git, I've received your request, and I'm working on it now! You can track my progress in the logs for more details.

@github-actions github-actions Bot left a comment

## 📋 Review Summary

This PR successfully standardizes YAML files across the repository by enforcing YAML 1.2 boolean syntax (lowercase true/false), removing trailing whitespace, and integrating yamllint into the pre-commit pipeline. These changes improve codebase consistency and prevent syntax errors in configuration files.

🔍 General Feedback

  • Consistency in Comments: While the PR correctly updates the boolean values in the code, many comments still refer to True or False. Consider a follow-up pass to update these comments for full consistency with the new convention.
  • Permissive Linter Config: The initial .yamllint configuration is quite permissive (disabling rules like line-length and indentation). This is a pragmatic choice for a large-scale initial cleanup, but it might be beneficial to gradually re-enable some of these rules with sensible defaults in the future.
  • Workflow Formatting: The cleanup of whitespace in GitHub Actions workflows is a good practice that improves maintainability and readability.
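
One way the gradual re-enabling suggested above could look is to bring rules back at warning level first. This is a hypothetical follow-up, not part of this PR; the 120-character limit and the indentation options are illustrative assumptions.

```yaml
# .yamllint (hypothetical follow-up, not part of this PR)
extends: default

rules:
  line-length:
    max: 120                    # assumed limit, chosen only for illustration
    level: warning              # reported, but does not fail the run
  indentation:
    spaces: consistent          # any width, as long as each file is consistent
    indent-sequences: whatever  # accept both indented and flush list items
    level: warning
```

Warnings leave yamllint's exit code at zero unless it is run with `--strict`, so existing files would surface the findings without blocking commits.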

log_config: True # Prints the config (after defaults have been set by pyconfig logic)
debug_sharding: False # Prints model weights sharding info
log_config: true # Prints the config (after defaults have been set by pyconfig logic)
debug_sharding: false # Prints model weights sharding info

🟢 For consistency with the new YAML boolean convention, consider updating the comments to use lowercase `true` and `false` as well.
Suggested change
debug_sharding: false # Prints model weights sharding info
profile_power_events: false # Set to true to enable TPU-specific power/thermal profiling events. Defaults to false to avoid breaking GPU xplane tracing.

Collaborator Author

done.

@igorts-git igorts-git force-pushed the igorts/clean_up_yml_files branch from 6fb3f85 to b33ec50 on May 16, 2026 00:03