
Staging pt_br TN to main#421

Open
mgrafu wants to merge 5 commits into main from staging/pt-br_tn

mgrafu (Collaborator) commented May 1, 2026

What does this PR do?

Add a one-line overview of what this PR aims to accomplish.

Before your PR is "Ready for review"

Pre checks:

  • Have you signed your commits? Use git commit -s to sign.
  • Do all unit tests finish successfully before sending the PR?
    1. pytest, or pytest --cpu from the root folder if your machine does not have a GPU (provided you marked your test cases accordingly with @pytest.mark.run_only_on('CPU')).
    2. Sparrowhawk tests: bash tools/text_processing_deployment/export_grammars.sh --MODE=test ...
  • If you are adding a new feature: have you added test cases for both pytest and Sparrowhawk here?
  • Have you added __init__.py for every folder and subfolder, including data folder which has .TSV files?
  • Have you followed the CodeQL results and removed unused variables and imports (the report is at the bottom of the PR in the GitHub review box)?
  • Have you added the correct license header Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved. to all newly added Python files?
  • If you copied nemo_text_processing/text_normalization/en/graph_utils.py your header's second line should be Copyright 2015 and onwards Google, Inc.. See an example here.
  • Remove import guards (try import: ... except: ...) if not already done.
  • If you added a new language or a new feature, please update the NeMo documentation (it lives in a different repo).
  • Have you added your language support to tools/text_processing_deployment/pynini_export.py?

PR Type:

  • New Feature
  • Bugfix
  • Documentation
  • Test

If you haven't finished some of the above items, you can still open a "Draft" PR.

folivoramanh and others added 5 commits April 15, 2026 12:56
…raction (#403)

* Add Portuguese (PT) text normalization: cardinal, ordinal, decimal, fraction

Signed-off-by: Mai Anh <palasek182@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Mai Anh <palasek182@gmail.com>

* date and time semiotic classes

Signed-off-by: Mai Anh <palasek182@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Mai Anh <palasek182@gmail.com>

* update sh files

Signed-off-by: Mai Anh <palasek182@gmail.com>

* Update Portuguese text normalization tutorial with enhanced examples and outputs

- Changed the language parameter in the Normalizer instance from 'en' to 'pt'.
- Added detailed output examples for the normalizer's methods, including documentation for `__doc__` and `normalize()`.
- Updated example input string to reflect a more complex Portuguese sentence for normalization.
- Adjusted execution counts for code cells to ensure proper order of execution.

This update aims to improve the clarity and usability of the tutorial for Portuguese text normalization.

Signed-off-by: Mai Anh <palasek182@gmail.com>

* remove currently unused file

Signed-off-by: Mai Anh <palasek182@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update minor update and punct

Signed-off-by: Mai Anh <palasek182@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Mai Anh <palasek182@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* PT TN: money, measure, telephone, electronic

Adds semiotic classes and tests on top of staging/pt-br_tn; includes
cardinal fix for X00 + 01–09 and Sparrowhawk script updates.

Signed-off-by: Mai Anh <palasek182@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix bugs based on review cardinal, fraction, money, measure

Signed-off-by: Mai Anh <palasek182@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* modify test case time

Signed-off-by: Mai Anh <palasek182@gmail.com>

* modify with mariana's review

Signed-off-by: Mai Anh <palasek182@gmail.com>

---------

Signed-off-by: Mai Anh <palasek182@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com>
Signed-off-by: Mariana <47233618+mgrafu@users.noreply.github.com>
Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com>
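
Note on the "X00 + 01–09" cardinal case mentioned in the commit log above: the log does not say what the fix changed, so the following is only a plain-Python sketch of the spoken forms that case covers in Portuguese (a hundreds word joined to a single unit with "e", with "cem" for exactly 100). The function name and lookup tables are hypothetical and are not taken from the PR, which implements its grammars with pynini.

```python
# Hypothetical illustration only; this is not the pynini grammar from the PR.
HUNDREDS = {
    1: "cento", 2: "duzentos", 3: "trezentos", 4: "quatrocentos",
    5: "quinhentos", 6: "seiscentos", 7: "setecentos",
    8: "oitocentos", 9: "novecentos",
}
UNITS = {
    1: "um", 2: "dois", 3: "três", 4: "quatro", 5: "cinco",
    6: "seis", 7: "sete", 8: "oito", 9: "nove",
}


def verbalize_hundred_plus_unit(n: int) -> str:
    """Spell out a number of the form X00..X09 (X in 1..9) in Portuguese."""
    hundreds, rest = divmod(n, 100)
    if not 1 <= hundreds <= 9 or rest > 9:
        raise ValueError(f"expected a number of the form X00..X09, got {n}")
    if rest == 0:
        # Exactly 100 is "cem"; "cento" is only used when more follows.
        return "cem" if hundreds == 1 else HUNDREDS[hundreds]
    # Zero tens digit: the hundreds word is joined directly to the unit with "e".
    return f"{HUNDREDS[hundreds]} e {UNITS[rest]}"


if __name__ == "__main__":
    for value in (100, 101, 205, 909):
        print(value, verbalize_hundred_plus_unit(value))
```

Inputs such as 101, 205, or 909 are the kind of values worth covering in test cases for this class, since the tens position is empty and the connector must still appear.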
2 participants