feat(assets): add namespaced model_type tags and align tag semantics#14511
feat(assets): add namespaced model_type tags and align tag semantics#14511synap5e wants to merge 2 commits into
Conversation
4a757fc to
4340337
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 4340337c69
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughThis PR makes asset tags case-sensitive throughout the stack. A migration removes the lowercase-only 🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tests-unit/assets_test/test_prune_orphaned_assets.py (1)
32-40:⚠️ Potential issue | 🟠 Major | ⚡ Quick winScope is no longer applied in
find_asset, making these assertions cross-test ambiguous.Line 32 drops scope-based narrowing, but callers still pass
scopeand use generic names (input.bin/output.bin). That can match unrelated assets and produce false pass/fail outcomes.Suggested hardening
def find_asset(http: requests.Session, api_base: str): """Query API for assets matching scope and optional name.""" - def _find(scope: str, name: str | None = None) -> list[dict]: + def _find(_scope: str, name: str | None = None) -> list[dict]: params = {"limit": "500"} if name: params["name_contains"] = name @@ def test_prune_across_multiple_roots( @@ scope = f"multi-{uuid.uuid4().hex[:6]}" - input_fp = create_seed_file("input", scope, "input.bin") - create_seed_file("output", scope, "output.bin") + input_name = f"{scope}-input.bin" + output_name = f"{scope}-output.bin" + input_fp = create_seed_file("input", scope, input_name) + create_seed_file("output", scope, output_name) trigger_sync_seed_assets(http, api_base) - assert find_asset(scope, input_fp.name) - assert find_asset(scope, "output.bin") + assert find_asset(scope, input_name) + assert find_asset(scope, output_name) @@ - assert not find_asset(scope, input_fp.name) - assert find_asset(scope, "output.bin") + assert not find_asset(scope, input_name) + assert find_asset(scope, output_name)Also applies to: 115-122
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests-unit/assets_test/test_prune_orphaned_assets.py` around lines 32 - 40, The find_asset function no longer applies scope-based filtering when retrieving assets, which causes test assertions to be ambiguous when multiple tests use generic asset names like input.bin or output.bin. Modify the find_asset function to accept and apply scope as a query parameter in the API request (similar to how name_contains is currently handled), ensuring that the returned assets are filtered by both scope and name to prevent false matches across different test scopes.
🧹 Nitpick comments (1)
alembic_db/versions/0005_allow_case_sensitive_tags.py (1)
23-33: ⚡ Quick winSQLite foreign key handling may leave FKs disabled on error.
If any statement between
PRAGMA foreign_keys=OFF(line 23) andPRAGMA foreign_keys=ON(line 33) raises an exception, foreign keys remain disabled for the connection. Consider wrapping in a try/finally or using a transaction to ensure cleanup.🛡️ Suggested safer pattern
if bind.dialect.name == "sqlite": - op.execute("PRAGMA foreign_keys=OFF") - op.execute( - "CREATE TABLE tags_new (" - "name VARCHAR(512) NOT NULL, " - "CONSTRAINT pk_tags PRIMARY KEY (name)" - ")" - ) - op.execute("INSERT INTO tags_new(name) SELECT name FROM tags") - op.execute("DROP TABLE tags") - op.execute("ALTER TABLE tags_new RENAME TO tags") - op.execute("PRAGMA foreign_keys=ON") - return + try: + op.execute("PRAGMA foreign_keys=OFF") + op.execute( + "CREATE TABLE tags_new (" + "name VARCHAR(512) NOT NULL, " + "CONSTRAINT pk_tags PRIMARY KEY (name)" + ")" + ) + op.execute("INSERT INTO tags_new(name) SELECT name FROM tags") + op.execute("DROP TABLE tags") + op.execute("ALTER TABLE tags_new RENAME TO tags") + finally: + op.execute("PRAGMA foreign_keys=ON") + return🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@alembic_db/versions/0005_allow_case_sensitive_tags.py` around lines 23 - 33, The migration code disables foreign keys at the start with `PRAGMA foreign_keys=OFF` but if any of the subsequent `op.execute()` calls (such as CREATE TABLE tags_new, INSERT INTO tags_new, DROP TABLE tags, or ALTER TABLE tags_new RENAME) raise an exception, the final `PRAGMA foreign_keys=ON` statement will not execute, leaving foreign keys permanently disabled for the connection. Wrap the middle statements (the table creation, data insertion, and table swap operations) in a try/finally block to ensure that `PRAGMA foreign_keys=ON` is always executed regardless of whether an error occurs during the migration steps.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@alembic_db/versions/0005_allow_case_sensitive_tags.py`:
- Around line 42-51: The downgrade path uses SQLite-specific SQL syntax that
will fail on PostgreSQL and MySQL databases. The operations in the downgrade
function including INSERT OR IGNORE and rowid references are not
database-agnostic. Wrap these SQLite-specific op.execute calls (the INSERT OR
IGNORE statement and the DELETE with rowid usage) in a conditional check for
SQLite dialect similar to the dialect check pattern used elsewhere in the
migration file, or rewrite the operations using portable SQL syntax that handles
both INSERT OR IGNORE (for SQLite), INSERT ... ON CONFLICT DO NOTHING (for
PostgreSQL), and INSERT IGNORE (for MySQL), and replace rowid references with
database-appropriate alternatives like ctid for PostgreSQL or proper primary key
handling.
---
Outside diff comments:
In `@tests-unit/assets_test/test_prune_orphaned_assets.py`:
- Around line 32-40: The find_asset function no longer applies scope-based
filtering when retrieving assets, which causes test assertions to be ambiguous
when multiple tests use generic asset names like input.bin or output.bin. Modify
the find_asset function to accept and apply scope as a query parameter in the
API request (similar to how name_contains is currently handled), ensuring that
the returned assets are filtered by both scope and name to prevent false matches
across different test scopes.
---
Nitpick comments:
In `@alembic_db/versions/0005_allow_case_sensitive_tags.py`:
- Around line 23-33: The migration code disables foreign keys at the start with
`PRAGMA foreign_keys=OFF` but if any of the subsequent `op.execute()` calls
(such as CREATE TABLE tags_new, INSERT INTO tags_new, DROP TABLE tags, or ALTER
TABLE tags_new RENAME) raise an exception, the final `PRAGMA foreign_keys=ON`
statement will not execute, leaving foreign keys permanently disabled for the
connection. Wrap the middle statements (the table creation, data insertion, and
table swap operations) in a try/finally block to ensure that `PRAGMA
foreign_keys=ON` is always executed regardless of whether an error occurs during
the migration steps.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: c1776315-8449-45c8-acf4-3f40bb5ffd5b
📒 Files selected for processing (26)
alembic_db/versions/0005_allow_case_sensitive_tags.pyapp/assets/api/routes.pyapp/assets/api/schemas_in.pyapp/assets/api/upload.pyapp/assets/database/queries/tags.pyapp/assets/helpers.pyapp/assets/services/ingest.pyapp/assets/services/path_utils.pycomfy_api/feature_flags.pyopenapi.yamlserver.pytests-unit/assets_test/conftest.pytests-unit/assets_test/queries/test_asset_info.pytests-unit/assets_test/queries/test_tags.pytests-unit/assets_test/services/test_path_utils.pytests-unit/assets_test/test_assets_missing_sync.pytests-unit/assets_test/test_crud.pytests-unit/assets_test/test_downloads.pytests-unit/assets_test/test_list_cursor.pytests-unit/assets_test/test_list_filter.pytests-unit/assets_test/test_metadata_filters.pytests-unit/assets_test/test_prune_orphaned_assets.pytests-unit/assets_test/test_tags_api.pytests-unit/assets_test/test_uploads.pytests-unit/feature_flags_test.pytests-unit/websocket_feature_flags_test.py
cb67c0e to
45cf933
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@app/assets/services/ingest.py`:
- Around line 105-109: The tag persistence flow in ingest should separate
caller-provided tags from backend-derived ones: in the path that uses
get_path_derived_tags_from_path(locator) and adds the injected "uploaded" tag,
keep user labels with tag_origin="manual" but persist backend-generated tags
(including input/output/models/model_type:* and uploaded) with a backend/system
origin instead. Update the write path around normalize_tags and the related tag
creation logic so the origin is assigned based on whether the tag came from the
caller or from trusted server-side state, and apply the same split in the other
affected tag-persisting block as well.
In `@openapi.yaml`:
- Line 1639: The multipart upload description in the API contract still mentions
the hash-only flow, which does not belong on the POST /api/assets upload path
because that endpoint requires file content. Update the description text in
openapi.yaml where the asset upload schema is documented to remove
hash-only/deduplicated reference creation wording, and keep that guidance only
on the /api/assets/from-hash contract so the multipart upload behavior matches
the endpoint’s actual inputs.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: 8410f72b-69a8-4865-b6ec-48c181086096
📒 Files selected for processing (9)
app/assets/api/routes.pyapp/assets/api/schemas_in.pyapp/assets/api/upload.pyapp/assets/services/ingest.pyapp/assets/services/path_utils.pyopenapi.yamlserver.pytests-unit/assets_test/services/test_path_utils.pytests-unit/assets_test/test_uploads.py
💤 Files with no reviewable changes (3)
- app/assets/api/upload.py
- app/assets/api/schemas_in.py
- app/assets/api/routes.py
🚧 Files skipped from review as they are similar to previous changes (1)
- server.py
45cf933 to
df91e4b
Compare
Amp-Thread-ID: https://ampcode.com/threads/T-019ecf39-2e6f-747d-ae80-addba6b8e4f5 Co-authored-by: Amp <amp@ampcode.com>
662b09f to
44b3239
Compare
Amp-Thread-ID: https://ampcode.com/threads/T-019ecf39-2e6f-747d-ae80-addba6b8e4f5 Co-authored-by: Amp <amp@ampcode.com>
|
@guill per feedback I updated the PR:
Also:
|
This PR reworks asset tags so they are no longer overloaded as model classification, upload routing, and filesystem subdirectory instructions at the same time.
It introduces namespaced
model_type:<folder_name>tags for filesystem-backed model classification, preserves caller-provided tags as relaxed labels, and changes multipart asset upload routing so extra tags no longer create path components.Note: This PR does not migrate assets tagged by the previous version of the asset system. Existing flat tags such as
checkpointsremain as labels. A migration/backfill is possible as a follow-up if we want one.What this does
models + model_type:<folder_name>instead of flat labels such asmodels + checkpointsormodels + loras.input,output, ormodels). Model uploads additionally require exactly onemodel_type:<folder_name>tag. Extra tags are labels only and do not create subdirectories.file_pathanddisplay_namestorage locator fields for filesystem-backed assets./upload/imagecontinues to use its existingtypeandsubfolderform fields for storage and response compatibility. Only exact known image subfolder values (pasted,painter,webcam,threed,3d) are mirrored as asset tags.The backend-generated model classification shape is:
models: the asset has a real filesystem path under at least one allowed registered ComfyUI model folder.model_type:<folder_name>: the asset path is under the registered model folder named<folder_name>, preserving the registered folder name casing.Examples:
models + model_type:checkpoints,models + model_type:loras,models + model_type:LLM.Backend-generated classification is added only from trusted filesystem facts, such as a scanner-discovered path, an already-written
/upload/imagefile, or the final destination of a multipart byte upload. Tag filters still operate on the single persisted tag set, so once a tag exists it is filterable regardless of whether it originated from a caller or from backend classification.Motivation
The old asset tag behavior was not just underspecified; it produced wrong or misleading results for real ComfyUI model configurations.
folder_paths.folder_names_and_paths, and a single model type can have registered paths outside the stock<models>/<folder>layout. The old path-derived tags treated the registered folder name and parent path fragments as ordinary flat tags such ascheckpoints,loras, or arbitrary subfolder names. That meant the API did not have a stable way to say “this asset belongs to the registeredcheckpointsmodel type” versus “this asset merely has a caller/path label namedcheckpoints.” Namespaced tags make the model type explicit:models + model_type:<folder_name>.folder_names_and_pathsalso contains entries that are not safe model upload targets. In particular,custom_nodesandconfigsshould not become places where asset upload requests can write user bytes just because they are registered folder names. This PR excludes those names from model classification/upload destination resolution.output + models + model_type:*when both root and model-folder membership are true.models + checkpoints + foo + barboth a classifier and a filesystem placement instruction. This PR keeps routing narrow: new byte uploads use exactly one destination role (input,models, oroutput), model uploads use exactly onemodel_type:<folder_name>, and all other tags remain labels.This PR also changes location semantics for new multipart
/api/assetsbyte uploads. Previously, ordered tags described the destination root, model category, and nested filesystem subdirectories: for example,input + foowrote underinput/foo, andmodels + checkpoints + foo + barwrote under a checkpoints folder plusfoo/bar. In this PR, multipart upload location is selected only by the destination role and optionalmodel_type:<folder_name>for model uploads. Extra tags no longer create path components, and this PR does not add a/api/assetssubfolderfield.That location change is specific to multipart
/api/assetsbyte uploads./upload/imagekeeps its existing placement semantics: it still usestypeandsubfolderform fields to choose where the image is written. Known/upload/imagesubfolders such aspasted,painter,webcam,threed, and3dare mirrored as asset tags when the already-written file is registered.Examples
File under a registered checkpoint folder:
models,model_type:checkpointsfile_path: storage locator such asmodels/checkpoints/foo.safetensorsoroutput/checkpoints/foo.safetensorsinclude_tags=models,model_type:checkpointsFile under a case-preserving registered model folder named
LLM:models,model_type:LLMmodel_type:LLMandmodel_type:llmare distinct tagsMultipart byte upload with tags
models + model_type:checkpoints:uploaded,models, andmodel_type:checkpointsMultipart byte upload with tags
input + model_type:checkpoints:model_type:checkpointsas a labelMultipart byte upload with tags
models + model_type:checkpoints + foo + bar:fooandbaras labelsfoo/barsubdirectories/api/assets/from-hashwith tagsmodels + model_type:checkpoints:/upload/imagewithtype=input&subfolder=pasted:input/pasted/<filename>using the existing image-upload fieldsinput,uploaded, and the knownpastedtag/upload/imagewithtype=input&subfolder=custom/session:input/custom/session/<filename>input + uploadedcustomorsessiontagsChanges
tags, with a downgrade path that lowercases/merges mixed-case tags before restoring the old constraint.inputoutputtempmodelsmodel_type:<folder_name>input/pasted,input/painter,input/webcam,input/threed, andinput/3d.checkpoints,loras, or arbitrary parent directory names from model paths.input,models, oroutput.model_type:<folder_name>tag for new multipart model uploads./api/assetsfree of a newsubfolderfield; nested multipart placement is not part of this PR.configsandcustom_nodesas model upload destinations.uploadedto successful byte upload registrations, including already-written/upload/imageregistrations./upload/imagesubfolders (pasted,painter,webcam,threed,3d) as tags when registering those files as assets.uploaded, or copy path-derived classification.file_path/display_nameresponse locator fields.supports_model_type_tags: truein server feature flags for frontend capability detection./api/assetsschema behavior, and/upload/imagesubfolder compatibility.Behavior / compatibility notes
modelstag matchesinclude_tags=modelsregardless of whether it was supplied by a caller or generated from a trusted path.model_type:<folder_name>to a filesystem-backed model asset, that follow-up moves/re-registers the file so the label and loader-visible location stay coherent.output/foo/bar.pngis classified asoutput, notoutput + foo. Direct files under known image-input directories are the narrow exception, e.g.input/pasted/foo.pnggetsinput + pasted./api/assetssubfolderplacement field./upload/imageplacement is not changed: it continues to use itstypeandsubfolderform fields rather than asset tags. Known image-upload subfolder names are also stored as tags for filtering.checkpointsare not migrated intomodel_type:checkpoints; clients that need transition compatibility should handle both shapes.output + models + model_type:*; persisted tags are not dynamically recomputed at response time.Verification
uv run ruff check app/assets/api/schemas_in.py app/assets/api/upload.py app/assets/api/routes.py app/assets/services/ingest.py app/assets/services/path_utils.py server.py tests-unit/assets_test/test_uploads.py tests-unit/assets_test/services/test_path_utils.pyuv run --with-requirements requirements.txt --with-requirements tests-unit/requirements.txt pytest tests-unit/assets_test/services/test_path_utils.py tests-unit/assets_test/test_uploads.py -q71 passeduv run --with pyyaml python - <<'PY' ...file_path/display_nameand update responses referenceAsset