feature/SOF-7886 update: Add surface energy NB#299
Conversation
📝 Walkthrough

New Jupyter notebook adds an end-to-end Surface Energy DFT workflow for Mat3ra: browser authentication, slab loading, bulk resolution with multi-step fallbacks (IDs/hashes/recompute), optional bulk creation, check/use of refined bulk total_energy, workflow assembly, job submission, polling, and result visualization.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Browser as Browser/Client
    participant APIClient as APIClient
    participant Platform as Mat3ra Platform
    participant Compute as Compute Cluster
    participant MaterialsDB as Materials DB
    Browser->>Browser: Install JupyterLite deps (if emscripten)
    Browser->>APIClient: authenticate()
    APIClient->>Platform: OAuth / token exchange
    Platform-->>APIClient: Auth token
    Browser->>APIClient: select account & project
    APIClient->>Platform: query accounts/projects
    Platform-->>APIClient: account/project list
    Browser->>Browser: load slab from uploads or query Platform
    Browser->>MaterialsDB: extract slab metadata (bulk/crystal ref)
    Browser->>Browser: resolve exact bulk (id → scaledHash → hash → recompute)
    Browser->>Platform: verify or create exact bulk
    Platform-->>Browser: bulk record (existing or created)
    Browser->>Platform: query refined-properties for bulk total_energy (qe:*)
    Platform-->>Browser: refined total_energy or none
    alt refined total_energy missing & no auto-run
        Browser->>Browser: raise error / instruct to run Total Energy notebook
    else refined total_energy available
        Browser->>Browser: load Surface Energy workflow (Standata)
        Browser->>Browser: apply QE edits (optional)
        Browser->>APIClient: save workflow & create compute config
        Browser->>APIClient: create surface job (slab + workflow)
        APIClient->>Compute: submit job
        Compute->>Compute: run calculations
        Compute-->>Platform: store surface_energy and slab total_energy properties
        Platform-->>Browser: job results
    end
    Browser->>Browser: visualize slab and surface_energy
```
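The bulk-resolution fallback chain in the diagram (id → scaledHash → hash → recompute) can be sketched as plain Python. Note the lookup callables and metadata keys below are illustrative stand-ins, not the notebook's actual API:

```python
# Illustrative sketch of the id -> scaledHash -> hash -> recompute fallback.
# The lookup callables and metadata keys are hypothetical stand-ins.
def resolve_bulk(slab_metadata, find_by_id, find_by_scaled_hash, find_by_hash, recompute):
    """Try each resolution strategy in order and return the first bulk found."""
    strategies = (
        (find_by_id, "bulkId"),
        (find_by_scaled_hash, "scaledHash"),
        (find_by_hash, "hash"),
    )
    for lookup, key in strategies:
        value = slab_metadata.get(key)
        if value is not None:
            bulk = lookup(value)
            if bulk is not None:
                return bulk
    # Last resort: rebuild the exact bulk from the slab itself.
    return recompute(slab_metadata)
```

Each strategy is only attempted when the slab carries the corresponding metadata key, so a stale `bulkId` falls through to the hash-based lookups before triggering a recompute.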
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks: ✅ 3 passed
Actionable comments posted: 4
🧹 Nitpick comments (3)
other/materials_designer/workflows/surface_energy.ipynb (3)
527-528: Nit: missing blank line between `build_qe_model` and `apply_qe_settings_to_workflow`. PEP 8 expects two blank lines between top-level function defs.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@other/materials_designer/workflows/surface_energy.ipynb` around lines 527 - 528, The top-level function definitions lack the required separation: insert an extra blank line between the end of build_qe_model and the start of apply_qe_settings_to_workflow so there are two blank lines between these top-level functions (ensure build_qe_model and apply_qe_settings_to_workflow are separated by the standard PEP8 two-line gap).
587-596: Duplicated cluster-selection logic between bulk and surface jobs. The cluster lookup in lines 587–592 is identical to the one in cell 37 (lines 717–722). Extract a small helper (e.g. `pick_cluster(clusters, name)`) and reuse, so the two paths cannot drift apart.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@other/materials_designer/workflows/surface_energy.ipynb` around lines 587 - 596, The cluster-selection logic that builds bulk_cluster from bulk_clusters using CLUSTER_NAME is duplicated; extract a helper function (e.g., pick_cluster(clusters, name)) that encapsulates the branching (if name -> find by hostname substring else -> first cluster) and return None if not found, then replace the duplicated blocks in both the bulk path (where bulk_clusters and bulk_cluster are used to create Compute(cluster=bulk_cluster, ...)) and the surface path with calls to pick_cluster; ensure callers handle a None return or raise a clear error and keep references to Compute, CLUSTER_NAME, QUEUE_NAME, and PPN unchanged.
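A minimal version of the suggested helper might look like this, assuming cluster records are dicts with a "hostname" key as in the notebook snippets:

```python
def pick_cluster(clusters, name=None):
    """Return the cluster whose hostname contains `name`, else the first cluster.

    Returns None when nothing matches so callers can raise a clear error.
    """
    if not clusters:
        return None
    if name:
        return next((c for c in clusters if name in c["hostname"]), None)
    return clusters[0]
```

Both the bulk and surface paths would then call `pick_cluster(clusters, CLUSTER_NAME)` and raise if it returns `None`.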
552-561: Minor: cell preserves `bulk_workflow`/`saved_bulk_workflow` only when `bulk_total_energy_property is None`, but later code assumes they're set in the same conditional branch — keep them strictly co-located.

Currently `saved_bulk_workflow = None` is initialized in the next cell (line 583) instead of here, while `bulk_workflow = None` is set here. Initialization is split across two cells, which is fragile if cells are re-run out of order. Consider initializing both in the same cell where they are first introduced (this cell) to make execution-order assumptions explicit.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@other/materials_designer/workflows/surface_energy.ipynb` around lines 552 - 561, The code initializes bulk_workflow here but leaves saved_bulk_workflow initialized in a later cell, which makes re-running cells fragile; initialize saved_bulk_workflow alongside bulk_workflow in this same cell (i.e., set saved_bulk_workflow = None where bulk_workflow = None and only create/assign both inside the bulk_total_energy_property is None branch) so both variables are co-located and their lifetimes are explicit when using bulk_total_energy_property, bulk_workflow, and saved_bulk_workflow.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@other/materials_designer/workflows/surface_energy.ipynb`:
- Around line 819-832: The final summary unconditionally indexes
bulk_total_energy_property['data']['value'], which will raise if
bulk_total_energy_property is None; update the final prints to guard that access
(e.g., check if bulk_total_energy_property is truthy before referencing
['data']['value'] or print a safe fallback like "N/A"), or move the exact bulk
total energy print next to the earlier validation that already ensures
bulk_total_energy_property is non-None; ensure you reference the
bulk_total_energy_property variable and keep the other prints unchanged.
- Around line 747-756: The create_job call is using the local unsaved
surface_workflow instead of the saved platform workflow; replace the workflow
argument in create_job from surface_workflow to saved_surface_workflow so the
job uses the persisted workflow (with its platform _id). Specifically update the
call that constructs surface_job_response (which currently passes
workflow=surface_workflow) to pass workflow=saved_surface_workflow while leaving
other args (materials=[saved_slab], project_id, owner_id, prefix, compute)
unchanged.
- Around line 428-444: The print logic after get_or_create_material is
misleading when UPDATE_SLAB_BULK_ID is False: if current_bulk_id !=
bulk_material_response["_id"] the code should report that an update was
suppressed rather than "already points". Modify the branch around
UPDATE_SLAB_BULK_ID/current_bulk_id to first check equality (current_bulk_id ==
bulk_material_response["_id"]) and print the "already up-to-date" message only
in that case; if they differ and UPDATE_SLAB_BULK_ID is False, print a clear
"update suppressed" message referencing current_bulk_id and
bulk_material_response["_id"]; keep the existing
client.materials.update/saved_slab_response refresh behavior when
UPDATE_SLAB_BULK_ID is True.
- Around line 598-611: The bulk job is being created with the local unsaved
workflow (bulk_workflow) instead of the persisted workflow returned by
get_or_create_workflow; update the create_job call to pass the saved workflow
(saved_bulk_workflow) or its identifier (saved_bulk_workflow.id) in place of
workflow=bulk_workflow so the job references the platform-saved Workflow; also
scan for other create_job or job submission calls in this flow (e.g.,
bulk_job_response = create_job(...)) and replace any remaining bulk_workflow
usages with saved_bulk_workflow to ensure the saved instance is used
consistently.
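The guard suggested in the first inline comment above (for the final-summary access of `bulk_total_energy_property`) could be sketched as a small accessor; the helper name and the "N/A" fallback are illustrative, not code from the notebook:

```python
def format_total_energy(prop):
    """Format a refined total_energy property, falling back to 'N/A' when absent.

    `prop` is expected to be None or a dict shaped like {"data": {"value": ...}}.
    """
    if prop and (prop.get("data") or {}).get("value") is not None:
        return str(prop["data"]["value"])
    return "N/A"
```

The summary cell would then print `format_total_energy(bulk_total_energy_property)` instead of indexing `['data']['value']` unconditionally.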
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 5e0daed5-e010-4593-962f-c3f6bc491d27
📒 Files selected for processing (1)
other/materials_designer/workflows/surface_energy.ipynb
```python
saved_slab_response = get_or_create_material(client, slab, ACCOUNT_ID)

current_bulk_id = (saved_slab_response.get("metadata") or {}).get("bulkId")
if UPDATE_SLAB_BULK_ID and current_bulk_id != bulk_material_response["_id"]:
    client.materials.update(
        saved_slab_response["_id"],
        {"$set": {"metadata.bulkId": bulk_material_response["_id"]}},
    )
    saved_slab_response = client.materials.get(saved_slab_response["_id"])
    print(
        f"✅ Updated slab metadata.bulkId: {saved_slab_response['_id']} -> "
        f"{saved_slab_response.get('metadata', {}).get('bulkId')}"
    )
else:
    print(f"♻️ Slab already points to bulkId={current_bulk_id}")

saved_slab = Material.create(saved_slab_response)
```
UPDATE_SLAB_BULK_ID=False path still calls get_or_create_material and prints a misleading message.
When UPDATE_SLAB_BULK_ID=False but current_bulk_id already differs from the resolved bulk, the user sees "♻️ Slab already points to bulkId=<old>" even though it does not point to the bulk this notebook is about to use. Consider distinguishing the two cases: (a) update suppressed by flag, (b) already up-to-date.
🛠 Suggested fix

```diff
-else:
-    print(f"♻️ Slab already points to bulkId={current_bulk_id}")
+elif current_bulk_id == bulk_material_response["_id"]:
+    print(f"♻️ Slab already points to bulkId={current_bulk_id}")
+else:
+    print(
+        f"⚠️ UPDATE_SLAB_BULK_ID is False; slab.metadata.bulkId={current_bulk_id} "
+        f"differs from exact bulk {bulk_material_response['_id']}"
+    )
```
```python
    saved_bulk_workflow_response = get_or_create_workflow(client, bulk_workflow, ACCOUNT_ID)
    saved_bulk_workflow = Workflow.create(saved_bulk_workflow_response)
    print(f"Bulk workflow ID: {saved_bulk_workflow.id}")

    bulk_job_name = f"{MY_BULK_TOTAL_ENERGY_WORKFLOW_NAME} {bulk_material.formula} {timestamp}"
    bulk_job_response = create_job(
        api_client=client,
        materials=[bulk_material],
        workflow=bulk_workflow,
        project_id=project_id,
        owner_id=ACCOUNT_ID,
        prefix=bulk_job_name,
        compute=bulk_compute.to_dict(),
    )
```
Bulk job is created with the unsaved local workflow instead of the saved one.
get_or_create_workflow returns the platform-saved workflow, which you wrap as saved_bulk_workflow (with a real _id), but create_job is then called with workflow=bulk_workflow — the locally-built instance that hasn't been persisted. Depending on how api_client.jobs.create handles a workflow payload without _id, this can either fail outright or silently create/submit a job whose workflow reference is not the one you saved (and that other users/dashboards can see). Use the saved instance.
🛠 Suggested fix

```diff
     saved_bulk_workflow_response = get_or_create_workflow(client, bulk_workflow, ACCOUNT_ID)
     saved_bulk_workflow = Workflow.create(saved_bulk_workflow_response)
     print(f"Bulk workflow ID: {saved_bulk_workflow.id}")

     bulk_job_name = f"{MY_BULK_TOTAL_ENERGY_WORKFLOW_NAME} {bulk_material.formula} {timestamp}"
     bulk_job_response = create_job(
         api_client=client,
         materials=[bulk_material],
-        workflow=bulk_workflow,
+        workflow=saved_bulk_workflow,
```
```python
surface_job_name = f"{MY_SURFACE_WORKFLOW_NAME} {saved_slab.formula} {timestamp}"
surface_job_response = create_job(
    api_client=client,
    materials=[saved_slab],
    workflow=surface_workflow,
    project_id=project_id,
    owner_id=ACCOUNT_ID,
    prefix=surface_job_name,
    compute=compute.to_dict(),
)
```
Surface job has the same issue: pass saved_surface_workflow, not surface_workflow.
saved_surface_workflow is created in cell 33 but never used; create_job here is invoked with the unsaved local surface_workflow, which won't carry the platform _id. Mirror the bulk-job fix.
🛠 Suggested fix

```diff
 surface_job_response = create_job(
     api_client=client,
     materials=[saved_slab],
-    workflow=surface_workflow,
+    workflow=saved_surface_workflow,
     project_id=project_id,
     owner_id=ACCOUNT_ID,
     prefix=surface_job_name,
     compute=compute.to_dict(),
 )
```
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@other/materials_designer/workflows/surface_energy.ipynb`:
- Around line 580-585: When CLUSTER_NAME is provided but next(...) returns None,
cluster becomes None and later Compute(cluster=cluster, ...) or access to
compute.cluster.hostname will raise a vague error; update the logic after
determining cluster (the block that assigns cluster using CLUSTER_NAME and
clusters) to detect cluster is None and raise an explicit exception (e.g.,
ValueError) that includes the provided CLUSTER_NAME and the list of available
hostnames extracted from clusters so callers see a clear message; ensure this
check happens before creating Compute(cluster=cluster, queue=QUEUE_NAME,
ppn=PPN) or referencing compute.cluster.hostname.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 67cf3234-e9d0-4778-90b0-2434f94e2125
📒 Files selected for processing (1)
other/materials_designer/workflows/surface_energy.ipynb
```python
if CLUSTER_NAME:
    cluster = next((c for c in clusters if CLUSTER_NAME in c["hostname"]), None)
else:
    cluster = clusters[0]

compute = Compute(cluster=cluster, queue=QUEUE_NAME, ppn=PPN)
```
cluster can be None when CLUSTER_NAME does not match any cluster.
If CLUSTER_NAME is set but next(...) finds no match, cluster becomes None and the subsequent Compute(cluster=cluster, ...) / compute.cluster.hostname will fail with a confusing error. Consider raising an explicit error with the available hostnames to aid debugging.
```diff
 if CLUSTER_NAME:
     cluster = next((c for c in clusters if CLUSTER_NAME in c["hostname"]), None)
+    if cluster is None:
+        raise ValueError(
+            f"No cluster matches CLUSTER_NAME={CLUSTER_NAME!r}. "
+            f"Available: {[c['hostname'] for c in clusters]}"
+        )
 else:
     cluster = clusters[0]
```
cluster = clusters[0]🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@other/materials_designer/workflows/surface_energy.ipynb` around lines 580 -
585, When CLUSTER_NAME is provided but next(...) returns None, cluster becomes
None and later Compute(cluster=cluster, ...) or access to
compute.cluster.hostname will raise a vague error; update the logic after
determining cluster (the block that assigns cluster using CLUSTER_NAME and
clusters) to detect cluster is None and raise an explicit exception (e.g.,
ValueError) that includes the provided CLUSTER_NAME and the list of available
hostnames extracted from clusters so callers see a clear message; ensure this
check happens before creating Compute(cluster=cluster, queue=QUEUE_NAME,
ppn=PPN) or referencing compute.cluster.hostname.