Add --parallel-io CLI flag for concurrent loop execution by vlap · Pull Request #129 · uwefladrich/scriptengine

vlap · 2026-06-15T19:25:28Z

Summary

Adds --parallel-io flag to the se CLI that runs loop iterations concurrently using ThreadPoolExecutor
Same YAML scripts work unchanged — parallelism is opt-in via CLI invocation (se --parallel-io script.yml)
Adds fast-path in Jinja rendering: skips template parsing for strings without { markers
Warns when tasks return context updates inside parallel loops (they are discarded)
Includes benchmark script and two unit tests
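The Jinja fast-path from the summary can be sketched as follows. This is a minimal illustration, not the code in the PR: `render_string` and `fake_render` are hypothetical names, and `fake_render` only stands in for the real template engine. The shortcut relies on Jinja's default delimiters (`{{`, `{%`, `{#`) all starting with `{`; it would not be safe with custom delimiters.

```python
from typing import Callable, List


def render_string(string_arg: str, render: Callable[[str], str]) -> str:
    """Return string_arg unchanged when it cannot contain template syntax."""
    # With default delimiters, every Jinja construct starts with "{",
    # so a string without "{" has nothing to render.
    if "{" not in string_arg:
        return string_arg
    return render(string_arg)


# Records which strings actually reach the (stand-in) template engine.
calls: List[str] = []


def fake_render(s: str) -> str:
    """Hypothetical stand-in for the real, more expensive rendering step."""
    calls.append(s)
    return s.replace("{{ name }}", "restart.nc")


plain = render_string("inidata/weights.nc", fake_render)
templated = render_string("inidata/{{ name }}", fake_render)
```

Only the second string is passed to the renderer; the first is returned as-is, which is where the per-iteration saving would come from.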

Motivation

EC-Earth4 setup scripts contain many I/O-bound loops (copying inidata, weights, restart files). On HPC parallel filesystems like Lustre, these loops are bottlenecked by per-file latency, not bandwidth. Threading allows multiple file operations to overlap.

Benchmark results

Tested on two HPC Lustre filesystems with the included test-se-run/bench/run-bench.sh:

| Platform | base.copy | base.move |
| --- | --- | --- |
| MN5 (BSC), 10 files, 2.6 GB | 38% speedup | no benefit (same-fs rename) |
| hpc2020 (ECMWF), 10 files, 3 GB | 60% speedup | no benefit (same-fs rename) |

Design decisions

--parallel-io, not --parallel: name communicates the constraint — only I/O-bound, independent iterations benefit.
ThreadPoolExecutor: GIL is released during I/O syscalls (shutil.copy2, etc.), so threads provide real concurrency for file operations without the overhead of multiprocessing.
No context accumulation in parallel mode: loop iterations are treated as independent. A warning is logged if a task returns a context update inside a parallel loop.
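The design decisions above can be illustrated with a small sketch. It assumes a simplified model in which a task is a plain callable that returns either `None` or a dict of context updates; `run_loop_parallel`, `demo`, and the logger name are hypothetical and do not reflect ScriptEngine's actual loop implementation.

```python
import logging
import shutil
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

log = logging.getLogger("parallel_loop_sketch")


def run_loop_parallel(items, task, max_workers=4):
    """Run task(item) for every item in a thread pool.

    Context updates returned by tasks are discarded; the number of
    discarded updates is returned so callers can inspect it.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map keeps input order and re-raises a task's exception
        # when its result is retrieved.
        results = list(pool.map(task, items))
    discarded = [r for r in results if r]
    if discarded:
        log.warning(
            "Discarding %d context update(s) returned inside a parallel loop",
            len(discarded),
        )
    return len(discarded)


def demo():
    """Copy a few small files concurrently inside a temporary directory."""
    with tempfile.TemporaryDirectory() as tmp:
        src, dst = Path(tmp, "src"), Path(tmp, "dst")
        src.mkdir()
        dst.mkdir()
        for i in range(5):
            (src / f"file{i}.nc").write_bytes(b"x" * 1024)

        def copy_task(path):
            shutil.copy2(path, dst / path.name)  # I/O-bound work
            return None  # no context update

        n_discarded = run_loop_parallel(sorted(src.iterdir()), copy_task)
        return n_discarded, sorted(p.name for p in dst.iterdir())
```

Calling `demo()` copies five files and discards nothing, while a task that returns a dict for every item triggers the warning once per loop.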

Test plan

All 170 existing tests pass
Two new tests: test_parallel_loop, test_parallel_loop_with_context_var
Benchmarked on MN5 (BSC) and hpc2020 (ECMWF)

🤖 Generated with Claude Code

Enable thread-based parallelism for loop iterations via `se --parallel`. Uses ThreadPoolExecutor to run I/O-bound loop bodies (base.copy, base.move) concurrently. The same YAML scripts work unchanged — parallelism is activated purely by the CLI flag. Also adds a fast-path in Jinja rendering that skips template parsing for strings without template markers, reducing per-iteration overhead.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Self-contained bench script that measures base.copy loop speedup from thread parallelism on HPC Lustre filesystems. Tested on MN5 (BSC) and hpc2020 (ECMWF) with 51-58% speedup for I/O-bound copy loops.
Usage: bash test-se-run/bench/run-bench.sh <dir_with_nc_files> [N]
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Rename CLI flag to --parallel-io to clarify it's for I/O-bound loops
- Warn when tasks return context updates inside parallel loops (they are discarded since iterations run independently)
- Simplify and add base.move to benchmark script
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

uwefladrich · 2026-06-23T13:18:11Z

+        if "{" not in string_arg:
+            return string_arg


I think that this little change slipped in from #128 and does not really belong here, does it?

uwefladrich · 2026-06-23T13:53:07Z

Hi @vlap,

Thanks a lot for the PR! Parallelisation has been on my wish list from the start of ScriptEngine. In fact, the abstraction of the actual "engine" that runs the scripts was partly motivated by the option of implementing an advanced engine that executes tasks in parallel. However, that has never been realised.

The bottleneck you describe makes sense to me, given the number of files and characteristics of the file system. So parallelisation seems a reasonable approach.

When SE executes tasks, an important step is to update the context consistently, which is why this is done in the engine and not in the task code itself. When tasks are executed in parallel (in a loop or otherwise), the context update is not trivial. This is the same issue as shared variables in other parallel languages.

You chose to ignore all context updates from parallel tasks, which is of course one way to handle potentially conflicting updates. However, it also deprives all tasks in a parallel loop of the ability to communicate any data, conflicting or not.

Moreover, the parallelisation is controlled by a command line switch, which works script-globally, affecting all loops. This changes semantics quite heavily, in my opinion, as context updates of all tasks in all loops are suddenly lost, compared to a run without --parallel-io. At least that's my understanding of the changes. I wonder if the full set of EC-Earth run scripts would work with this feature switched on?

Last, I wonder why the command line switch is called --parallel-io. The implementation of this feature does not seem to be I/O-specific. Why not, for example, --parallel-loops?

All this being said, I honestly appreciate your work and I also realise that I should not have the final word, because you and the ECE users are using SE much more than I do (at least at the moment). So my intention is more to spark a discussion within the ECE users of SE about this. Maybe you can ping the right people.
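The semantic change raised in the review comment above can be made concrete with a small sketch. It models the context as a plain dict and a task as a function that reads the context and returns an update; all names here are hypothetical, and ScriptEngine's real context handling is richer than `dict.update`.

```python
from concurrent.futures import ThreadPoolExecutor


def task(item, context):
    """Hypothetical task: reads the context and returns an update."""
    return {"copied": context.get("copied", []) + [item]}


def run_sequential(items):
    """Sequential loop: each update is merged before the next iteration."""
    context = {}
    for item in items:
        update = task(item, context)
        if update:
            context.update(update)
    return context


def run_parallel_discarding(items):
    """Parallel loop: every iteration sees the initial context, and the
    returned updates are dropped instead of merged."""
    context = {}
    with ThreadPoolExecutor(max_workers=2) as pool:
        list(pool.map(lambda item: task(item, context), items))
    return context


sequential_result = run_sequential(["a.nc", "b.nc"])
parallel_result = run_parallel_discarding(["a.nc", "b.nc"])
```

Under these assumptions the sequential run ends with both file names accumulated in the context, whereas the parallel run leaves the context empty, so any script that relies on loop-carried context would behave differently once the flag is switched on.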

vlap and others added 3 commits June 15, 2026 20:54

vlap changed the title ~~Add --parallel CLI flag for concurrent loop execution~~ Add --parallel-io CLI flag for concurrent loop execution Jun 15, 2026

uwefladrich reviewed Jun 23, 2026

vlap wants to merge 3 commits into uwefladrich:master from vlap:perf/parallel-loops
