Avoid host round-trips for same-device backend copies #117

Merged: polvalente merged 2 commits into elixir-nx:main from hfiguera:same-device-backend-copy-fast-path on Jun 18, 2026

@hfiguera

Contributor

Summary

This adds a same-device fast path for Nx.backend_copy/3 when copying EMLX tensors back to EMLX.Backend.

Previously, same-device EMLX copies went through the generic binary path, which materializes tensor data through the host before rebuilding an EMLX tensor. That path already preserved copy independence, but the host round-trip is unnecessary when the source and target EMLX devices are the same.

This PR adds EMLX.copy/1, backed by MLX's mlx::core::copy, and uses it for same-device EMLX copies. Cross-device copies still use EMLX.to_device/2.

It also makes same-device Nx.backend_transfer/3 a no-op, which matches transfer semantics more directly when the tensor is already on the requested backend/device.

I exposed EMLX.copy/1 because MLX already exposes copy and EMLX exposes similar low-level tensor primitives directly. If you prefer not to expand the public API in this PR, I can keep the copy helper internal to the backend.
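The routing described in this summary can be sketched as a small decision function. This is only an illustration: `DispatchSketch`, `copy_path/2`, `transfer_path/2`, and the returned atoms are hypothetical names, not EMLX's backend code, and the sketch models only which path is chosen, not the copy itself.

```elixir
defmodule DispatchSketch do
  # Hypothetical module: models the routing decision only.

  # Same source and target device: an on-device copy is enough
  # (the PR backs this case with EMLX.copy/1).
  def copy_path(device, device), do: :device_copy

  # Different devices: keep the cross-device path
  # (EMLX.to_device/2 in the PR).
  def copy_path(_from, _to), do: :to_device

  # Same device: the tensor is already where it was requested, so do nothing.
  def transfer_path(device, device), do: :noop

  # Different devices: whatever transfer path already existed (not detailed in the PR).
  def transfer_path(_from, _to), do: :existing_transfer_path
end

IO.inspect(DispatchSketch.copy_path(:cpu, :cpu))     # prints :device_copy
IO.inspect(DispatchSketch.copy_path(:cpu, :gpu))     # prints :to_device
IO.inspect(DispatchSketch.transfer_path(:gpu, :gpu)) # prints :noop
```

Repeating the variable name in the function head (`device, device`) makes the clause match only when both arguments are equal, which keeps the same-device case explicit.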

Local check

On an M4 Max, copying a {2048, 2048} f32 tensor from EMLX CPU to the same EMLX CPU device preserved values and independent refs while avoiding the host binary round-trip.

Rough local timing for:

Nx.backend_copy(t, {EMLX.Backend, device: :cpu})

branch          median
upstream/main   ~252 us
this branch     ~13 us

For copy followed by materialization with Nx.to_binary/1:

branch          median
upstream/main   ~264 us
this branch     ~40 us

This is a small local sanity check, not a full benchmark suite.
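For anyone wanting to reproduce a similar rough check, a median-timing helper could look like the sketch below. The PR does not say how its medians were measured, so this is an assumption: `TimingSketch` and `median_us/2` are hypothetical names, and the tensor setup in the trailing comments is a guess at a comparable input.

```elixir
defmodule TimingSketch do
  # Hypothetical helper, not part of Nx or EMLX.
  # Runs `fun` `runs` times and returns the median wall-clock time in microseconds.
  def median_us(fun, runs \\ 101) when is_function(fun, 0) and runs > 0 do
    times =
      for _ <- 1..runs do
        # :timer.tc/1 returns {elapsed_microseconds, result}
        {us, _result} = :timer.tc(fun)
        us
      end

    sorted = Enum.sort(times)
    mid = div(runs, 2)

    if rem(runs, 2) == 1 do
      Enum.at(sorted, mid)
    else
      # Even number of samples: average the two middle values
      (Enum.at(sorted, mid - 1) + Enum.at(sorted, mid)) / 2
    end
  end
end

# Assumed usage with Nx and EMLX installed (requires an MLX-capable machine):
#
#   backend = {EMLX.Backend, device: :cpu}
#   t = Nx.iota({2048, 2048}, type: :f32, backend: backend)
#   TimingSketch.median_us(fn -> Nx.backend_copy(t, backend) end)
#   TimingSketch.median_us(fn -> t |> Nx.backend_copy(backend) |> Nx.to_binary() end)
```

The second measurement forces materialization through Nx.to_binary/1, matching the second table above.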

Tests

  • mix test test/emlx/nx_test.exs --include metal

@polvalente left a comment

Collaborator

Very nice!

@polvalente merged commit 25f5340 into elixir-nx:main on Jun 18, 2026
6 checks passed
@hfiguera deleted the same-device-backend-copy-fast-path branch on June 18, 2026, 20:08