Avoid host round-trips for same-device backend copies by hfiguera · Pull Request #117 · elixir-nx/emlx

hfiguera · 2026-06-18T00:17:16Z

Summary

This adds a same-device fast path for Nx.backend_copy/3 when copying EMLX tensors back to EMLX.Backend.

Previously, same-device EMLX copies went through the generic binary path, which materializes tensor data on the host before rebuilding an EMLX tensor. That path already preserved copy independence, but the host round-trip was unnecessary when the source and target EMLX devices were the same.

This PR adds EMLX.copy/1, backed by MLX's mlx::core::copy, and uses it for same-device EMLX copies. Cross-device copies still use EMLX.to_device/2.

It also makes same-device Nx.backend_transfer/3 a no-op, which matches transfer semantics more directly when the tensor is already on the requested backend/device.
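The resulting behavior can be summarized as a small decision table. The sketch below is only an illustration of the rules described above, not the backend code in this PR: `CopyPlanSketch` and its return atoms are hypothetical names, and the cross-device transfer case is left as "whatever the existing path does" because the description does not cover it.

```elixir
defmodule CopyPlanSketch do
  @moduledoc """
  Hypothetical illustration of the dispatch rules in this PR description.
  It only names the path that would be taken; it does not touch any tensor.
  """

  # Nx.backend_copy/3, same device: copy on the device (EMLX.copy/1 in this PR).
  def plan(:copy, device, device), do: :device_copy

  # Nx.backend_copy/3, different devices: still goes through EMLX.to_device/2.
  def plan(:copy, _source, _target), do: :to_device

  # Nx.backend_transfer/3, same device: nothing to move, so it is a no-op.
  def plan(:transfer, device, device), do: :noop

  # Nx.backend_transfer/3, different devices: not specified in the description,
  # so this sketch just reports that the existing path is used.
  def plan(:transfer, _source, _target), do: :existing_path
end
```

Repeating `device` in a function head makes the clause match only when both arguments are equal, which keeps the same-device cases explicit.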

I exposed EMLX.copy/1 because MLX already exposes copy and EMLX exposes similar low-level tensor primitives directly. If you prefer not to expand the public API in this PR, I can keep the copy helper internal to the backend.

Local check

On an M4 Max, copying a {2048, 2048} f32 tensor from EMLX CPU to the same EMLX CPU device preserved values and independent refs while avoiding the host binary round-trip.

Rough local timing for:

Nx.backend_copy(t, {EMLX.Backend, device: :cpu})

| branch | median |
| --- | --- |
| `upstream/main` | ~252 µs |
| this branch | ~13 µs |

For copy followed by materialization with Nx.to_binary/1:

| branch | median |
| --- | --- |
| `upstream/main` | ~264 µs |
| this branch | ~40 µs |

This is a small local sanity check, not a full benchmark suite.
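For anyone who wants to repeat a check like this, here is a minimal sketch of a median-timing helper. It is not the script used for the numbers above: `TimingSketch`, `median_us/2`, and the default of 101 runs are hypothetical choices, and the helper relies only on `:timer.tc/1` from the Erlang standard library.

```elixir
defmodule TimingSketch do
  # Hypothetical helper; not part of Nx or EMLX.

  # Runs the zero-arity `fun` `runs` times and returns the median
  # wall-clock time in microseconds, as reported by :timer.tc/1.
  def median_us(fun, runs \\ 101) when is_function(fun, 0) and runs > 0 do
    samples =
      for _ <- 1..runs do
        {elapsed_us, _result} = :timer.tc(fun)
        elapsed_us
      end

    median(samples)
  end

  # Median of a non-empty list of numbers; averages the two middle
  # values when the list has an even length.
  def median(samples) do
    sorted = Enum.sort(samples)
    count = length(sorted)
    mid = div(count, 2)

    if rem(count, 2) == 1 do
      Enum.at(sorted, mid)
    else
      (Enum.at(sorted, mid - 1) + Enum.at(sorted, mid)) / 2
    end
  end
end
```

In a project that depends on Nx and EMLX, and with `t` already allocated on the EMLX CPU device, the first measurement would correspond to `TimingSketch.median_us(fn -> Nx.backend_copy(t, {EMLX.Backend, device: :cpu}) end)`. Piping the copy into `Nx.to_binary/1` inside the function would correspond to the second measurement, which includes materialization.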

Tests

mix test test/emlx/nx_test.exs --include metal

polvalente

Very nice!

hfiguera added 2 commits June 17, 2026 17:55

perf: avoid host round-trips for same-device copies

b8a4c32

test: cover backend copy and transfer semantics

e440a63

polvalente approved these changes Jun 18, 2026

polvalente merged commit 25f5340 into elixir-nx:main Jun 18, 2026
6 checks passed

hfiguera deleted the same-device-backend-copy-fast-path branch June 18, 2026 20:08
