[Core] Layout 'F' support #610

## Overview

A comprehensive audit of F-contiguous (Fortran, column-major) layout support across the NumSharp API surface found 18 concrete gaps in 6 areas: axis reductions, element-wise rounding, manipulation ops, `.npy` file I/O, a pre-existing fancy-write bug, and one missing-function API gap. A test for every gap has landed as `[OpenBugs]` in `test/NumSharp.UnitTest/View/OrderSupport.OpenBugs.Tests.cs` (Sections 41–51; 86 tests total, of which the 18 failing ones are these gaps).

## Problem

NumSharp already has correct F-contig support in the three central element-wise dispatchers (`ExecuteBinaryOp`, `ExecuteUnaryOp`, `ExecuteComparisonOp`) and in creation/conversion APIs (Groups A–I of the earlier F-order pass). What's missing is F-preservation in operations that don't route through those dispatchers — axis reductions, `np.around`/`np.round_`, `np.squeeze`, and `.npy` I/O — plus a pre-existing indexing bug that surfaces under any layout.

These gaps cause NumPy code that depends on layout preservation to silently flip to C-contig in NumSharp, breaking round-trip scenarios (e.g., saving F-ordered weights and loading them back), interop with downstream code that checks `flags['F_CONTIGUOUS']`, and performance-sensitive code that picks algorithms based on layout.
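
To make "F-contig" versus "C-contig" concrete, the following is a minimal sketch of how element strides and the two contiguity flags relate to a shape. `LayoutSketch` and its methods are hypothetical names used only for illustration; they are not NumSharp APIs, and zero-size arrays are deliberately not handled.

```csharp
using System;

// Hypothetical helper for illustration only; not part of NumSharp.
public static class LayoutSketch
{
    // Element strides for a C-contiguous (row-major) layout: last axis varies fastest.
    public static long[] CStrides(long[] shape)
    {
        var strides = new long[shape.Length];
        long step = 1;
        for (int i = shape.Length - 1; i >= 0; i--)
        {
            strides[i] = step;
            step *= shape[i];
        }
        return strides;
    }

    // Element strides for an F-contiguous (column-major) layout: first axis varies fastest.
    public static long[] FStrides(long[] shape)
    {
        var strides = new long[shape.Length];
        long step = 1;
        for (int i = 0; i < shape.Length; i++)
        {
            strides[i] = step;
            step *= shape[i];
        }
        return strides;
    }

    // True when the strides match the requested layout.
    // Size-1 axes are skipped because their stride never affects addressing.
    public static bool IsContiguous(long[] shape, long[] strides, bool fortran)
    {
        long expected = 1;
        for (int k = 0; k < shape.Length; k++)
        {
            int i = fortran ? k : shape.Length - 1 - k;
            if (shape[i] == 1) continue;
            if (strides[i] != expected) return false;
            expected *= shape[i];
        }
        return true;
    }
}
```

For a `(2, 3, 4)` shape this gives C strides `[12, 4, 1]` and F strides `[1, 2, 6]`; an operation that rebuilds its result with the first set when the input had the second is the kind of silent flip described above.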

## Proposal

Fix each gap below; tick an item off once its fix has landed and the matching test section passes.

- [ ] **Axis reductions preserve F-contig** (Section 41) — 4 tests
  `sum`/`mean`/`nansum` on 3-D+ F-contig inputs flip to C-contig with either `keepdims=True` or `keepdims=False`. Root cause: the axis-reduction dispatchers write the result in linear C-order. The same post-hoc `.copy('F')` that the element-wise path applies via the `ShouldProduceFContigOutput` helper would work here. Confirmed affected: `np.sum`, `np.mean`, `np.nansum`; by structural similarity, `std`/`var`/`prod`/`min`/`max` are likely affected as well.
- [ ] **3-D+ reductions on all dtypes** (Sections 49, 50) — 2 tests
  The same gap is confirmed for Decimal (scalar-full path) and for 6-D double. This indicates the bug is independent of rank and dtype, so a dispatcher-level fix should cover all of these cases.
- [ ] **`np.around` / `np.round_` preserve F-contig** (Section 47) — 3 tests
  Element-wise rounding doesn't route through the central dispatcher's F-preservation helper. 2-D and 3-D both affected. Fix: route through the same helper or apply post-hoc `.copy('F')`.
- [ ] **`np.squeeze` preserves F-contig** (Section 45) — 1 test
  `squeeze(F(2,1,3))` returns `(2,3)` C-contig in NumSharp; NumPy returns F-contig. The squeeze rebuilds the shape without carrying F-strides through.
- [ ] **`np.repeat` supports `axis` parameter** (Section 45) — 1 test
  Not an F-order bug per se — `src/NumSharp.Core/Manipulation/np.repeat.cs` always `ravel()`s first; axis parameter is absent from the public API. Add the axis-preserving overload.
- [ ] **`.npy` `fortran_order` header flag** (Section 46) — 3 tests
  - `src/NumSharp.Core/APIs/np.save.cs:172` hardcodes `'fortran_order': False` regardless of the NDArray's layout. Should write `True` when source is F-contig.
  - `src/NumSharp.Core/APIs/np.load.cs:322` throws `Exception` on `'fortran_order': True`. Should handle it — read F-strided bytes into an F-contig NDArray.
  - Round-trip of F-contig through `np.save` + `np.load` should preserve both values and layout.
- [ ] **Fancy-write `SetIndicesND` assertion bug** (Section 51) — 3 tests
  Pre-existing, not F-order specific. `src/NumSharp.Core/Selection/NDArray.Indexing.Selection.Setter.cs:552` asserts `dstOffsets.size == values.size`. For 2-D+ targets, `dstOffsets` counts selected rows/indices while `values` counts elements. Scalar RHS, matching-shape array RHS, and F-contig variants all trigger the same crash. Fix should compare to broadcast-target element count, not `values.size`.
- [ ] **`np.sort` API** (Section 42) — 1 test
  Listed in Missing Functions. Only `argsort` exists. Implement with optional layout preservation.
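
For the `.npy` item above, the header flag would be derived from the array's layout instead of being hardcoded. The sketch below builds only the header dictionary text in the form NumPy writes it; `NpyHeaderSketch.BuildHeaderDict` is a hypothetical name, and the magic bytes, version, length prefix, and padding of the real file format are intentionally left out.

```csharp
using System;

// Hypothetical helper for illustration only; not the code in np.save.cs.
public static class NpyHeaderSketch
{
    // Builds the Python-literal header dictionary of a .npy file.
    // descr is the dtype string (for example "<f8" for little-endian double).
    public static string BuildHeaderDict(string descr, bool fortranOrder, long[] shape)
    {
        // Python tuple syntax: "()" for 0-D, "(3,)" for 1-D, "(2, 3)" otherwise.
        string dims = shape.Length == 1
            ? shape[0] + ","
            : string.Join(", ", shape);
        string order = fortranOrder ? "True" : "False";
        return "{'descr': '" + descr + "', 'fortran_order': " + order
             + ", 'shape': (" + dims + "), }";
    }
}
```

A caller would presumably pass `fortranOrder: true` only when the source is F-contiguous and not also C-contiguous, which is the rule NumPy follows when saving; the loader then needs the mirror-image step of interpreting the payload as column-major when the flag is `True`.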

## Evidence

All evidence is live in `test/NumSharp.UnitTest/View/OrderSupport.OpenBugs.Tests.cs`. Each failing test carries NumPy 2.x reference values derived from actual `numpy` runs and a precise `[OpenBugs]` comment pointing at the root cause.

Key representative reproductions:

```csharp
// Section 41 — Axis reduction 3-D F-contig (currently flips to C)
var f3 = np.empty(new Shape(2L, 3L, 4L), order: 'F', dtype: typeof(double));
var r = np.sum(f3, axis: 0, keepdims: true);
// NumPy: r.flags = {C=0, F=1}  |  NumSharp: {C=1, F=0}

// Section 46 — .npy save header (currently always writes C)
np.Save((Array)f, stream);  // where f.Shape.IsFContiguous == true
// NumPy: header contains "'fortran_order': True"
// NumSharp: header contains "'fortran_order': False" (hardcoded)

// Section 51 — Fancy-write assertion bug (unrelated to F-order)
var c = np.arange(12).reshape(3, 4).astype(typeof(int));
c[np.array(new[] { 0, 2 })] = 99;
// Debug.Assert fires: dstOffsets.size=2 vs values.size=8
```
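
The size mismatch in the Section 51 repro comes from counting two different things: selected rows versus written elements. As a rough sketch, and assuming the index array selects along axis 0 only, the number of elements a fancy write touches can be computed as below. `FancyWriteSketch` is a hypothetical name and this is not the logic in `SetIndicesND`.

```csharp
using System;

// Hypothetical helper for illustration only; not NumSharp's SetIndicesND.
public static class FancyWriteSketch
{
    // Number of elements written by target[indices] = value when `indices`
    // selects `selectedCount` entries along axis 0 of `targetShape`.
    public static long BroadcastTargetCount(long[] targetShape, long selectedCount)
    {
        long perSelection = 1;
        for (int i = 1; i < targetShape.Length; i++)
            perSelection *= targetShape[i];   // elements in each selected row or sub-array
        return selectedCount * perSelection;
    }
}
```

For the `(3, 4)` target with indices `{0, 2}` this yields 8, the element count reported in the assertion message, while the offsets count is 2. For a 1-D target the two counts coincide, which is consistent with the crash appearing only on 2-D+ targets.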

Full `[OpenBugs]` matrix:

| Section | Area | Tests | Failing |
|--------:|------|------:|--------:|
| 41 | Reductions keepdims | 17 | 4 |
| 42 | `np.sort` | 1 | 1 |
| 43 | matmul/dot/outer/convolve | 11 | 0 (parity confirmed) |
| 44 | Broadcasting | 5 | 0 (parity confirmed) |
| 45 | Manipulation ops | 20 | 2 |
| 46 | .npy I/O | 4 | 3 |
| 47 | around/round_ | 6 | 3 |
| 49 | Decimal scalar-full | 10 | 1 |
| 50 | Edge cases | 12 | 1 |
| 51 | Fancy-write repros | 5 | 3 |
| **Total** | | **86** | **18** |

CI-filter suite status at time of filing: 6502 passing / 0 failed with `[OpenBugs]` excluded.

## Scope / Non-goals

- **In scope:** post-hoc `.copy('F')` strategy for the axis reduction dispatcher and around/round_ (mirror of the existing element-wise fix); squeeze shape rebuild fix; `np.repeat` axis overload; `.npy` fortran_order header + loader; `SetIndicesND` assertion fix; `np.sort` implementation.
- **Out of scope:** Rewriting ILKernelGenerator (~21K lines) to accept F-strided output directly. Post-hoc copy is the correctness-first path; a direct-F-write kernel rewrite is a much larger performance-motivated task.
- **Out of scope:** `np.tile`, `np.flip`, `np.where` — already on the Missing Functions list.
- **Out of scope:** F-order support in `np.random.*` generators.
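
The post-hoc copy strategy named in scope amounts to computing the result in C-order as today and then reordering it into column-major storage. A minimal sketch of that reordering on a flat buffer follows; `PostHocCopySketch.ToFortranOrder` is a hypothetical name, it is fixed to `double` for brevity, and it does not reflect how NumSharp's `.copy('F')` is implemented.

```csharp
using System;

// Hypothetical helper for illustration only; not NumSharp's copy('F').
public static class PostHocCopySketch
{
    // Reorders a flat C-order buffer into a flat F-order buffer for the same shape.
    public static double[] ToFortranOrder(double[] cData, int[] shape)
    {
        int n = shape.Length;
        var result = new double[cData.Length];
        var index = new int[n];   // current multi-index, advanced in C order

        for (int c = 0; c < cData.Length; c++)
        {
            // F-order offset of the current multi-index: first axis has stride 1.
            int f = 0, stride = 1;
            for (int i = 0; i < n; i++)
            {
                f += index[i] * stride;
                stride *= shape[i];
            }
            result[f] = cData[c];

            // Advance the multi-index with the last axis varying fastest.
            for (int i = n - 1; i >= 0; i--)
            {
                if (++index[i] < shape[i]) break;
                index[i] = 0;
            }
        }
        return result;
    }
}
```

The extra pass costs one full copy of the result, which is the trade-off accepted here in exchange for leaving the kernel generator untouched.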

## Breaking changes

None expected. Every fix brings NumSharp's output closer to NumPy 2.x and matches the existing `[OpenBugs]` test expectations. Code that currently assumes `result.Shape.IsContiguous == true` after a reduction over an F-contig input would need to handle F-contig output too, but that assumption relies on behavior that already diverges from NumPy.

## Related issues

- #590 (10,000 tests / 100% coverage goal) — the 86 new tests contribute to that target.
- #597 (np.save/load rewrite) — Section 46 gaps should be addressed as part of that rewrite.
- #591 (ndim up to int.MaxValue) — Section 50 confirmed 6-D F-contig flag computation works correctly.
