graphistry · lmeyerov · Jun 29, 2026 · Jun 27, 2026 · Jun 28, 2026 · Jun 28, 2026
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -10,6 +10,7 @@ This project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.htm
 
 ### Changed
 - **GFQL Cypher parse memoization (perf)**: `parse_cypher` now memoizes its result (LRU over the deterministic lark parse+transform → immutable frozen AST). Repeated identical Cypher queries skip the ~15 ms parse — the dominant per-call cost of small queries (~50% of a Cypher call at 100k rows) — making end-to-end query latency ~1.3–1.7× faster at small/interactive sizes across pandas/polars/cuDF. Safe to share the cached AST: every Cypher AST node is `@dataclass(frozen=True)` and `compile_cypher_query` does not mutate the parsed tree; validation errors still raise and are not cached.
+- **GFQL structured whole-entity returns (#1650)**: A terminal Cypher `RETURN a` (whole node/edge) now emits **structured flattened columns** (`a.id`, `a.val`, `a.kind`, ...) instead of a single Cypher display string (`({id: 51, val: 51, kind: 'a'})`). The per-field columns already exist before projection, so this is "stop collapsing" rather than "rebuild": measured ~2–6.4× faster on pandas and ~2.7–4.3× on cuDF for whole-entity returns (the win grows with row count, since the old text render is O(rows) while the flat form only re-labels existing columns). The result is directly usable without re-parsing a string and survives JSON/CSV/Parquet/Arrow serialization and `plot()`. The human-readable Cypher display string remains available on demand via the `render_entity_text(result, alias)` presentation helper. OPTIONAL-MATCH / `WITH`-reentry / grouping paths that synthesize null/absent entities or still consume a single-column entity value are unchanged. Behavior change: callers that previously read the rendered display string from a terminal `RETURN a` column now receive flattened `a.*` columns. Edge case: a whole entity with no fields to flatten (no id binding, no properties, and no type/label; in practice only an edge whose graph has no edge-id binding) falls back to the single Cypher-display-text column under the bare alias (the value is still correct, e.g. `[]`); nodes always carry their id field and always flatten.
 
 ### Performance
 - **GFQL temporal-detection dtype gate (#1650)**: `order_detect_temporal_mode` now short-circuits for numeric/bool/complex columns, which can never hold temporal *text*, instead of running an `astype(str)` + multi-regex `fullmatch` scan on every comparison. Eliminates spurious row-wise stringification in `where_rows`/comparison paths whose output never contains entity-text. Byte-identical results; measured `where_rows` speedups ~3.1× (pandas) and ~4.4–13.3× (cuDF, scaling with row count). Does not address whole-entity `RETURN a` text rendering, which is tracked separately.

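The dtype gate in the Performance entry above can be illustrated with a minimal pandas sketch. This is not the library's `order_detect_temporal_mode`: the function name `detect_temporal_mode`, the single `DATE_PATTERN` regex, and the `"date"` return value are hypothetical stand-ins chosen for illustration. Only the shape of the short-circuit is taken from the changelog text.

```python
from typing import Optional

import pandas as pd
from pandas.api.types import is_numeric_dtype

# Hypothetical single pattern; the changelog describes a multi-regex scan.
DATE_PATTERN = r"\d{4}-\d{2}-\d{2}"


def detect_temporal_mode(series: pd.Series) -> Optional[str]:
    """Return "date" if every value looks like YYYY-MM-DD, else None."""
    # Gate: is_numeric_dtype is True for int, float, complex, and bool dtypes.
    # Such columns cannot hold temporal text, so skip the astype(str) + regex scan.
    if is_numeric_dtype(series.dtype):
        return None
    matches = series.astype(str).str.fullmatch(DATE_PATTERN)
    return "date" if bool(matches.all()) else None


numeric_mode = detect_temporal_mode(pd.Series([1, 2, 3]))
text_mode = detect_temporal_mode(pd.Series(["2024-01-01", "2024-02-15"]))
```

Because the ungated scan would also return `None` for numeric input, the gate changes cost but not results, which is the "byte-identical" property the entry claims.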
diff --git a/docs/source/gfql/cypher.rst b/docs/source/gfql/cypher.rst
@@ -309,6 +309,33 @@ Row And Row-Pipeline Forms
   including connected suffix projections in the current supported row-binding
   subset.
 
+Whole-Entity RETURN Output Shape
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+A terminal ``RETURN`` of a whole node or relationship (``RETURN a`` rather than
+``RETURN a.prop``) emits **structured flattened columns**, one per field, named
+``<alias>.<field>``::
+
+    g.gfql("MATCH (a:Person) RETURN a")
+    # result._nodes columns: a.id, a.name, a.age, ...  (one column per field)
+
+This is directly usable (no string to re-parse) and survives JSON / CSV / Parquet /
+Arrow serialization and ``plot()``. To recover the human-readable Cypher display
+string (``(:Person {name: 'Alice'})``) on demand, use the presentation helper::
+
+    from graphistry.compute.gfql.cypher.result_postprocess import render_entity_text
+    text_series = render_entity_text(result, "a")            # nodes
+    text_series = render_entity_text(result, "r", table="edges")  # relationships
+
+Notes:
+
+- A property projection that repeats a flattened field (``RETURN a, a.val``) is
+  de-duplicated: you get a single ``a.val`` column, not two.
+- A whole entity with no fields to flatten (no id binding, no properties, no
+  type/label; in practice only an edge whose graph has no edge-id binding)
+  falls back to a single Cypher-display-text column under the bare alias.
+  Nodes always carry an id field and always flatten.
+
 Procedure And Multi-Branch Forms
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 

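The serialization claim in the documentation hunk above (flattened columns survive JSON and similar formats without re-parsing) can be shown with a small standard-library sketch. It uses plain dicts rather than a dataframe, and the `flatten` helper and the sample rows are hypothetical; only the `<alias>.<field>` naming comes from the docs.

```python
import json
from typing import Any, Dict, List

# Hypothetical entity rows, as plain dicts rather than a dataframe.
entities: List[Dict[str, Any]] = [
    {"id": 1, "name": "Alice", "age": 30},
    {"id": 2, "name": "Bob", "age": 41},
]


def flatten(alias: str, rows: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Prefix every field with the alias: {"id": 1} -> {"a.id": 1}."""
    return [{f"{alias}.{field}": value for field, value in row.items()} for row in rows]


flat_rows = flatten("a", entities)

# Each field is an ordinary typed value, so a JSON round trip is lossless
# and consumers never have to parse a Cypher display string.
round_tripped = json.loads(json.dumps(flat_rows))
```

A display string such as `(:Person {name: 'Alice'})` would instead round-trip as opaque text, leaving field extraction to the caller.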
diff --git a/graphistry/compute/chain.py b/graphistry/compute/chain.py
@@ -1097,13 +1097,10 @@ def _chain_impl(
             )
             if added_edge_index:
                 final_edges_df = final_edges_df.drop(columns=[g._edge])
-                # Rebuild from `self` to restore the ORIGINAL edge binding (`self._edge`,
-                # often None — `g` carries the internal edge-index binding instead), but
-                # explicitly carry the materialized node-id binding `g._node`: for an
-                # edges-only input `self._node is None`, so rebuilding from `self` alone
-                # drops it, leaving the endpoint-reconciliation concat below to synthesize
-                # a `None`-named column (corrupt result + a void-block concat crash on
-                # newer pandas).
+                # `self` restores the original edge binding, but pass the materialized
+                # `g._node` explicitly: edges-only input has `self._node is None`, which
+                # drops the node binding and makes the reconciliation concat synthesize a
+                # corrupt `None`-named column (plus a void-block concat crash on newer pandas).
                 g_out = self.nodes(final_nodes_df, g._node).edges(final_edges_df, edge=original_edge)
             else:
                 g_out = g.nodes(final_nodes_df).edges(final_edges_df)

diff --git a/graphistry/compute/gfql/cypher/result_postprocess.py b/graphistry/compute/gfql/cypher/result_postprocess.py
@@ -1,7 +1,7 @@
 from __future__ import annotations
 
 from dataclasses import replace
-from typing import Any, Dict, Literal, Optional, TypedDict, cast
+from typing import Any, Dict, List, Literal, Optional, Set, TypedDict, cast
 
 import pandas as pd
 
@@ -116,6 +116,105 @@ def _format_edge_entities(df: DataFrameT, projection: ResultProjectionPlan) -> S
     )
 
 
+def _label_flag_columns(df: DataFrameT) -> list[str]:
+    return [
+        str(col)
+        for col in df.columns
+        if str(col).startswith("label__")
+        and str(col).split("label__", 1)[1] not in {"<NA>", "None", "nan"}
+    ]
+
+
+def _flat_entity_field_names(
+    source_rows_df: DataFrameT, projection: ResultProjectionPlan, id_column: Optional[str]
+) -> list[str]:
+    """Ordered field names for a flattened whole-entity projection (#1650).
+
+    Mirrors the renderer's column selection (``node_property_columns`` /
+    ``edge_property_columns`` honor ``exclude_columns`` so sibling aliases and
+    engine-internal columns are not pulled in), then prepends the entity id and
+    (nodes) appends ``label__*`` flags / (edges) the ``type`` column so
+    :func:`render_entity_text` can losslessly reconstruct the Cypher form.
+    """
+    alias_col = projection.alias
+    if projection.table == "nodes":
+        prop_cols = node_property_columns(source_rows_df, alias_col, projection.exclude_columns)
+        # label sources for faithful reconstruction: label__* flags and/or the
+        # node ``type`` column (both consumed by _node_label_text).
+        extra = _label_flag_columns(source_rows_df)
+        if "type" in source_rows_df.columns:
+            extra = [*extra, "type"]
+    else:
+        prop_cols = edge_property_columns(source_rows_df, alias_col, projection.exclude_columns)
+        extra = ["type"] if "type" in source_rows_df.columns else []
+
+    fields: list[str] = []
+    for col in [id_column, *prop_cols, *extra]:
+        if col is not None and col in source_rows_df.columns and col not in fields:
+            fields.append(str(col))
+    return fields
+
+
+def _flat_entity_columns(
+    source_rows_df: DataFrameT,
+    projection: ResultProjectionPlan,
+    output_name: str,
+    id_column: Optional[str],
+) -> Dict[str, SeriesT]:
+    """Structured (flattened) whole-entity projection (issue #1650).
+
+    Emit one ``{output_name}.{field}`` column per aliased field instead of
+    collapsing the entity into a single Cypher display string. The per-field
+    columns already exist on ``source_rows_df`` (gathered by
+    ``_projection_alias_rows``), so this is "stop collapsing", not "rebuild":
+    near-free, lossless, and directly usable without re-parsing a string.
+    """
+    return {
+        f"{output_name}.{field}": cast(SeriesT, source_rows_df[field])
+        for field in _flat_entity_field_names(source_rows_df, projection, id_column)
+    }
+
+
+def render_entity_text(
+    result: Plottable, alias: str, *, table: Literal["nodes", "edges"] = "nodes"
+) -> SeriesT:
+    """Render a structured whole-entity projection back to Cypher display text.
+
+    Presentation helper for the conformance/TCK driver and callers who want the
+    human-readable form: rebuilds ``(:Label {..})`` / ``[:TYPE {..}]`` from the
+    flattened ``{alias}.{field}`` columns (the default since #1650). Rows are read
+    from ``result._nodes`` for both ``table`` values; ``table`` only selects the
+    node or edge renderer. The structured data path itself never pays this cost.
+    """
+    rows_df = cast(DataFrameT, result._nodes)
+    if rows_df is None:
+        raise ValueError("result has no _nodes frame to render")
+    prefix = f"{alias}."
+    field_cols = [col for col in rows_df.columns if str(col).startswith(prefix)]
+    if not field_cols:
+        raise ValueError(f"no flattened columns found for alias {alias!r}")
+    frame = cast(
+        DataFrameT,
+        rows_df[field_cols].rename(columns={col: str(col)[len(prefix):] for col in field_cols}),
+    )
+    # An OPTIONAL-MATCH miss flattens to a row whose fields are all null; such
+    # rows must render as null, not "()". Track presence (any field non-null).
+    present: Optional[SeriesT] = None
+    for field in frame.columns:
+        not_na = cast(SeriesT, frame[field].notna())
+        present = not_na if present is None else cast(SeriesT, present | not_na)
+    # _format_*_entities anchors length/null on a bare alias column; render every
+    # row, then null absent rows below.
+    frame = cast(DataFrameT, frame.assign(**{alias: True}))
+    projection = ResultProjectionPlan(alias=alias, table=table, columns=(), exclude_columns=())
+    rendered = _format_node_entities(frame, projection) if table == "nodes" else _format_edge_entities(frame, projection)
+    if present is not None and hasattr(rendered, "where"):
+        # Null absent rows. ``other=None`` fills NaN/None (valid pandas/cuDF);
+        # the pandas-stubs ``where`` overload is stricter than runtime here.
+        rendered = cast(SeriesT, rendered.where(present, None))  # type: ignore[call-overload]
+    return rendered
+
+
 def _project_property_column(
     rows_df: DataFrameT,
     *,
@@ -124,10 +223,8 @@ def _project_property_column(
     if column.source_name is None or column.source_name not in rows_df.columns:
         raise ValueError(f"projection source column not found: {column.source_name!r}")
     series = cast(SeriesT, rows_df[column.source_name])
-    # Temporal-constructor normalization only applies to STRING values; numeric/bool/
-    # complex columns can never hold temporal text, so skip the (otherwise spurious)
-    # ``astype(str)`` + detection scan and return the column as-is — byte-identical,
-    # since the scan returns None for these dtypes. Mirrors the #1650/#1651 gate.
+    # Temporal-constructor normalization only applies to strings; numeric/bool/complex
+    # can't hold temporal text, so skip the astype(str)+scan (byte-identical). #1650 gate.
     if is_non_textual_scalar_dtype(getattr(series, "dtype", None)):
         return series
     if hasattr(series, "astype") and hasattr(cast(SeriesT, series.astype(str)), "str"):
@@ -185,7 +282,17 @@ def _projection_alias_rows(
     return None
 
 
-def apply_result_projection(result: Plottable, projection: ResultProjectionPlan) -> Plottable:
+def apply_result_projection(
+    result: Plottable, projection: ResultProjectionPlan, *, structured: bool = True
+) -> Plottable:
+    """Project Cypher RETURN columns onto ``result._nodes``.
+
+    ``structured=True`` (#1650 default) emits whole-entity returns as flattened
+    ``{alias}.{field}`` columns. ``structured=False`` keeps the legacy single
+    Cypher-display-string column; the reentry / OPTIONAL-MATCH null-fill machinery
+    (which still assumes a single-column entity value) opts out via this flag until
+    it is unified onto the structured path.
+    """
     rows_df = cast(DataFrameT, getattr(result, "_nodes", None))
     if rows_df is None:
         return result
@@ -194,27 +301,54 @@ def apply_result_projection(result: Plottable, projection: ResultProjectionPlan)
         return result
     projected_data: Dict[str, SeriesT] = {}
     projected_entity_meta: Dict[str, WholeRowProjectionMeta] = {}
+    output_columns: list[str] = []
     for column in projection.columns:
         if column.kind == "whole_row":
             source_alias = column.source_name or projection.alias
             source_rows_df = _projection_alias_rows(rows_df, alias=source_alias)
             if source_rows_df is None or source_alias not in source_rows_df.columns:
                 raise ValueError(f"whole-row projection source alias not found: {source_alias!r}")
             source_projection = projection if source_alias == projection.alias else replace(projection, alias=source_alias)
-            projected_data[column.output_name] = (
-                _format_node_entities(source_rows_df, source_projection)
-                if projection.table == "nodes"
-                else _format_edge_entities(source_rows_df, source_projection)
-            )
             id_column = getattr(result, "_node" if source_projection.table == "nodes" else "_edge", None)
+            flat_columns = (
+                _flat_entity_columns(source_rows_df, source_projection, column.output_name, id_column)
+                if structured
+                else {}
+            )
+            if structured and flat_columns:
+                # Structured (flattened) emission (#1650): one column per field; text
+                # stays available via render_entity_text().
+                projected_data.update(flat_columns)
+                output_columns.extend(flat_columns.keys())
+            elif structured:
+                # No fields to flatten: the synthesized absent-entity row (OPTIONAL miss
+                # / reentry no-match, a single ``{alias: None}`` column) or a field-less
+                # real entity. Emit the single-column text form (renders to None / []).
+                projected_data[column.output_name] = (
+                    _format_node_entities(source_rows_df, source_projection)
+                    if source_projection.table == "nodes"
+                    else _format_edge_entities(source_rows_df, source_projection)
+                )
+                output_columns.append(column.output_name)
+            else:
+                projected_data[column.output_name] = (
+                    _format_node_entities(source_rows_df, source_projection)
+                    if source_projection.table == "nodes"
+                    else _format_edge_entities(source_rows_df, source_projection)
+                )
+                output_columns.append(column.output_name)
             if id_column is not None and id_column in source_rows_df.columns:
                 projected_entity_meta[column.output_name] = {
                     "table": source_projection.table,
                     "alias": source_projection.alias,
                     "id_column": id_column,
+                    # Snapshot the id Series: the bounded-reentry path recovers
+                    # carried node identities from this meta and must not alias the
+                    # live working frame (see #1356).
                     "ids": cast(SeriesT, source_rows_df[id_column]).copy(),
                 }
         else:
+            output_columns.append(column.output_name)
             if column.kind == "property":
                 property_rows_df = alias_rows_df
                 if (
@@ -226,14 +360,26 @@ def apply_result_projection(result: Plottable, projection: ResultProjectionPlan)
                 projected_data[column.output_name] = _project_property_column(property_rows_df, column=column)
             else:
                 projected_data[column.output_name] = _project_expr_column(result, rows_df, column=column)
+    # De-dup output columns (#1650): a flattened whole entity `a` (-> a.id, a.val, ...)
+    # collides by name with an explicit property projection (`RETURN a, a.val`). Both
+    # read the same source field (dotted aliases are rejected), so values are identical
+    # — keep first occurrence; a duplicate name would drop data on to_dict/serialization.
+    if len(set(output_columns)) != len(output_columns):
+        seen: Set[str] = set()
+        deduped: List[str] = []
+        for c in output_columns:
+            if c not in seen:
+                seen.add(c)
+                deduped.append(c)
+        output_columns = deduped
     projected_rows = alias_rows_df
     if rows_df.__class__.__module__.startswith("cudf") and any(isinstance(value, pd.Series) for value in projected_data.values()):
         projected_rows = cast(DataFrameT, cast(Any, alias_rows_df).to_pandas())
         projected_data = {
             key: cast(SeriesT, value.to_pandas() if hasattr(value, "to_pandas") else value)
             for key, value in projected_data.items()
         }
-    projected_nodes = cast(DataFrameT, projected_rows.assign(**projected_data)[[column.output_name for column in projection.columns]])
+    projected_nodes = cast(DataFrameT, projected_rows.assign(**projected_data)[output_columns])
 
     out = result.bind()
     out._nodes = projected_nodes

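Both `_flat_entity_field_names` and the output-column de-dup in `apply_result_projection` rely on the same idea: keep the first occurrence of each name and preserve order. A minimal sketch of that technique follows. `ordered_unique` is a hypothetical helper, not a function in the diff, and it omits the real code's check that each column exists on the source frame.

```python
from typing import Iterable, List, Optional


def ordered_unique(names: Iterable[Optional[str]]) -> List[str]:
    """Keep the first occurrence of each non-None name, preserving order."""
    # dict preserves insertion order (Python 3.7+), so fromkeys de-dups
    # while keeping the position of the first occurrence.
    return list(dict.fromkeys(name for name in names if name is not None))


# Field order for a node: id first, then properties, then label sources.
# The repeated "id" models an id column that is also listed as a property.
fields = ordered_unique(["id", "id", "val", "kind", "label__Person", "type"])

# Output columns for `RETURN a, a.val`: the flattened entity, followed by
# the explicit property projection that repeats `a.val`.
output_columns = ordered_unique([f"a.{f}" for f in ["id", "val", "kind"]] + ["a.val"])

# An edge with no edge-id binding contributes no id field (id_column is None).
edge_fields = ordered_unique([None, "weight", "type"])
```

Keeping the first occurrence is safe for the `RETURN a, a.val` case because, as the diff comment notes, both columns read the same source field and so hold identical values.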
diff --git a/graphistry/compute/gfql_unified.py b/graphistry/compute/gfql_unified.py
@@ -856,7 +856,15 @@ def _execute_compiled_query_chain_non_union(
             empty_result_row=compiled_query.empty_result_row,
         )
     if compiled_query.result_projection is not None:
-        result = apply_result_projection(result, compiled_query.result_projection)
+        # OPTIONAL null-fill / row-guard still consumes a single-column entity value,
+        # so those keep the legacy text form; plain terminal RETURN flattens (#1650).
+        structured_projection = (
+            compiled_query.optional_projection_row_guard is None
+            and compiled_query.optional_null_fill is None
+        )
+        result = apply_result_projection(
+            result, compiled_query.result_projection, structured=structured_projection
+        )
     if compiled_query.optional_projection_row_guard is not None:
         expected_rows = 1
         for base_chain in compiled_query.optional_projection_row_guard.base_chains:
@@ -892,6 +900,7 @@ def _execute_compiled_query_chain_non_union(
                 context,
             ),
             compiled_query.optional_null_fill.alignment_projection,
+            structured=False,
         )
         result = _apply_optional_null_fill(
             result,
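The gating rule in the `gfql_unified.py` hunk above reduces to a small predicate: flatten only when neither OPTIONAL mechanism is present. The sketch below restates that rule in isolation. `CompiledQuerySketch` and `use_structured_projection` are hypothetical names; only the two field names are taken from the diff.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CompiledQuerySketch:
    """Hypothetical stand-in for the two compiled-query fields the gate reads."""

    optional_projection_row_guard: Optional[object] = None
    optional_null_fill: Optional[object] = None


def use_structured_projection(query: CompiledQuerySketch) -> bool:
    # Flatten only when neither OPTIONAL mechanism will consume the
    # single-column entity value downstream.
    return (
        query.optional_projection_row_guard is None
        and query.optional_null_fill is None
    )


plain = use_structured_projection(CompiledQuerySketch())
guarded = use_structured_projection(
    CompiledQuerySketch(optional_projection_row_guard=object())
)
filled = use_structured_projection(CompiledQuerySketch(optional_null_fill=object()))
```

Under this rule a plain terminal `RETURN a` flattens, while any query carrying a row guard or a null-fill plan keeps the legacy text column.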