
legend_loc='on data' crashes with misleading ValueError for shapes, points, labels #624

@timtreis


Environment: spatialdata-plot 0.3.4.dev (main, commit 5cfedc7), Python 3.13


Problem

Passing legend_loc="on data" to show(), a placement mode familiar to scanpy users, crashes with a misleading ValueError for shapes, points, and labels alike:

ValueError: Grouper and axis must be same length

The crash occurs deep in scanpy's _add_categorical_legend (imported at utils.py:1723), which tries to compute category centroids via a groupby on embedding coordinates. Those coordinates don't exist in the spatialdata context, causing a dimension mismatch.

Nothing validates or rejects "on data" before it reaches this call. The _draw_channel_legend function (used for render_images channel legends) already handles "on data" gracefully by falling back to "right margin" with a warning, but the main categorical legend path for shapes/points/labels has no equivalent guard.


Minimal reproducible example

import matplotlib

matplotlib.use("Agg")  # headless backend; select it before importing pyplot
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import geopandas as gpd
import anndata as ad
import dask

dask.config.set({"dataframe.query-planning": False})
from shapely.geometry import box
import spatialdata as sd
from spatialdata.models import ShapesModel, TableModel
import spatialdata_plot

shapes = ShapesModel.parse(gpd.GeoDataFrame(
    {"geometry": [box(i, 0, i+1, 1) for i in range(3)], "radius": [0.5]*3},
    geometry="geometry"
))
obs = pd.DataFrame({
    "region": pd.Categorical(["s"]*3),
    "instance_id": [0, 1, 2],
    "ct": pd.Categorical(["A", "B", "C"]),
})
table = TableModel.parse(ad.AnnData(X=np.zeros((3, 1)), obs=obs),
                         region="s", region_key="region", instance_key="instance_id")
sdata = sd.SpatialData(shapes={"s": shapes}, tables={"table": table})

fig, ax = plt.subplots()
sdata.pl.render_shapes("s", color="ct").pl.show(ax=ax, legend_loc="on data")
# ValueError: Grouper and axis must be same length

Expected behaviour

A UserWarning and graceful fallback to "right margin":

UserWarning: legend_loc='on data' is not supported for shapes/points/labels
(requires scatter embedding coordinates). Falling back to 'right margin'.

This matches the existing behaviour in _draw_channel_legend for image channel legends.

Actual behaviour

ValueError: Grouper and axis must be same length

The traceback points deep into scanpy internals with no mention of legend_loc or what the user should do instead.


Fix sketch

In _decorate_axs at utils.py:1723, add a guard before calling _add_categorical_legend:

if legend_loc == "on data":
    logger.warning(
        "legend_loc='on data' is not supported for shapes/points/labels "
        "(no scatter embedding coordinates). Falling back to 'right margin'."
    )
    legend_loc = "right margin"

This mirrors the fix already implemented in _draw_channel_legend for the image path.
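
For comparison, here is a minimal standalone sketch of the same guard that raises a UserWarning, as described under "Expected behaviour", instead of going through logger.warning. The helper name resolve_legend_loc and its fallback parameter are hypothetical and are not part of spatialdata-plot. Whether the guard belongs in a helper or inline in _decorate_axs is left open.

```python
import warnings


# Hypothetical helper for illustration only; this name and signature are
# assumptions and do not exist in spatialdata-plot.
def resolve_legend_loc(legend_loc, fallback="right margin"):
    """Return a legend location that is safe for shapes/points/labels."""
    if legend_loc == "on data":
        # "on data" needs per-category centroids from scatter coordinates,
        # which are not available here, so warn and substitute the fallback.
        warnings.warn(
            "legend_loc='on data' is not supported for shapes/points/labels "
            "(requires scatter embedding coordinates). "
            f"Falling back to '{fallback}'.",
            UserWarning,
            stacklevel=2,
        )
        return fallback
    # Every other value is passed through unchanged.
    return legend_loc
```

Called before the categorical legend is drawn, resolve_legend_loc("on data") would return "right margin" and emit one warning, while any other value is returned as-is.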


Triage tier: Tier 3
