Label multiscales' reference to the source.image is broken; should label a coordinate system instead #480
Description
Currently, label multiscales
- MUST be in a descendant of a labels group which MUST be a direct child of the multiscale image they label
- MUST be listed in the ome.labels key of the labels group
- SHOULD contain an ome.image-label (kebab 🙃) object which SHOULD contain an image key with a path to the Zarr node containing the data for the multiscale image they label
However, because ome.multiscales is an array, this relationship between a label multiscale and the multiscale image it labels is not meaningfully specified. This would be resolved by RFC-6 Flatten Multiscale Array but enthusiasm for that was low at the Zurich hackathon 2025.
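To make the ambiguity concrete, here is a minimal Python sketch using plain dicts in place of real Zarr attributes. The function name `candidate_sources`, the group paths, and the multiscale names are all made up for illustration, and the nesting of the path under `source.image` is assumed from the issue title. Resolving the path only gets us to a group, and that group's multiscales array may hold more than one entry.

```python
import posixpath


def candidate_sources(label_path, label_attrs, attrs_by_path):
    """Return every multiscales entry the label *might* refer to.

    attrs_by_path is a hypothetical stand-in for reading Zarr attributes:
    it maps a group path to that group's attributes dict.
    """
    # Relative path to the labelled image's Zarr node (assumed layout).
    rel = label_attrs["ome"]["image-label"]["source"]["image"]
    image_path = posixpath.normpath(posixpath.join(label_path, rel))
    # The path identifies a group, not an element of its multiscales array.
    return attrs_by_path[image_path]["ome"]["multiscales"]


# Hypothetical hierarchy: one image group holding two multiscales.
attrs_by_path = {
    "image": {"ome": {"multiscales": [{"name": "raw"}, {"name": "deconvolved"}]}},
}
label_attrs = {"ome": {"image-label": {"source": {"image": "../../"}}}}

candidates = candidate_sources("image/labels/cells", label_attrs, attrs_by_path)
print([m["name"] for m in candidates])  # ['raw', 'deconvolved']
```

Nothing in the label metadata says which of the two candidates is meant.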
There are two concerns here:
- rendering: if we don't know which multiscale a label refers to, we don't know how to overlay it. The spec requires that the label multiscale has the same number of scale levels as the multiscale image, but does not require that they downscale in the same way or starting from the same resolution.
- provenance: knowing how labels were generated would be nice (although this is currently not captured and defining an ontology of labeling methods would be a big task); labels could feasibly be generated based on multiple multiscale images or on only a subset of one (e.g. a single channel).
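The rendering concern can be sketched in Python as follows. The scale values are invented, and `scale_factors` is a hypothetical helper that only looks at per-dataset scale transforms. The point is that two multiscales can have the same number of levels and still sit on different grids.

```python
def scale_factors(multiscale):
    """Collect the 'scale' transform of each dataset in one multiscale."""
    factors = []
    for dataset in multiscale["datasets"]:
        for transform in dataset["coordinateTransformations"]:
            if transform["type"] == "scale":
                factors.append(transform["scale"])
    return factors


# Invented example: both multiscales have two levels.
image = {"datasets": [
    {"path": "0", "coordinateTransformations": [{"type": "scale", "scale": [1.0, 1.0, 1.0]}]},
    {"path": "1", "coordinateTransformations": [{"type": "scale", "scale": [2.0, 2.0, 2.0]}]},
]}
label = {"datasets": [
    {"path": "0", "coordinateTransformations": [{"type": "scale", "scale": [4.0, 4.0, 4.0]}]},
    {"path": "1", "coordinateTransformations": [{"type": "scale", "scale": [16.0, 16.0, 16.0]}]},
]}

same_level_count = len(image["datasets"]) == len(label["datasets"])
same_grid = scale_factors(image) == scale_factors(label)
print(same_level_count, same_grid)  # True False
```

A check on level count alone passes here, even though the label starts at a coarser resolution and downscales by a different factor.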
The rendering case should be at least partially covered by RFC-5 Coordinate Systems. It shouldn't be a multiscale image which is labelled; it should be a coordinate system. This would allow us to remove a number of existing restrictions:
- the label group's location, as it no longer belongs strictly to a single multiscale image
- (implicitly) matching the multiscale image's shape, base resolution, and downscaling scheme, as it becomes just another object in the scene (so we could label only a specific ROI, or at a coarse resolution for e.g. labelling cell bodies in an EM volume for connectomics)
However, we still want to be able to distinguish between a label multiscale and a multiscale image. We could achieve this with a simple flag in the Multiscale object:
```json
{
  "ome": {
    "multiscales": [
      {
        "datasets": [...],
        "coordinateTransformations": [...],
        ...
      },
      {
        "datasets": [...],
        "coordinateTransformations": [...],
        "isLabel": true,
        ...
      }
    ]
  }
}
```
This would provide a use case for multiscales being an array, and would allow us to additionally remove the restriction of having to list label multiscales in the ancestor labels Zarr group. Note that it moves the multiscale information about the label multiscale into a different metadata document.
Alternatively, we could rearrange the scene metadata/discovery procedure to list multiscale images and labels but I haven't thought that through yet. In the majority of cases I imagine that multiscale images and their associated label multiscales probably want to be converted into a shared coordinate system, which could then be converted into other coordinate systems listed in the scene.
This does not solve the lack of provenance information, but we don't meaningfully store that today so no loss.
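As a rough illustration of how a reader might consume the proposed isLabel flag, here is a short Python sketch. The function name `split_multiscales` and the example names are hypothetical, and treating a missing flag as "not a label" is an assumption rather than anything decided in this issue.

```python
def split_multiscales(attrs):
    """Partition a group's multiscales into images and labels."""
    images, labels = [], []
    for multiscale in attrs["ome"]["multiscales"]:
        # Assumption: an absent isLabel flag means a regular image.
        if multiscale.get("isLabel", False):
            labels.append(multiscale)
        else:
            images.append(multiscale)
    return images, labels


# Invented metadata with one image and one label multiscale.
attrs = {"ome": {"multiscales": [
    {"name": "raw", "datasets": []},
    {"name": "cells", "datasets": [], "isLabel": True},
]}}

images, labels = split_multiscales(attrs)
print([m["name"] for m in images], [m["name"] for m in labels])  # ['raw'] ['cells']
```

With this layout a viewer would not need to walk an ancestor labels group to discover label multiscales; it could read them from the same array as the images.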