Add ds.traj.filter() for outlier removal (speed and n-sigma sliding) #201
Implements filter(method, ...) on the traj accessor:

- 'speed': masks positions where speed to next point exceeds max_speed [m/s]. Uses per-step time differences (not a fixed scalar) so it works correctly on non-uniformly sampled drifter data.
- 'nsigma_sliding': masks lat/lon where either deviates more than nsigma standard deviations from a sliding-window local mean (same algorithm as sliding_filter_nsigma in readers/omb.py, verified by test).

Implementation follows the Traj1d/Traj2d delegation pattern:

- _nsigma_sliding_filter() helper in traj1d.py
- Traj1d.filter() contains the logic
- Traj2d.filter() delegates via trajectories().map(d.traj.to_1d().traj.filter(...).traj.to_2d())
- @abstractmethod stub with full docstring in Traj

Also:

- Add Dataset.traj.filter to api.rst
- Add filter demonstration section to examples/example_drifters.py
- Fix examples to unpack (paths, ax) from traj.plot() after plot.lines() was updated to return both values
- Add trajectories().map() delegation pattern to copilot-instructions

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
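To make the 'nsigma_sliding' idea concrete, here is a minimal NumPy sketch of a sliding-window n-sigma mask. It is not the code from this PR or from readers/omb.py: the function name `nsigma_sliding_mask`, the `half_width` parameter, and the choice to include the tested point in its own window statistics are all assumptions made for illustration.

```python
import numpy as np


def nsigma_sliding_mask(lon, lat, nsigma=3.0, half_width=10):
    """Return a boolean mask that is True where lon OR lat is an outlier.

    Hypothetical helper: a point is flagged when it deviates by more than
    `nsigma` standard deviations from the mean of the window
    [i - half_width, i + half_width] (window includes the point itself).
    """
    lon = np.asarray(lon, dtype=float)
    lat = np.asarray(lat, dtype=float)
    bad = np.zeros(lon.shape, dtype=bool)

    for values in (lon, lat):
        for i in range(values.size):
            if np.isnan(values[i]):
                continue  # already missing, nothing to flag
            lo = max(0, i - half_width)
            hi = min(values.size, i + half_width + 1)
            window = values[lo:hi]
            # NaN-aware statistics; the window holds at least values[i].
            mean = np.nanmean(window)
            std = np.nanstd(window)
            if abs(values[i] - mean) > nsigma * std:
                bad[i] = True
    return bad


# Smooth synthetic track with a single longitude spike at index 25.
lon = 13.0 + 0.001 * np.arange(50)
lat = 67.9 + 0.001 * np.arange(50)
lon[25] += 1.0

mask = nsigma_sliding_mask(lon, lat, nsigma=3.0, half_width=10)
lon_clean = np.where(mask, np.nan, lon)
lat_clean = np.where(mask, np.nan, lat)
```

Because the flagged point contributes to its own window, a single spike can reach at most sqrt(n - 1) standard deviations in a window of n points, so very short windows cannot trigger a 3-sigma threshold in this sketch.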
…PS positions

The previous implementation masked the *source* of a high-speed jump (the last valid position before a bad run) while leaving the entire run of outliers intact, because consecutive stuck positions have speed=0 between them.

New algorithm: walk through non-NaN positions in order, comparing each to the last *accepted* good position. When a position is too far away (speed > max_speed) it is masked and the last-good pointer is NOT advanced. This correctly clears entire consecutive runs of invalid positions (e.g. GPS no-fix sentinel values near (0,0)) in a single O(N) pass, without falsely masking the valid positions on either side.

Also fix IndexError when time variable is 1D (shared axis in Traj1d): times must not be indexed by trajectory index.

Verified on real SFY drifter data (tracks.nc): all (1.5e-7, 1.5e-7) no-fix positions are removed and the map extent stays around Moskenes.

New test: test_filter_speed_clears_stuck_gps_run verifies the full run is cleared and the bracketing valid positions are preserved.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
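The last-good-pointer walk described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's implementation: the names `haversine_m` and `speed_outlier_mask` are hypothetical, distances use a spherical-Earth haversine formula, the first non-NaN position is assumed to be valid, and a non-positive time step is treated as an outlier.

```python
import numpy as np

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, spherical approximation


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in metres between two points given in degrees."""
    lon1, lat1, lon2, lat2 = map(np.radians, (lon1, lat1, lon2, lat2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * np.arcsin(np.sqrt(a))


def speed_outlier_mask(lon, lat, time_s, max_speed):
    """Flag positions that cannot be reached from the last accepted one.

    Hypothetical helper: `time_s` holds per-sample times in seconds, so
    non-uniform sampling is handled by the per-step time difference.
    """
    lon = np.asarray(lon, dtype=float)
    lat = np.asarray(lat, dtype=float)
    time_s = np.asarray(time_s, dtype=float)
    bad = np.zeros(lon.shape, dtype=bool)

    last_good = None
    for i in range(lon.size):
        if np.isnan(lon[i]) or np.isnan(lat[i]):
            continue  # skip missing positions
        if last_good is None:
            last_good = i  # assumption: first valid fix is trusted
            continue
        dt = time_s[i] - time_s[last_good]
        dist = haversine_m(lon[last_good], lat[last_good], lon[i], lat[i])
        if dt <= 0 or dist / dt > max_speed:
            bad[i] = True  # mask it and do NOT advance last_good
        else:
            last_good = i
    return bad


# Two valid fixes, a stuck run of no-fix sentinels near (0, 0), one valid fix.
lon = [13.000, 13.001, 1.5e-7, 1.5e-7, 13.002]
lat = [67.900, 67.900, 1.5e-7, 1.5e-7, 67.900]
time_s = [0.0, 600.0, 1200.0, 1800.0, 2400.0]

mask = speed_outlier_mask(lon, lat, time_s, max_speed=10.0)
```

Since the two sentinel positions are compared against the last accepted fix rather than against each other, both are masked even though the speed between them is zero. The trade-off of trusting the first fix is that a bad first position would cause the following valid positions to be masked instead.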
Cartopy's _draw_gridliner builds a sgeom.Polygon from the map boundary path vertices. In Shapely 2.x the underlying LinearRing constructor requires the ring to be closed (first == last vertex), but older Cartopy builds do not ensure this, causing:

GEOSException: Points of LinearRing do not form a closed linestring

Wrap _draw_gridliner on each created gridliner instance so that this rendering error is silently swallowed rather than aborting figure saving.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
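The per-instance wrapping described here can be sketched as a small generic helper. The code below is an assumption-laden illustration rather than the PR's code: `swallow_errors` and `FakeGridliner` are hypothetical names, and a plain `ValueError` stands in for the GEOS error so the example runs without Cartopy or Shapely installed.

```python
import functools


def swallow_errors(obj, method_name, exc_types=(Exception,)):
    """Replace obj.<method_name> on this instance with a version that
    returns None instead of raising any of `exc_types`."""
    original = getattr(obj, method_name)  # bound method of this instance

    @functools.wraps(original)
    def wrapper(*args, **kwargs):
        try:
            return original(*args, **kwargs)
        except exc_types:
            return None  # silently skip the failed draw

    # Instance attribute shadows the class method for this object only.
    setattr(obj, method_name, wrapper)
    return obj


class FakeGridliner:
    """Stand-in for a gridliner whose draw step fails (hypothetical)."""

    def __init__(self):
        self.calls = 0

    def _draw_gridliner(self, *args, **kwargs):
        self.calls += 1
        raise ValueError("Points of LinearRing do not form a closed linestring")


gl = swallow_errors(FakeGridliner(), "_draw_gridliner", (ValueError,))
result = gl._draw_gridliner()  # no exception propagates
```

Patching the instance rather than the class keeps the workaround local to gridliners created by the plotting code. Passing a narrow exception tuple limits what is hidden, since swallowing errors also hides genuine rendering bugs.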
This filter method looks useful and convenient. However, it is a fairly specific method (though with a general name) that partly overlaps with more generic and flexible functionality such as In principle this filter method could be split into two:
Yes, I first tried to use ds.traj.speed() to remove jumps, but it didn't work very well. Then there's the nsigma filter from @jerabaul29's reader, which was only available to the OMB reader. There seem to be specific edge cases that are difficult to handle by combining the basic operations, and such combinations are also a bit difficult to remember how to use. So I think it would be a useful method for many, and more intuitive to remember than combining the other methods, especially since it is now more advanced than what is easily possible by combining the others. I agree that it could be split. Perhaps in the future it could also take a custom method or a list of indices as its filter, but I think it would be useful to keep both speed and nsigma as built-ins. It may be that nsigma should be the default. Right now I'm actually using both: ds.traj.filter().traj.filter('nsigma_sliding') and it works very well.