Skip to content

fix: handle binned data in max res finder#274

Open
jokasimr wants to merge 3 commits intomainfrom
fix-maximum-resolution
Open

fix: handle binned data in max res finder#274
jokasimr wants to merge 3 commits intomainfrom
fix-maximum-resolution

Conversation

@jokasimr
Copy link
Copy Markdown
Contributor

@jokasimr jokasimr commented Apr 13, 2026

Fixes #273
and allows binned data in the max resolution finder function.

@jokasimr jokasimr requested a review from YooSunYoung April 13, 2026 07:39
Copy link
Copy Markdown
Member

@YooSunYoung YooSunYoung left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me...! Just some minor questions/suggestions.

events.bins.coords[c] = sc.bins_like(
events, sc.midpoints(events.coords[c])
)
events = events.bins.concat(image_dims)
Copy link
Copy Markdown
Member

@YooSunYoung YooSunYoung Apr 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we have to concatenate it? Is it faster if we do?

Suggested change
events = events.bins.concat(image_dims)

Copy link
Copy Markdown
Contributor Author

@jokasimr jokasimr Apr 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is that if we don't concatenate the bins then when we bin it again with new bin-edges in x and y we might run out of memory because of scipp/scipp#3872.

Does that answer the question?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it does...!
So it'll be not necessary if the bug is fixed right...?

Neverthless, I'm okay with it. It shoudn't be blocked by the bug fix...

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes if the bug is fixed we should be fine without it.

Comment on lines 91 to 98
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sneaky typo bug......


events.bins.coords['x'] = sc.bins_like(events, sc.midpoints(events.coords['x']))
events.bins.coords['y'] = sc.bins_like(events, sc.midpoints(events.coords['y']))
events = events.bins.concat(['x', 'y'])
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
events = events.bins.concat(['x', 'y'])

Here too. I think putting the event coordinate should be enough.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same issue here, if we bin again in x and y when there are already x and y dimensions then we might run out of memory.

Comment on lines +85 to +86
del events.bins.coords['x']
del events.bins.coords['y']
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it because they are usually dropped by reduction workflows?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it's to make it look like typical detector event data in our workflows.

Typically we don't have pixel positions on the event data.

@jokasimr
Copy link
Copy Markdown
Contributor Author

Although, now that I'm thinking more about it, we don't typically have 'x' and 'y' coords on our detector data either 🤔 and this method currently requires that. There might be a better / simpler way to do it that does not require any coordinates on the input data.

@jokasimr
Copy link
Copy Markdown
Contributor Author

The old implementation had some issues.

  1. It required an x and an `` coordinate on the input data, something we typically don't have. We typically have a position, but it's not a dimension coordinate because it depends on both `x` and `y`, and that makes it difficult to work with in this context. The user could of course always add their own coordinates, but that is cumbersome.
  2. To be able to re-bin the event data on different pixel grids it had to add "aux" coordinates on the binned data array to keep track of what pixel each event belonged to originally. This is both quite inefficient for large event data, and it required either to modify the event data, or it would have required a copy, neither of which are not good options.
  3. If we bin into a number of pixels in each dimension that does not divide the original number of pixels then more of the "original" pixels will fall into some of the new pixels ("aliasing"). This isn't a big problem, because the effect is canceled in the normalization, but it can look ugly / surprising, so it's probably better to make sure that each "coarse" pixel contains the same number of "fine" pixels.

These issues are solved if we "fold" the original fine grid to coarser grids instead of re-binning the events, so that is the approach taken in the new implementation.

@YooSunYoung
Copy link
Copy Markdown
Member

YooSunYoung commented Apr 13, 2026

It required an x and an `` coordinate on the input data, something we typically don't have. We typically have a position, but it's not a dimension coordinate because it depends on both x and `y`, and that makes it difficult to work with in this context. The user could of course always add their own coordinates, but that is cumbersome.

One thing to be clear here:
We can't use the real position x and y of positions. (assuming that it is calculated from the NXtransformation chain.)
It should be x_pixel_offset and y_pixel_offset that we use as x and y.

It's usually easy to fold the data along x_pixel_offset and y_pixel_offset if the events are grouped by pixel ids but it's not guaranteed that we can do that.
We didn't think about it because we were quite sure that timepix and orca camera has the pixel ids (detector_numbers) in a way that we can easily fold into x and y.

But I think the new signature is much better...!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[essimaging] maximum_resolution_achievable should not hardcode "time" as the "third" dimension

2 participants