Step 2: vectorize NormalizationManager._determine_sorted_rows #85

Open
mschwoer wants to merge 4 commits into opt/step1-vectorize-protein-value-per-sample from opt/step2-vectorize-determine-sorted-rows

@mschwoer

Contributor

Replaces the per-row .loc MultiIndex lookup inside sorted()'s key with a numpy nan-count + stable argsort (reproduces Python sorted()'s tie-breaking).

Estimated speedup: estimate stage 123.9 s → 116.9 s (~1.06×), single-core HYE benchmark.

Bit-identical protein + ion output (max_abs_diff=0); suite passes.
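
For readers who want to see the idea in isolation, here is a minimal sketch of the two approaches side by side. It is not the code from this PR: the function names `sorted_rows_baseline` and `sorted_rows_vectorized`, the toy MultiIndex, and the column names are assumptions made for illustration, and the real `_determine_sorted_rows` may differ in signature and return type.

```python
import numpy as np
import pandas as pd


def sorted_rows_baseline(df: pd.DataFrame) -> list:
    """Hypothetical stand-in for the old path: one .loc lookup per row inside the sort key."""
    return sorted(df.index, key=lambda idx: int(df.loc[idx, :].isna().sum()))


def sorted_rows_vectorized(df: pd.DataFrame) -> list:
    """Hypothetical stand-in for the new path: one NaN count over the whole array."""
    nan_counts = np.isnan(df.to_numpy(dtype=float)).sum(axis=1)
    # kind="stable" keeps rows with equal NaN counts in their original order,
    # which is the same guarantee Python's sorted() gives.
    order = np.argsort(nan_counts, kind="stable")
    return list(df.index[order])


# Toy data (assumed): rows are (protein, ion) pairs, columns are samples.
index = pd.MultiIndex.from_tuples(
    [("P1", "ion_a"), ("P1", "ion_b"), ("P1", "ion_c"), ("P2", "ion_a")],
    names=["protein", "ion"],
)
df = pd.DataFrame(
    [
        [1.0, np.nan, np.nan],  # 2 NaNs
        [1.0, 2.0, 3.0],        # 0 NaNs
        [np.nan, 2.0, np.nan],  # 2 NaNs (ties with the first row)
        [1.0, np.nan, 3.0],     # 1 NaN
    ],
    index=index,
    columns=["s1", "s2", "s3"],
)

baseline = sorted_rows_baseline(df)
vectorized = sorted_rows_vectorized(df)
```

Both functions return the same list of index labels here, with the two tied rows kept in their input order. The vectorized version touches the index only once, after the ordering has been computed.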

🤖 Generated with Claude Code

mschwoer and others added 3 commits June 15, 2026 13:39
Add regression tests before vectorizing the row-sort: index labels ordered by
ascending NaN-count with stable tie-breaking, and order preserved when no NaNs
are present. Tests pass against the unmodified baseline.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
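
The two properties named in this commit message can be sketched as pytest-style checks. This is an illustration only: `order_by_nan_count` is a hypothetical stand-in for the method under test, and the frames below are invented, not the fixtures used in the repository.

```python
import numpy as np
import pandas as pd


def order_by_nan_count(df: pd.DataFrame) -> list:
    """Hypothetical stand-in: index labels ordered by ascending NaN count, ties stable."""
    nan_counts = df.isna().to_numpy().sum(axis=1)
    return list(df.index[np.argsort(nan_counts, kind="stable")])


def test_rows_ordered_by_ascending_nan_count_with_stable_ties():
    df = pd.DataFrame(
        {"s1": [np.nan, 1.0, np.nan, 1.0], "s2": [np.nan, 2.0, 2.0, np.nan]},
        index=["r0", "r1", "r2", "r3"],
    )
    # NaN counts per row: r0=2, r1=0, r2=1, r3=1; r2 must stay ahead of r3.
    assert order_by_nan_count(df) == ["r1", "r2", "r3", "r0"]


def test_order_preserved_without_nans():
    df = pd.DataFrame(
        {"s1": [3.0, 1.0, 2.0], "s2": [0.5, 0.7, 0.1]},
        index=["c", "a", "b"],
    )
    # Every row has zero NaNs, so a stable sort leaves the input order untouched.
    assert order_by_nan_count(df) == ["c", "a", "b"]
```

Writing such checks against the unmodified baseline first means a later rewrite has to reproduce the existing ordering, including the tie-breaking, rather than define a new one.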
Replace the per-row .loc MultiIndex lookup inside sorted()'s key with a single
numpy nan-count over df.to_numpy() plus a stable argsort. np.argsort(kind=
"stable") reproduces Python sorted()'s tie-breaking. Bit-identical protein +
ion output (max_abs_diff=0); unit + full pytest suite pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
mschwoer requested a review from ammarcsj June 15, 2026 14:23
The numpy vectorization of _determine_sorted_rows (step 2) left the njit
_get_num_nas_in_row helper on NormalizationManager with no callers. The
identically named helper on ProtvalCutter was already unused, because
ProtvalCutter computes its NaN counts inline. Remove both helpers, along with
the now-unused numba import in protein_intensity_estimation.py.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>