align Mahalanobis and DA PTP with manuscript; harmonize use_empirical_variance default; bump to 0.8.0#11
Merged
Conversation
Three statistical corrections + one default-value harmonization;
all called out in the manuscript-vs-implementation review.
* Mahalanobis denominator sums the two posterior covariances
(cov1 + cov2) instead of averaging them. The manuscript defines
D(a,b) = sqrt((mu_a - mu_b)^T (Sigma_a + Sigma_b)^(-1) (mu_a - mu_b)),
i.e. the variance of the difference of two independent posterior
estimators. The GP-only branch in differential_expression.py was
computing (cov1 + cov2) / 2 even though the sample-variance branch
at the same call site was already summing.
Effect: D shrinks by sqrt(2) in the GP-only regime. Relative
rankings unchanged; FDR re-calibrates against the null. The
resource-estimation comment is brought in line too.
* Differential-abundance posterior tail probability is one-sided:
PTP = Phi(-|z|), matching the manuscript prose ("probability that
the true density difference has the opposite sign to the estimated
change") and the one-sided formula. The implementation was
emitting 2 * Phi(-|z|) via a spurious + np.log(2).
Effect: PTPs are halved; neg_log10_fold_change_ptp increases by
log10(2). A previously-published cutoff of PTP < 1e-3 corresponded
to |z| >= 3.29 (two-sided) and now corresponds to |z| >= 3.09
(one-sided). Re-tune any ptp_threshold chosen against 0.7.0.
* use_empirical_variance defaults to False everywhere. The
recommended kompot.de() path already defaulted to False via
GPSettings; the deprecated wrappers (compute_differential_expression,
compute_smoothed_expression), DifferentialExpression.__init__,
ExpressionModel.__init__, smooth_expression(gp=None), and
smooth_config_template.yaml all defaulted to True. Now consistent
with the manuscript's "empirical variance is disabled by default"
statement. Opt in via use_empirical_variance=True / GPSettings.
Regression coverage in tests/test_audit_fixes.py:
* monkeypatches compute_mahalanobis_distances to capture the
combined-covariance kwarg from a controlled-kernel fit and
asserts equality with cov1 + cov2 (not the average);
* replicates the ln_ptp formula and asserts equality with the
closed-form Phi(-|z|) both directly and end-to-end through
DifferentialAbundance.fit().predict();
* introspects every public entry point's signature and asserts
the default is False; also parses both CLI YAML templates.
tests/test_empirical_variance.py::test_default_is_on was the only
pre-existing test pinning the old default; it has been flipped and
renamed test_default_is_off.
pyproject.toml and kompot/version.py bumped to 0.8.0; CHANGELOG
entry under [0.8.0] - 2026-05-28 documents the migration notes
above. Release tag is not cut by this commit.
settylab-dotto-bot Bot
added a commit
that referenced
this pull request
May 29, 2026
Revert: audit-fix merge (#11) — premature; code stays on dominik/audit-fixes-2026-05-28
This was referenced May 29, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Brings the implementation into agreement with the manuscript on
three statistical points and harmonizes a default-argument
inconsistency that had crept in across deprecated wrappers.
All three are intentional and minor-version-worthy. Mahalanobis
distances in the GP-only regime contract by
√2, DA PTPs arehalved, and the deprecated wrappers that exposed
use_empirical_varianceflip their default from
TruetoFalse. Relative rankings ofgenes are unchanged because the scale factors apply uniformly,
and FDR thresholds re-calibrate against the null distribution.
Absolute thresholds tuned against 0.7.0 need re-tuning.
Mahalanobis denominator — sum, not average
The manuscript defines the gene-wise distance as
D(a,b) = sqrt((μ_a − μ_b)^T (Σ_a + Σ_b)^(-1) (μ_a − μ_b)),but the GP-posterior branch in
kompot/differential/differential_expression.py:629wascomputing
combined_cov = (cov1 + cov2) / 2. This change dropsthe
/2so that the combined covariance equals the variance ofthe difference of two independent posterior estimators. The
sample-variance branch at the same call site was already
correctly summed, which is what made the inconsistency visible
to begin with.
Effect:
Dshrinks by a factor of√2(andD²by 2)everywhere. Same scale factor for real and null, so relative
rankings and FDR call-sets are essentially unchanged after
recalibration.
DA posterior tail probability — one-sided
The manuscript prose describes a one-sided test ("the
probability that the true density difference has the opposite
sign to the estimated change"), and the formula
PTP(x_i) = Φ(−|Δ(x_i)| / sqrt(σ_a² + σ_b²))is one-sided.The implementation at
kompot/differential/differential_abundance.py:607-610wasemitting
2 · Φ(−|z|)(two-sided). This change removes thespurious
+ np.log(2)soln_ptp = log Φ(−|z|).Effect: numeric PTPs are halved;
neg_log10_fold_change_ptpincreases by
log10(2) ≈ 0.301. A previously-published cutoffof
PTP < 1e-3corresponded to|z| ≥ 3.29(two-sided) andnow corresponds to
|z| ≥ 3.09(one-sided). Re-tune anyptp_thresholdchosen against 0.7.0.use_empirical_variance—FalseeverywhereThe manuscript states that empirical variance is disabled by
default. This was already true for the recommended
kompot.de()path viaGPSettings(use_empirical_variance=False)and for thede_config_template.yaml. Four other publicly-exposed entrypoints were inconsistent (default
True):kompot.differential.DifferentialExpression.__init__kompot.differential.expression_model.ExpressionModel.__init__kompot.anndata.differential_expression.compute_differential_expression(deprecated wrapper)
kompot.anndata.smooth.compute_smoothed_expression(deprecated wrapper)
kompot.anndata.smooth.smooth_expression(gp=Nonefallback)
kompot/cli/templates/smooth_config_template.yamlAll now default to
Falseto match the recommended path andthe manuscript. Docstrings and the resource-estimation comment
for combined-covariance memory are brought in line. Code that
relied on empirical variance must now opt in explicitly via
GPSettings(use_empirical_variance=True)/ the keyword arg.0.7.0 → 0.8.0pyproject.tomlandkompot/version.pybumped to0.8.0;new
CHANGELOG.mdsection under## [0.8.0] - 2026-05-28records the three behavior changes above with migration notes.
No release tag is cut by this PR.
Tests
tests/test_audit_fixes.pyadds three focused regressionsuites that pin the post-0.8.0 invariants:
compute_mahalanobis_distancesto capture thecovariancekwarg from a controlled-kernel fit (
cov1 = 2I,cov2 = 3I)and asserts equality with
5 * I. The pre-0.8.0 code wouldhave produced
2.5 * I.ln_ptpformula and asserts itequals the closed-form
Φ(−|z|), both directly and through afull
DifferentialAbundance.fit().predict()round-trip.use_empirical_variancedefault: introspects every publicentry point's signature and asserts the default is
False;also parses both CLI YAML templates.
tests/test_empirical_variance.py::TestAnnDataEmpiricalVariance::test_default_is_onwas the only pre-existing test pinning the old buggy default;
it has been flipped and renamed
test_default_is_off.The reference-output generator (
tests/generate_reference_outputs.py)writes to
/tmp/kompot_reference/and is developer scratch —not checked in, not part of CI — so no committed snapshots
need regeneration. The Mahalanobis-approach tests
(
tests/test_mahalanobis_approaches.py) compare in-memory vsdisk-backed implementations against each other rather than
against pinned numeric values, so they are insensitive to this
scale shift.
Full local pytest run (kompot 0.8.0 venv): 524 passed +
1 updated pinning test + new audit-fix suite (10 tests, all
green).