Add Phase 2 multi-horizon event study for dCDH estimator #294
…uille
Implements ROADMAP items 2a-2h: multi-horizon DID_l via per-group
DID_{g,l} building block (Eq 3 of dynamic paper), per-horizon
analytical SE, dynamic placebos DID^{pl}_l, normalized DID^n_l,
cost-benefit aggregate delta, sup-t simultaneous confidence bands,
plot_event_study() integration, and R DIDmultiplegtDYN parity tests
at multiple horizons. L_max parameter controls multi-horizon mode;
L_max=None preserves exact Phase 1 behavior.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
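As a reading aid for the DID_{g,l} building block named in this commit, here is a minimal sketch under simplifying assumptions that are not taken from the PR: binary absorbing treatment, every group untreated at baseline, a balanced panel, and equal group weights. The names `did_g_l` and `did_l` are hypothetical and are not the library's functions.

```python
import numpy as np

def did_g_l(Y, F, g, l):
    """Sketch of DID_{g,l} for switcher g at horizon l.

    Y: (groups, periods) outcome matrix.
    F: first-switch period index per group (0-based), np.inf for never-switchers.
    """
    ref = int(F[g]) - 1          # last period before g switches
    out = ref + l                # outcome period for horizon l
    if out >= Y.shape[1]:
        return np.nan            # horizon runs past the end of the panel
    controls = F > out           # groups that have not switched by period `out`
    if not controls.any():
        return np.nan            # empty control pool
    switcher_change = Y[g, out] - Y[g, ref]
    control_change = np.mean(Y[controls, out] - Y[controls, ref])
    return switcher_change - control_change

def did_l(Y, F, l):
    """Average DID_{g,l} over switchers for which it can be computed."""
    switchers = np.flatnonzero(np.isfinite(F))
    vals = np.array([did_g_l(Y, F, g, l) for g in switchers])
    vals = vals[~np.isnan(vals)]
    return vals.mean() if vals.size else np.nan

# Toy panel: group 0 switches at period index 2, groups 1 and 2 never switch.
Y = np.array([[1.0, 2.0, 5.0, 7.0],
              [1.0, 2.0, 3.0, 4.0],
              [0.0, 1.0, 2.0, 3.0]])
F = np.array([2.0, np.inf, np.inf])
effects = {l: did_l(Y, F, l) for l in (1, 2, 3)}
```

In this toy panel the third horizon is not estimable because it would need an outcome period beyond the end of the data.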
|
Overall Assessment
⛔ Blocker
Executive Summary
Methodology
Code Quality
Performance
No findings.
Maintainability
No additional findings beyond the public-surface contract issue above.
Tech Debt
No relevant
Security
No findings.
Documentation/Tests
Path to Approval
|
…onsistency
- Fix cost-benefit delta to use cumulative dose (sum_{k=0}^{l-1} |D_{g,F_g+k} - D_{g,1}|)
instead of one-period dose; binary weights now proportional to l * N_l
- Flip dynamic placebo sign to ref-minus-preperiod (Y_{ref} - Y_{backward}),
matching the Phase 1 convention
- Include l=1 in sup-t bootstrap calibration so bands are truly simultaneous
over all horizons 1..L_max
- Use per-group DID_{g,l} path for event_study_effects[1] when L_max >= 2,
making all horizons use a consistent estimand
- Label overall_att as "delta" in summary/to_dataframe when L_max > 1
- Add A11 control-availability warnings for multi-horizon empty control pools
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
/ai-review |
|
🔁 AI review rerun (requested by @igerber)
Head SHA:
Overall Assessment
⛔ Blocker
Executive Summary
Methodology
Re-review note: the prior sup-t calibration finding appears fixed. The bootstrap now calibrates over horizons
Code Quality
Performance
No findings.
Maintainability
No additional findings beyond the result-surface contract issue above.
Tech Debt
No mitigating entry in TODO.md:L51-L97 covers the P0/P1 issues above, so they remain unmitigated under the project's deferral rules.
Security
No findings.
Documentation/Tests
Path to Approval
|
…s, labels
- Flip placebo to paper convention (backward - ref) for R parity
- Revert cost-benefit delta to per-period dose (Lemma 4: w_l proportional to N_l for binary, not l * N_l)
- Add placebo control-availability warnings mirroring DID path
- Update __repr__ and joiners_leavers labels for delta when L_max > 1
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
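To make the reverted weighting concrete, the sketch below computes the cost-benefit aggregate for the binary case only, taking the commit message at its word that w_l is proportional to N_l. The function name `cost_benefit_delta` and the example numbers are illustrative assumptions, not the PR's code or data.

```python
import numpy as np

def cost_benefit_delta(did_by_horizon, n_switchers_by_horizon):
    """Binary-treatment sketch: delta = sum_l w_l * DID_l with w_l = N_l / sum(N_l)."""
    did = np.asarray(did_by_horizon, dtype=float)
    n = np.asarray(n_switchers_by_horizon, dtype=float)
    total = n.sum()
    if total <= 0:
        # No switchers at any horizon: the aggregate is not estimable.
        return np.nan, np.full_like(n, np.nan)
    weights = n / total
    return float(weights @ did), weights

# Illustrative inputs: three horizons with shrinking switcher counts.
delta, weights = cost_benefit_delta([1.0, 2.0, 3.0], [4, 3, 1])
```

Under the earlier cumulative-dose variant the weights would instead have scaled with l * N_l, which shifts mass toward later horizons.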
|
/ai-review |
|
🔁 AI review rerun (requested by @igerber)
Head SHA:
Overall Assessment
Executive Summary
Methodology
Code Quality
Performance
Maintainability
Tech Debt
Security
Documentation/Tests
Path to Approval
|
…mary gate
- Compute cost-benefit delta SE from per-horizon SEs via delta method: SE(delta) = sqrt(sum w_l^2 * SE(DID_l)^2), giving overall_att non-NaN inference when L_max > 1
- Document placebo SE NaN as intentional Phase 2 deferral in REGISTRY.md (placebo IF computation deferred; point estimates meaningful for visual pre-trends; bootstrap plumbing exists but not wired)
- Gate summary() bootstrap note on non-NaN overall inference
- Remove unused import
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
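The SE formula in this commit can be written out directly. The sketch below is a plain transcription of SE(delta) = sqrt(sum w_l^2 * SE(DID_l)^2); note that the formula as stated carries no cross-horizon covariance terms. The helper name `delta_se` and the inputs are hypothetical.

```python
import numpy as np

def delta_se(weights, horizon_ses):
    """SE(delta) = sqrt(sum_l w_l^2 * SE(DID_l)^2), as written in the commit message."""
    w = np.asarray(weights, dtype=float)
    se = np.asarray(horizon_ses, dtype=float)
    return float(np.sqrt(np.sum(w ** 2 * se ** 2)))

# Illustrative inputs: two equally weighted horizons.
se_delta = delta_se([0.5, 0.5], [0.3, 0.4])
```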
|
/ai-review |
|
🔁 AI review rerun (requested by @igerber)
Head SHA:
Overall Assessment
Executive Summary
Methodology
Affected methods:
Code Quality
Performance
Maintainability
Tech Debt
Security
Documentation/Tests
Path to Approval
|
…l Note
- Compute delta-method SE regardless of bootstrap (was gated on bootstrap_results != None, leaving analytical path with NaN)
- Generate one shared bootstrap weight matrix for all horizons so sup-t bands are a valid joint multiplier-bootstrap band
- Add REGISTRY Note for Phase 2 equal-cell weighting deviation
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
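Why the weight matrix must be shared is easiest to see in code. The following is a generic multiplier-bootstrap sketch, not the PR's implementation: it assumes a per-group influence-function matrix with one column per horizon and uses Rademacher multipliers, and the name `sup_t_critical_value` is hypothetical. Drawing one weight matrix and applying it to every column preserves the cross-horizon dependence that the maximum statistic needs.

```python
import numpy as np

def sup_t_critical_value(influence, n_boot=999, alpha=0.05, seed=0):
    """Sup-t critical value from a multiplier bootstrap with shared weights.

    influence: (groups, horizons) matrix of influence-function values (assumed input).
    """
    psi = np.asarray(influence, dtype=float)
    psi = psi - psi.mean(axis=0)                      # center each horizon's IF
    n_groups = psi.shape[0]
    se = psi.std(axis=0, ddof=1) / np.sqrt(n_groups)  # per-horizon analytical SE
    rng = np.random.default_rng(seed)
    # One weight matrix for all horizons: each row is a single bootstrap draw
    # that perturbs every horizon with the same group-level multipliers.
    shared_w = rng.choice([-1.0, 1.0], size=(n_boot, n_groups))
    boot = shared_w @ psi / n_groups                  # (n_boot, horizons)
    max_abs_t = np.abs(boot / se).max(axis=1)         # sup over horizons per draw
    return float(np.quantile(max_abs_t, 1.0 - alpha))

# Synthetic influence values, for illustration only.
rng = np.random.default_rng(1)
influence = rng.normal(size=(200, 3))
crit_joint = sup_t_critical_value(influence)          # band over all 3 horizons
crit_single = sup_t_critical_value(influence[:, :1])  # horizon 1 alone
```

The simultaneous band for horizon l is then DID_l plus or minus the joint critical value times SE(DID_l). With independent weights per horizon, the maximum would be taken over draws that never occurred together, so the resulting quantile would not describe a joint event.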
|
/ai-review |
|
🔁 AI review rerun (requested by @igerber)
Head SHA:
Overall Assessment
Executive Summary
Methodology
Affected methods: multi-horizon
Code Quality
Performance
Maintainability
Tech Debt
Security
Documentation/Tests
Path to Approval
|
…taining
When a switcher's control pool is empty at a given horizon (e.g., due to terminal missingness), exclude the group from N_l / N_pl_l rather than zero-retaining it. Zero-retention biases DID_l toward zero on ragged panels. Reserve zero-contribution for the IF path only (where it naturally has no effect). Same fix applied to _compute_multi_horizon_placebos.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
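A small arithmetic example shows the bias this commit removes. The sketch assumes per-group DID_{g,l} values are already computed and that NaN marks a switcher whose control pool is empty at the horizon; the function names and numbers are illustrative, not the library's.

```python
import numpy as np

def did_l_excluding(per_group):
    """Average over estimable groups only; returns (DID_l, N_l)."""
    vals = np.asarray(per_group, dtype=float)
    ok = ~np.isnan(vals)
    if not ok.any():
        return np.nan, 0
    return float(vals[ok].mean()), int(ok.sum())

def did_l_zero_retained(per_group):
    """Old behavior: non-estimable groups count as 0 and stay in the denominator."""
    vals = np.nan_to_num(np.asarray(per_group, dtype=float), nan=0.0)
    return float(vals.mean()), int(vals.size)

# Two switchers with effect 2.0, one with an empty control pool.
per_group = [2.0, 2.0, np.nan]
excluded, n_excluded = did_l_excluding(per_group)    # 2.0 with N_l = 2
retained, n_retained = did_l_zero_retained(per_group)  # 4/3 with N_l = 3
```

The zero-retained average is pulled a third of the way toward zero even though every estimable group shows the same effect.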
|
/ai-review |
|
🔁 AI review rerun (requested by @igerber)
Head SHA:
Overall Assessment
Executive Summary
Methodology
Affected methods: multi-horizon
Code Quality
Performance
Maintainability
Tech Debt
Security
Documentation/Tests
Path to Approval
|
…delta contracts
- IF/SE/bootstrap paths now use the same combined eligibility mask as
_compute_multi_horizon_dids (singleton-baseline + empty-control-pool
exclusions), so point estimate and inference agree on terminal-missing
panels
- Add REGISTRY Note documenting that event_study_effects[1] uses per-group
DID_{g,1} (cohort-based controls) when L_max >= 2, which may differ
from Phase 1 DID_M (period-based controls) on mixed-direction panels
- Add REGISTRY Note documenting that delta SE uses delta-method
(normal-theory) even when bootstrap is enabled, as an intentional
exception to the bootstrap-inference-surface contract
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
/ai-review |
|
🔁 AI review rerun (requested by @igerber)
Head SHA:
Overall Assessment
⛔ Blocker
Executive Summary
Methodology
Code Quality
Performance
Maintainability
Tech Debt
Security
Documentation/Tests
Path to Approval
|
- Delta SE now requires ALL positively-weighted horizons to have finite SE; if any has NaN, overall_se/p/CI are all NaN (NaN-consistent inference contract)
- IF/SE/bootstrap paths use full variance-eligible group set (singleton-baseline filter only), not the switcher-only did_eligible mask. Never-switchers and later-switching controls with non-zero IF mass from their control roles are now included, matching the Phase 1 IF contract.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
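The NaN-consistency rule in the first bullet can be sketched as a single gate in front of the inference triple. This is an illustration under assumptions: the name `overall_inference` is hypothetical, the p-value and interval use a normal reference distribution, and treating a zero SE as non-estimable is this sketch's choice rather than something stated in the PR.

```python
import math
import numpy as np

NAN_TRIPLE = (math.nan, math.nan, (math.nan, math.nan))

def overall_inference(delta, weights, horizon_ses, z_crit=1.959963984540054):
    """Return (se, p_value, ci); all NaN unless every positively weighted
    horizon has a finite SE."""
    w = np.asarray(weights, dtype=float)
    se_l = np.asarray(horizon_ses, dtype=float)
    used = w > 0                                  # zero-weight horizons are ignored
    if not np.isfinite(delta) or not np.all(np.isfinite(se_l[used])):
        return NAN_TRIPLE
    se = float(np.sqrt(np.sum((w[used] * se_l[used]) ** 2)))
    if se == 0.0:
        return NAN_TRIPLE                         # assumption: degenerate SE is non-estimable
    p_value = math.erfc(abs(delta / se) / math.sqrt(2.0))  # two-sided normal p-value
    ci = (delta - z_crit * se, delta + z_crit * se)
    return se, p_value, ci

# A NaN SE on a zero-weight horizon does not block inference...
ok_se, ok_p, ok_ci = overall_inference(1.0, [0.5, 0.5, 0.0], [0.3, 0.4, math.nan])
# ...but a NaN SE on a positively weighted horizon blanks everything.
bad_se, bad_p, bad_ci = overall_inference(1.0, [0.5, 0.5], [0.3, math.nan])
```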
|
/ai-review |
|
🔁 AI review rerun (requested by @igerber)
Head SHA:
Overall Assessment
✅ Looks good
Executive Summary
Methodology
No unmitigated P0/P1 findings. The previously reported IF-scope and
Code Quality
Performance
No findings.
Maintainability
No findings.
Tech Debt
Security
No findings.
Documentation/Tests
|
- Fix multi-horizon control-availability warnings to say "excluded from N_l" (not "zeroed and retained")
- Update README overall_att description to reflect delta when L_max > 1
- Update README placebo Note to reflect current state (not "Phase 2 will add")
- Update ROADMAP 2c and 2h status to reflect placebo SE and parity SE/placebo deferrals
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
/ai-review |
|
🔁 AI review rerun (requested by @igerber)
Head SHA:
Overall Assessment
⛔ Blocker
The re-review found one unmitigated P0 in the new Phase 2 dynamic placebo path and one unmitigated P1 in the new Phase 2 top-level
Executive Summary
Methodology
Affected methods:
Code Quality
No findings.
Performance
No findings.
Maintainability
No findings.
Tech Debt
Security
No findings.
Documentation/Tests
Path to Approval
|
…lback
- Add N_mat[ctrl, forward_idx] > 0 to placebo control mask so
terminally missing controls don't leak into DID^{pl}_l
- When delta is NaN (non-estimable), set all overall_* to NaN
instead of silently falling back to the Phase 1 DID_M values
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
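The first fix amounts to one extra term in a boolean mask. The sketch below is an interpretation, not the PR's code: it assumes N_mat holds the observation count per (group, period) cell, that a placebo control must be observed at the reference, backward, and forward periods, and that forward_idx is the period of the matching positive horizon. The function name `placebo_control_mask` is hypothetical.

```python
import numpy as np

def placebo_control_mask(N_mat, not_yet_switched, ref_idx, backward_idx, forward_idx):
    """Controls usable for a placebo: still untreated and observed in all
    three periods the comparison touches."""
    N_mat = np.asarray(N_mat)
    observed = (
        (N_mat[:, ref_idx] > 0)
        & (N_mat[:, backward_idx] > 0)
        & (N_mat[:, forward_idx] > 0)   # the term this commit adds
    )
    return np.asarray(not_yet_switched, dtype=bool) & observed

# Group 1 is a not-yet-switched control that goes missing in the last period.
N_mat = np.array([[1, 1, 1, 1],
                  [1, 1, 1, 0],
                  [1, 1, 1, 1]])
not_yet_switched = np.array([True, True, False])
mask = placebo_control_mask(N_mat, not_yet_switched,
                            ref_idx=1, backward_idx=0, forward_idx=3)
```

Without the forward-period check, group 1 would enter the placebo control pool even though it could not serve as a control for the corresponding DID_l.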
|
/ai-review |
|
🔁 AI review rerun (requested by @igerber)
Head SHA:
Overall Assessment
✅ Looks good
Executive Summary
Methodology
Code Quality
Performance
Maintainability
Tech Debt
Security
Documentation/Tests
|
- summary() now describes delta SE as delta-method (normal-theory) when L_max >= 2 with bootstrap, instead of claiming percentile
- Update README, choosing_estimator.rst, REGISTRY.md to reflect shipped Phase 2 state (was: "Phase 2 will add")
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
- DID_l for l = 1..L_max to ChaisemartinDHaultfoeuille via the L_max parameter (ROADMAP items 2a-2h)
- DID_{g,l} building block (Eq 3 of dynamic companion paper, NBER WP 29873)
- DID^{pl}_l with dual eligibility condition (Section 1.1)
- DID^n_l = DID_l / delta^D_l (Section 3.2)
- delta (Section 3.3, Lemma 4) - becomes overall_att when L_max > 1
- plot_event_study() integration with <50% switcher warning
- to_dataframe("event_study") and to_dataframe("normalized") output levels
- event_study_effects
- DIDmultiplegtDYN parity tests at multiple horizons (4 new scenarios)
- L_max=None (default) preserves exact Phase 1 behavior (142 tests, 0 failures)
Methodology references (required if estimator / math changes)
- ChaisemartinDHaultfoeuille (DCDH) - multi-horizon DID_l, normalized DID^n_l, cost-benefit delta, dynamic placebos DID^{pl}_l
Validation
- tests/test_chaisemartin_dhaultfoeuille.py (+18 Phase 2 tests), tests/test_chaisemartin_dhaultfoeuille_parity.py (+4 multi-horizon R parity tests)
- DIDmultiplegtDYN v2.3.3
- benchmarks/data/dcdh_dynr_golden_values.json
Security / privacy
Generated with Claude Code