Open
21 commits
f84bf33
Add coefficients support to DiscreteSumConstraint
Scienfitz Apr 29, 2026
13bfff3
Add simplex_coefficients to SubspaceDiscrete.from_simplex
Scienfitz Apr 29, 2026
fe89ab1
Add tests for DiscreteSumConstraint.coefficients and from_simplex sim…
Scienfitz Apr 29, 2026
09b8cb1
Switch DiscreteSumConstraint.get_invalid to column-by-column weighted…
Scienfitz Apr 29, 2026
87c6ccc
Use any() for non-negativity check in from_simplex
Scienfitz Apr 29, 2026
668adcf
Use pure numpy in from_simplex incremental construction loop
Scienfitz Apr 29, 2026
a33a999
Fix mypy error
Scienfitz Apr 29, 2026
c7ee2bd
Improve validation in `from_simplex`
Scienfitz May 7, 2026
4924247
Improve deserialization validation
Scienfitz May 7, 2026
cc1a509
Forbid all-zero coefficients in linear/sum constraints
Scienfitz Jun 3, 2026
23c36ed
Forbid individual zero coefficients in linear/sum constraints
Scienfitz Jun 3, 2026
968731c
Add tests for non-zero coefficient validation
Scienfitz Jun 3, 2026
a952e1e
Unify coefficient validation tests for both constraint types
Scienfitz Jun 10, 2026
9386330
Merge simplex coefficient tests into three-way comparison
Scienfitz Jun 10, 2026
6e696b2
Consolidate polars sum constraint tests
Scienfitz Jun 10, 2026
0e13272
Improve docstring
Scienfitz Jun 10, 2026
a818f6a
Use active settings dtype
Scienfitz Jul 2, 2026
9be7f37
Expand Raises section
Scienfitz Jul 2, 2026
12c2409
Use broadcasting to avoid materializing intermediate arrays in from_s…
Scienfitz Jul 2, 2026
a5a3ec4
Allow negative parameter values in from_simplex
Scienfitz Jul 2, 2026
65bb3d4
Update CHANGELOG
Scienfitz Jun 3, 2026
13 changes: 13 additions & 0 deletions CHANGELOG.md
@@ -5,12 +5,25 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

 ## [Unreleased]
+### Breaking Changes
+- All optional arguments of `SubspaceDiscrete.from_simplex` after `simplex_parameters`
+  are now keyword-only
+
+### Added
+- `coefficients` attribute for `DiscreteSumConstraint`, enabling weighted sums. Follows
+  the same pattern as `ContinuousLinearConstraint.coefficients`
+- `simplex_coefficients` keyword argument to `SubspaceDiscrete.from_simplex` for
+  weighted simplex sum constraints
+
 ### Changed
 - `BOTORCH` GP preset now includes `BetaPrior(2.5, 1.5)` for the task covariance
   kernel in multi-task scenarios, matching BoTorch's `MultiTaskGP` defaults introduced
   in version `0.18.0`
 - The `BOTORCH` GP preset now requires BoTorch `>= 0.18.0` and raises an
   `IncompatibilityError` if an older version is installed
+- `DiscreteSumConstraint`, `ContinuousLinearConstraint`, and
+  `SubspaceDiscrete.from_simplex` now forbid 0 as coefficients
+- `SubspaceDiscrete.from_simplex` no longer requires non-negative parameter values

 ## [0.15.0] - 2026-06-11
 ### Breaking Changes
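For illustration only, the weighted simplex condition behind the `simplex_coefficients` changelog entry can be sketched in plain Python. This assumes a combination is kept when the coefficient-weighted sum of its values does not exceed `max_sum`. The helper `weighted_simplex` and its `tolerance` argument are hypothetical names, and the brute-force enumeration is a stand-in, not how `from_simplex` actually builds the space.

```python
from itertools import product


def weighted_simplex(values, coefficients, max_sum, tolerance=1e-6):
    """Enumerate value combinations whose weighted sum stays within ``max_sum``.

    Hypothetical helper: brute-force filtering over the full Cartesian product.
    """
    return [
        combo
        for combo in product(*values)
        if sum(c * v for c, v in zip(coefficients, combo)) <= max_sum + tolerance
    ]


# Two parameters, each with the same three candidate values.
grid = [(0.0, 0.5, 1.0), (0.0, 0.5, 1.0)]

# All-ones coefficients reproduce the plain (unweighted) simplex x + y <= 1.
unweighted = weighted_simplex(grid, (1.0, 1.0), max_sum=1.0)

# Weighting the first parameter by 2 tightens the condition to 2x + y <= 1.
weighted = weighted_simplex(grid, (2.0, 1.0), max_sum=1.0)
```

With all-ones coefficients the sketch keeps six of the nine combinations; doubling the weight of the first parameter leaves four.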
2 changes: 2 additions & 0 deletions baybe/constraints/continuous.py
@@ -81,6 +81,8 @@ def _validate_coefficients(  # noqa: DOC101, DOC103
                 "The given 'coefficients' list must have one floating point entry for "
                 "each entry in 'parameters'."
             )
+        if any(c == 0.0 for c in coefficients):
+            raise ValueError("All entries in 'coefficients' must be non-zero.")

     @coefficients.default
     def _default_coefficients(self) -> tuple[float, ...]:
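A minimal sketch of the coefficient rules touched in this PR, written as a standalone function. `validate_coefficients` is a hypothetical helper, not BayBE API. The finiteness check mirrors the `finite_float` validator used in `discrete.py` and is only an assumption for the continuous case; its error message is made up for this sketch.

```python
import math
from collections.abc import Sequence


def validate_coefficients(
    parameters: Sequence[str], coefficients: Sequence[float]
) -> None:
    """Raise ``ValueError`` if the coefficients do not fit the parameters."""
    # One coefficient per parameter.
    if len(parameters) != len(coefficients):
        raise ValueError(
            "The given 'coefficients' list must have one floating point entry for "
            "each entry in 'parameters'."
        )
    # Assumed rule (hypothetical message): reject inf and nan.
    if not all(math.isfinite(c) for c in coefficients):
        raise ValueError("All entries in 'coefficients' must be finite.")
    # Zero would silently drop a parameter from the sum, so it is forbidden.
    if any(c == 0.0 for c in coefficients):
        raise ValueError("All entries in 'coefficients' must be non-zero.")


# Negative weights are allowed; only zero, non-finite, or miscounted ones fail.
validate_coefficients(["a", "b"], [1.0, -2.0])
```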
53 changes: 48 additions & 5 deletions baybe/constraints/discrete.py
@@ -3,15 +3,16 @@
 from __future__ import annotations

 import gc
-from collections.abc import Callable
+from collections.abc import Callable, Sequence
 from functools import reduce
 from typing import TYPE_CHECKING, Any, ClassVar, cast

+import cattrs
 import numpy as np
 import numpy.typing as npt
 import pandas as pd
 from attrs import define, field
-from attrs.validators import in_, min_len
+from attrs.validators import deep_iterable, in_, min_len
 from typing_extensions import override

 from baybe.constraints.base import CardinalityConstraint, DiscreteConstraint
@@ -26,6 +27,7 @@
     block_serialization_hook,
     converter,
 )
+from baybe.utils.validation import finite_float

 if TYPE_CHECKING:
     import polars as pl
@@ -77,7 +79,11 @@ def get_invalid_polars(self) -> pl.Expr:

 @define
 class DiscreteSumConstraint(DiscreteConstraint):
-    """Class for modelling sum constraints."""
+    """Class for modelling sum constraints.
+
+    The constraint evaluates whether the (optionally weighted) sum of the specified
+    parameters satisfies the given threshold condition.
+    """

     # IMPROVE: refactor `SumConstraint` and `ProdConstraint` to avoid code copying

@@ -94,9 +100,45 @@ class DiscreteSumConstraint(DiscreteConstraint):
     condition: ThresholdCondition = field()
     """The condition modeled by this constraint."""

+    coefficients: tuple[float, ...] = field(
+        converter=lambda x: cattrs.structure(x, tuple[float, ...]),
+        validator=deep_iterable(member_validator=finite_float),
+    )
+    """The coefficients for the weighted sum, one per entry in ``parameters``.
+
+    Defaults to all-ones, i.e. an unweighted sum."""
+
+    @coefficients.default
+    def _default_coefficients(self) -> tuple[float, ...]:
+        """Return equal weight coefficients as default."""
+        return (1.0,) * len(self.parameters)
+
+    @coefficients.validator
+    def _validate_coefficients(  # noqa: DOC101, DOC103
+        self, _: Any, coefficients: Sequence[float]
+    ) -> None:
+        """Validate the coefficients.
+
+        Raises:
+            ValueError: If the number of coefficients does not match the number of
+                parameters.
+        """
+        if len(self.parameters) != len(coefficients):
+            raise ValueError(
+                "The given 'coefficients' list must have one floating point entry for "
+                "each entry in 'parameters'."
+            )
+        if any(c == 0.0 for c in coefficients):
+            raise ValueError("All entries in 'coefficients' must be non-zero.")
+
     @override
     def _get_invalid(self, df: pd.DataFrame, /) -> pd.Index:
-        evaluate_df = df[self.parameters].sum(axis=1)
+        evaluate_df = pd.Series(
+            sum(
+                df[p].to_numpy() * c for p, c in zip(self.parameters, self.coefficients)
+            ),
+            index=df.index,
+        )
Comment on lines +136 to +141

Collaborator

Suggested change
-        evaluate_df = pd.Series(
-            sum(
-                df[p].to_numpy() * c for p, c in zip(self.parameters, self.coefficients)
-            ),
-            index=df.index,
-        )
+        evaluate_df = df[self.parameters] @ self.coefficients

@Scienfitz Scienfitz Jun 10, 2026

Collaborator Author

Did you see this comment in the PR description?
[image]

I am prioritizing avoiding copy operations here, at the cost of doing several computations instead of one big vectorized one. In the limit of few parameters (generally the case for us), this should be the better choice.
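The trade-off described in this comment can be sketched with plain NumPy. The column names, values, and weights below are made up for illustration, and the arrays stand in for DataFrame columns; the sketch only shows that the per-column accumulation and a single matrix-vector product give the same numbers, not which one is faster.

```python
import numpy as np

# Hypothetical data: each parameter column stored as its own 1-D array.
columns = {
    "x": np.array([0.0, 1.0, 2.0]),
    "y": np.array([1.0, 1.0, 1.0]),
    "z": np.array([4.0, 0.0, 2.0]),
}
parameters = ["x", "z"]    # subset of parameters entering the constraint
coefficients = (2.0, 0.5)  # one non-zero weight per parameter

# Column-by-column accumulation: one multiply-add per parameter, no 2-D block.
weighted_loop = sum(columns[p] * c for p, c in zip(parameters, coefficients))

# Single matrix-vector product on an explicitly assembled 2-D block.
block = np.column_stack([columns[p] for p in parameters])
weighted_matmul = block @ np.asarray(coefficients)
```

Both variables hold the same weighted sums; the second variant has to assemble `block` first, which is the extra allocation the comment is trying to avoid.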

Collaborator

So just to be sure I understand this right: you are saying that accessing all columns simultaneously could give a non-contiguous array and that the @ operation will thus result in copying it internally to perform the matrix product? If that's the case, I'm fine with the current version, but could you point me to some docs or similar so that I can read more about it?

Collaborator Author

AFAIK, `df[self.parameters]` makes a contiguous copy for the `@` operation whenever the selected parameters do not refer to contiguously stored data. Since we always have a parameter subset here, this will probably always happen, so I tried to avoid it.

`df[p]` refers to only one parameter and hence never makes a copy, so unless there is a huge number of parameters, this will be better IMO.

Collaborator

So I understand this is only speculative? It could indeed be the case, but it could just as well be the opposite. NumPy, for example, is very efficient at creating views based on slicing, with no copying involved. And since pandas uses NumPy under the hood, chances are that no copying is involved here, in which case the `@` syntax would clearly win in terms of readability AND efficiency (avoiding the loop). So if you decide to deviate from it, let's turn that speculation into certainty, or at least empirical evidence?

Collaborator Author

I think you're overestimating what even NumPy can do: if you index an array with e.g. indices 0, 3, 7, it will always make a copy, because that selection cannot be represented in NumPy's base-pointer/shape/stride model. In other words, it is not a slice. pandas just inherits that. I can't spot any speculation here.
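The NumPy part of this claim can be checked directly with `np.shares_memory`. The array below is made up for illustration; the sketch covers only NumPy indexing and says nothing about how pandas lays out or copies DataFrame columns internally, which may depend on the pandas version and settings.

```python
import numpy as np

a = np.arange(24.0).reshape(3, 8)

contiguous_slice = a[:, 0:3]    # basic slicing: expressible via strides
single_column = a[:, 3]         # single integer index: also expressible via strides
fancy_subset = a[:, [0, 3, 7]]  # integer-array ("fancy") indexing

# A view shares its buffer with the source array; a copy does not.
slice_is_view = np.shares_memory(a, contiguous_slice)
column_is_view = np.shares_memory(a, single_column)
subset_is_view = np.shares_memory(a, fancy_subset)
```

Here the slice and the single column are views, while the selection of columns 0, 3, and 7 is a fresh copy.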

Collaborator Author

@AdrianSosic in the absence of further conversation I will consider this resolved soon

         mask_bad = ~self.condition.evaluate(evaluate_df)

         return df.index[mask_bad]
@@ -105,7 +147,8 @@ def _get_invalid(self, df: pd.DataFrame, /) -> pd.Index:
     def get_invalid_polars(self) -> pl.Expr:
         from baybe._optional.polars import polars as pl

-        return self.condition.to_polars(pl.sum_horizontal(self.parameters)).not_()
+        weighted = [pl.col(p) * c for p, c in zip(self.parameters, self.coefficients)]
+        return self.condition.to_polars(pl.sum_horizontal(weighted)).not_()


 @define