math_bits

Header-only C++20 library for multiplying integers by a constant floating-point factor using integer bit-shifting — no FPU, no runtime division, compile-time unit tests.

Features

No floating-point at runtime — all FPU operations happen at compile time. The generated code is pure integer arithmetic.
Compile-time parameter generation — multiplier, bit-shift count, and integer scale factor are all derived at compile time from the floating-point input.
Overflow safe — the maximum multiplication factor is computed at compile time to guarantee no overflow for the given input range.
Configurable accuracy — max_error (in the options traits class) sets the allowed deviation from the true floating-point result. Defaults to ±1 LSB.
Compile-time unit tests — a static_assert runs a full test suite at compile time. A broken instantiation will not compile.
Header-only — single .h file, no dependencies beyond the C++ standard library.
Always-inlined hot path — mult() is unconditionally [[gnu::always_inline]], so the integer multiply-and-shift fuses into the caller with no extra flag.

Requirements

C++20 or later (std::bit_width is used for compile-time bit counting)
Any C++20 compiler (GCC, Clang, MSVC)
No hardware FPU required — designed for Cortex-M0/M0+ and other FPU-less targets

Usage

Basic

#include "math_bits.h"

// Multiply uint16_t values by 0.75, input range [0, 1000], default options
using scale75 = mult_bitshift<0.75, (uint16_t)1000, uint16_t, uint32_t>;

uint16_t result = scale75::mult(800);  // result ≈ 600

Operator overload

scale75 scaler;
uint16_t result = scaler * 800;  // same as scale75::mult(800)

Customizing options

The optional flags (max_error, deep_test, clamp_input) live in a traits-class struct. Derive from mult_bitshift_options and override only what you want — everything else stays at its default.

struct fast_safe : mult_bitshift_options {
    static constexpr bool deep_test   = false;   // skip deep compile-time sweep
    static constexpr bool clamp_input = true;    // clamp inputs > max_input_value
};
using scale75_safe = mult_bitshift<0.75, (uint16_t)1000, uint16_t, uint32_t, fast_safe>;

uint16_t result = scale75_safe::mult(2000); // returns mult(1000), not garbage

Option structs compose — derive from another option struct to extend it:

struct fast : mult_bitshift_options {
    static constexpr bool deep_test = false;
};
struct fast_with_clamp : fast {
    static constexpr bool clamp_input = true;
};

Backwards-compatible legacy form

The previous positional-argument signature is preserved as mult_bitshift_legacy. Existing call sites can keep working by renaming mult_bitshift → mult_bitshift_legacy:

// Same configuration as scale75_safe above, in the old positional form.
// The force_inlining parameter (position 6) is accepted for source
// compatibility but has no effect — mult() is now unconditionally
// always_inline.
using scale75_safe_legacy =
    mult_bitshift_legacy<0.75, (uint16_t)1000, uint16_t, uint32_t,
                         /*max_error*/1, /*force_inlining (unused)*/false,
                         /*deep_test*/false, /*clamp_input*/true>;

New code should prefer the traits-class form — the legacy form is kept only to avoid breaking existing instantiations. Note that the legacy form preserves the pre-refactor default deep_test=true, while the traits-class form's default is false; set it explicitly if the difference matters.

Template Parameters

mult_bitshift takes five template parameters: two required values, two type parameters with defaults, and a traits-class type carrying the optional flags.

Parameter	Default	Description
`multvalue`	—	Floating-point multiplier (`float`, `double`, or `long double`)
`max_input_value`	—	Maximum input value the multiplier must handle without overflow
`io_type`	`uint32_t`	Input and output integer type. Must be unsigned.
`calc_type`	`uint32_t`	Internal calculation type. Must be unsigned and at least as wide as `io_type`.
`Options`	`mult_bitshift_options`	Traits-class type carrying the optional flags below. Derive from `mult_bitshift_options` and override only the members you want.

`mult_bitshift_options` members

All members are static constexpr. Override only the ones you want by deriving a new struct:

Member	Type	Default	Description
`max_error`	`uint64_t`	`1`	Maximum allowed deviation from the true floating-point result (in LSB). Generalized to `uint64_t` so the struct doesn't depend on `io_type`; the class casts back to `io_type` internally. Must fit in `io_type`.
`deep_test`	`bool`	`false`	Default `false` runs a quick 100-sample smoke test at compile time — fast to build. Set `true` for the full sweep (up to 65535 inputs) when you want maximum assurance and can absorb the compile-time cost.
`clamp_input`	`bool`	`false`	If `true`, clamp inputs above `max_input_value` to `max_input_value` before multiplying — guarantees output stays within the `max_input_value * mult_factor` envelope. Adds ~5 instructions on the hot path. When `false`, the clamp disappears entirely (zero cost).

Legacy positional form: `mult_bitshift_legacy`

For backwards compatibility, the previous positional signature is preserved as a separate alias. The legacy form's defaults match the pre-refactor mult_bitshift defaults — notably deep_test=true (full 65535-sample sweep). The new traits-class form's deep_test default was deliberately changed to false for faster compiles; if you want the deep sweep, either use the legacy form or override deep_test=true in your options struct.

Position	Parameter	Default
1	`multvalue`	—
2	`max_input_value`	—
3	`io_type`	`uint32_t`
4	`calc_type`	`uint32_t`
5	`max_error`	`1`
6	`force_inlining` (no effect — kept for source compatibility)	`false`
7	`deep_test`	`true`
8	`clamp_input`	`false`

API Reference

Function	Description
`mult(input)`	Multiply `input` by the configured factor. Static — no instance needed.
`operator*(val)`	Instance operator overload — calls `mult(val)`.
`operator*(val, rhs)`	Friend operator overload — `val * scaler`.

Compile-time constants

Constant	Description
`mult_factor`	The original floating-point multiplier
`max_input_int`	The configured maximum input value
`bitShifts`	Number of bits shifted in the integer multiplication
`mult_factor_int`	The integer scale factor derived from `mult_factor`
`max_output_int`	Precomputed `mult(max_input_int)` — the largest value `mult()` will ever return
`max_error`	The configured `max_error` from `Options`, cast to `io_type`
`max_deviation`	Same as `max_error` — kept for backwards compatibility
`deep_test`	The configured `deep_test` flag from `Options`
`clamp_input`	The configured `clamp_input` flag from `Options`
`options`	The `Options` traits-class type itself, exposed for inspection

Design Notes

Why bit-shifting instead of floating-point? On Cortex-M0/M0+ there is no FPU. A floating-point multiply compiles to a software library call — slow, non-deterministic, and unsuitable for ISRs. By computing the scale factor at compile time and using a single integer multiply + shift at runtime, the hot path becomes 2–3 instructions with deterministic latency.

Why compile-time unit tests? The test suite verifies that every value in a representative sample of the input range produces a result within max_error of the true floating-point result. If the chosen max_error is too tight for the given multiplier and types, the build fails with a clear message — no separate test binary required. By default (deep_test=false) the sweep runs 100 samples — fast to compile and adequate for catching gross errors. Set deep_test=true for the full sweep (up to 65535 samples) when you want maximum assurance.

Why waste one extra type parameter for calc_type? The intermediate product input * mult_factor_int can overflow io_type. Using a wider calc_type (e.g. uint32_t when io_type is uint16_t) keeps the intermediate value safe and shifts back down to io_type at the end.

Performance (STM32G051, Cortex-M0+, `-Os`)

Verified by inspecting arm-none-eabi-g++ output for representative configurations:

mult() with calc_type ≤ uint32_t: hot path is muls + lsrs — one integer multiply, one bit-shift. A 1–2 instruction movs/lsls constant-load preamble brings the total to ~4 instructions on Cortex-M0+ (the constant load is an M0+ immediate-encoding limitation, not a library limitation).
mult() with calc_type = uint64_t: the multiply is widened, so the compiler emits a call to the integer runtime helper __aeabi_lmul — still no FPU, still deterministic, but no longer a single instruction. Prefer calc_type=uint32_t on Cortex-M0+ when your inputs and multiplier allow it.
No FPU instructions in either case — zero soft-float library calls at runtime.
Compile-time overhead: parameter generation and unit test run entirely at compile time — zero runtime cost.
clamp_input=true is fully inlined into the caller (~9 instructions on the common path) thanks to the unconditional [[gnu::always_inline]] on mult(). The clamp path uses an early return with a precomputed max_output_int.
[[gnu::flatten]] on user code is the strongest way to force transitive inlining at a specific call site — useful when calling mult() from a hot loop where you want every nested call (e.g. the operator overloads, or chained scalers) inlined alongside mult() itself.

License

This software is dual-licensed:

1. Open Source — GNU General Public License v3.0 (GPLv3): Free to use, modify, and distribute under the terms of the GNU General Public License version 3, as published by the Free Software Foundation. Note that GPLv3 is strong copyleft — derivative works and products that incorporate this software must also be released under GPLv3.

2. Commercial License: For use in proprietary or closed-source products that cannot or do not wish to comply with the GPLv3, a commercial license is available from PxQ Technologies — either as a written agreement, or via direct delivery by Erik Nørskov as part of a paid engagement (in which case the license is granted for that specific project scope only).

Each commercial license covers only the version of the software actually delivered into the licensee's project by the licensor. Later versions become covered only when likewise delivered as part of a paid engagement or written agreement, or when the licensee obtains a separate paid license for that later version. The licensee may not substitute or upgrade the software to any later version on their own initiative without such a license.

Contact: https://pxq.dk

Name		Name	Last commit message	Last commit date
Latest commit History 59 Commits
LICENSE		LICENSE
README.md		README.md
math_bits.h		math_bits.h

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

math_bits

Features

Requirements

Usage

Basic

Operator overload

Customizing options

Backwards-compatible legacy form

Template Parameters

`mult_bitshift_options` members

Legacy positional form: `mult_bitshift_legacy`

API Reference

Compile-time constants

Design Notes

Performance (STM32G051, Cortex-M0+, `-Os`)

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

math_bits

Features

Requirements

Usage

Basic

Operator overload

Customizing options

Backwards-compatible legacy form

Template Parameters

mult_bitshift_options members

Legacy positional form: mult_bitshift_legacy

API Reference

Compile-time constants

Design Notes

Performance (STM32G051, Cortex-M0+, -Os)

License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

`mult_bitshift_options` members

Legacy positional form: `mult_bitshift_legacy`

Performance (STM32G051, Cortex-M0+, `-Os`)

Packages