reducers
Reduction functions + Rust(rs), shortname rd.
Rust-backed reduction functions for NumPy arrays - plain (numpy-like) and NaN-aware. The functions I implemented are those listed in the numba documentation.
The goals are to be
- much faster than numpy in many use cases,
- much faster than bottleneck in many use cases, and
- tuned for maximum performance in median and variance calculations, which are often bottlenecks in data processing pipelines.
Although reducers is ≳2x slower than bottleneck for an n=30 1-D array (dominated by Rust overhead), it becomes ≳10x faster than bottleneck when combining larger n-D arrays.
After Installation
Run the autotuner once on the machine where reducers will run:
python -m reducers.autotuner
It saves parallel-grain settings for that CPU and workload profile. Future `import reducers` calls apply those settings automatically. The built-in defaults are still valid; use `python -m reducers.autotuner --reset` to remove the saved tuning file and return to them.
First-Look API
import reducers as rd
# Plain: mean, min, max, ...
rd.mean(a) # include all values; NaN/inf propagate (np.mean)
# NaN-aware:
rd.nanmean(a) # skip NaN, keep inf (== np.nanmean behavior)
rd.nanmean(a, ignore_inf=True)  # also drop +/-inf (finite-only)
The same plain vs. `nan*` pattern applies to sum, min, max, minmax, median, std, var, percentile, quantile, and weighted average, plus the extras `lmedian` (lower value-selecting median) and `count_finite`.
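To illustrate what a "lower value-selecting median" means, the sketch below uses the standard-library `statistics.median_low` as a stand-in. It is not the reducers implementation, and treating `rd.lmedian` as equivalent on every edge case is an assumption.

```python
import statistics

values = [4, 1, 3, 2]  # even count: the two middle values are 2 and 3

# The ordinary median averages the two middle values and returns a float.
print(statistics.median(values))      # 2.5

# The lower median selects the smaller middle value, which is always
# an element of the input, so an integer input gives an integer result.
print(statistics.median_low(values))  # 2
```

Because the result is always one of the input values, a lower median can keep the input dtype instead of promoting to float.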
Axis reductions cover the two layouts optimized by the Rust kernels:
rd.nanmedian(stack, axis=0) # reduce a stack shaped (N, H, W)
rd.nanmean(values, axis=-1) # reduce contiguous trailing-axis slices
rd.nanpercentile(stack, [16, 50, 84], axis=0)
Maximum-performance Python API
The high-level API above is the default interface. For fixed hot loops where the caller already controls layout, import the low-level Python API as rdl:
import reducers.lowlevel as rdl
`rdl` calls the same Rust kernels while skipping the high-level Python normalization layer. With the default `copy=False`, arrays are passed directly to the extension; they must already have the dimensionality, C-contiguity, and supported dtype expected by the called kernel. Use `copy=True` only when you explicitly want `np.ascontiguousarray(...)` at that call site.
import numpy as np

buf = np.ascontiguousarray(a, dtype=np.float64)
rdl.mean_valid(buf)               # trusted values; no NaN/inf filtering
rdl.mean_skip_nonfinite(buf)      # skip NaN and +/-inf
rdl.var_mean_valid(buf, ddof=1)   # paired result from one Rust reducer
Weighted 1-D loops can call the fused weighted kernels without high-level return formatting. Choose the narrowest primitive that returns the output terms you need:
weighted_sum = rdl.weighted_sum_only_skip_nonfinite(buf, w)
weighted_sum, sum_weights = rdl.weighted_sum_and_weights_skip_nonfinite(buf, w)
weighted_sum, sum_weights, unweighted_sum = rdl.weighted_sum_skip_nonfinite(buf, w)
average = rdl.weighted_average_skip_nonfinite(buf, w)
The skip policy applies to values in `buf`; weights attached to retained values are used as-is. The bare weighted kernels expect contiguous same-length 1-D buffers; make any required copies before the call.
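The skip rule described above can be sketched in pure Python. The function name `weighted_scan_skip_nonfinite` is hypothetical and exists only for this illustration; it is a reference for the semantics (one pass, three accumulators, filtering on the value only), not the Rust kernel.

```python
import math

def weighted_scan_skip_nonfinite(values, weights):
    """Hypothetical reference: one pass that fills three accumulators."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    weighted_sum = 0.0
    sum_weights = 0.0
    unweighted_sum = 0.0
    for v, w in zip(values, weights):
        if not math.isfinite(v):   # the skip policy looks at the value only
            continue
        weighted_sum += v * w      # weight of a retained value is used as-is
        sum_weights += w
        unweighted_sum += v
    return weighted_sum, sum_weights, unweighted_sum

buf = [1.0, float("nan"), 3.0, float("inf")]
w = [2.0, 5.0, 1.0, 7.0]
ws, sw, us = weighted_scan_skip_nonfinite(buf, w)
print(ws, sw, us)  # 5.0 3.0 4.0
print(ws / sw)     # weighted average of the retained values
```

The weights 5.0 and 7.0 never enter any accumulator because their values were skipped, which is why the sum of weights is 3.0 rather than 15.0.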
For stack-style axis reductions, normalize to the 2-D layout the Rust axis kernel expects:
stack2 = np.ascontiguousarray(stack.reshape(stack.shape[0], -1))
median_image = rdl.reduce_axis0_valid(stack2, "median").reshape(stack.shape[1:])
For reusable per-output scratch buffers, in-place order statistics avoid an extra copy and may reorder the buffer:
scratch = np.ascontiguousarray(values, dtype=np.float64)
median = rdl.median_valid_in_place(scratch)
Some reducers can return intermediate quantities that are already computed by the same Rust scan:
std, mean = rd.nanstd(a, ddof=1, return_mean=True)
weighted_sum, sum_of_weights = rd.nansum(
    a, weights=w, return_sum_weights=True
)
weighted_sum, unweighted_sum = rd.nansum(
    a, weights=w, return_unweighted_sum=True
)
weighted_sum, unweighted_sum, sum_of_weights = rd.nansum(
    a, weights=w, return_unweighted_sum=True, return_sum_weights=True
)
You may use:
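try:
    import reducers as rd
    mean = rd.mean
except ImportError:
    # Fall back to NumPy when reducers is not installed.
    import numpy as np
    mean = np.mean
to replace numpy/bottleneck reductions with reducers throughout your code for the available reduction functions, while still falling back to numpy when reducers is not installed.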
Semantics
One additional parameter is ignore_inf for nan* functions:
| | NaN | +/-inf |
|---|---|---|
| `mean` / `median` / … (plain) | propagate | propagate (IEEE) |
| `nanmean` / `nanmedian` / … | skip (`np.nan*` parity) | keep |
| `nan*(..., ignore_inf=True)` | skip | skip (finite-only) |
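The three rows of the table can be reproduced with a small pure-Python reference. The helper `ref_mean` and its `skip_nan` flag are hypothetical names used only here, and returning NaN for an empty selection is a choice made for this sketch, not documented reducers behavior.

```python
import math

def ref_mean(values, skip_nan=False, ignore_inf=False):
    """Hypothetical reference for the three rows of the table."""
    kept = []
    for v in values:
        if skip_nan and math.isnan(v):
            continue
        if ignore_inf and math.isinf(v):
            continue
        kept.append(v)
    # Assumption of this sketch: an empty selection yields NaN.
    return sum(kept) / len(kept) if kept else math.nan

a = [1.0, math.nan, 3.0, math.inf]
print(ref_mean(a))                                  # nan  (plain: NaN propagates)
print(ref_mean(a, skip_nan=True))                   # inf  (NaN skipped, inf kept)
print(ref_mean(a, skip_nan=True, ignore_inf=True))  # 2.0  (finite-only)
```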
API shape (numpy-like subset)
mean(a, axis=None, *, validate=True)
nanmean(a, axis=None, *, ignore_inf=False, validate=True)
average(a, weights=None, axis=None, *, validate=True)
nanaverage(a, weights=None, axis=None, *, ignore_inf=False, validate=True)
sum(a, axis=None, *, weights=None, return_sum_weights=False,
    return_unweighted_sum=False, validate=True)
nansum(a, axis=None, *, weights=None, return_sum_weights=False,
       return_unweighted_sum=False, ignore_inf=False, validate=True)
var(a, axis=None, ddof=0, *, return_mean=False, validate=True)
std(a, axis=None, ddof=0, *, return_mean=False, validate=True)
minmax(a, axis=None, *, validate=True)
nanminmax(a, axis=None, *, ignore_inf=False, validate=True)
percentile(a, q, axis=None, *, validate=True)  # q in [0, 100], linear interp
Important notes:
- `axis` may be `None` (default, whole-array), `0`, or `-1` (identical to `a.ndim - 1`); other axes raise `NotImplementedError`. This keeps hidden transpose/copy costs out of the API and lets the Rust kernels specialize for the supported layouts.
- `validate=False` skips input prep for trusted hot loops where the caller already has a contiguous supported kernel dtype (`float32`, `float64`, bool, or a NumPy integer dtype). Integer and bool arrays are reduced directly without an up-front float copy; complex and object arrays are unsupported.
- Integer and bool `min`, `nanmin`, `max`, `nanmax`, and `lmedian` preserve dtype (not converted to float). `mean`, `sum`, `var`/`std`, `median`, and percentiles still return floating results.
- `minmax` is the fused plain endpoint reducer for `axis=None`; `nanminmax` is the fused NaN-skipping endpoint reducer for `axis=None`. Axis calls currently return two separate axis reductions.
- For `[nan]var` and `[nan]std`, `return_mean=True` returns the already-computed mean alongside the variance or standard deviation.
- For weighted `[nan]sum`, `return_sum_weights=True` and `return_unweighted_sum=True` expose quantities already available during the fused weighted scan; they require `weights=...`.
- Not a literal drop-in: no `out`, `keepdims`, `where`, `dtype`, or percentile `method` (linear only).
- Weighted averages support `weights=None`, weights with the same shape as `a`, and 1-D weights along supported axes. A zero sum of retained weights raises `ZeroDivisionError`.
See Performance for the kernel techniques and benchmarks.
Rust Crate Use
reducers is also a Rust crate. The kernel modules do not depend on PyO3 or NumPy unless the Python extension feature is enabled.
[dependencies]
reducers = "<version>"
use reducers::{reducers_1d, ScanPolicy};
let values = [1.0_f64, 2.0, f64::NAN, 4.0];
assert!(reducers_1d::mean(&values, ScanPolicy::AllValues).is_nan());
assert_eq!(reducers_1d::mean(&values, ScanPolicy::SkipNan), 7.0 / 3.0);
The main Rust entry points are:
| module | use case |
|---|---|
| `reducers_1d` | Whole-slice reducers such as mean, nanmean-style scans, order statistics, weighted averages, and integer/bool kernels. |
| `axis` | Normalized 2-D axis kernels used by the Python layer; useful when the caller already controls layout. |
| `finite::ScanPolicy` | Shared scan policy: plain all-values, trusted finite, skip-NaN, or finite-only. |
| `parallel` | Runtime thread and grain controls. |
Build local Rust API docs with:
cargo doc --no-default-features --open
The published Rust API reference is also available on docs.rs.