1-D Array Benchmarks

imcombiners.kernels exposes _1d functions for generic value vectors, as described in the 1-D tutorial. They are meant to be fast for typical 1-D workloads in astronomy and may replace NumPy or Bottleneck on operation-specific hot paths.

The _1d rejection APIs call direct Rust slice entry points. Those entry points and the image-stack kernels share the same per-output-vector decision logic, so 1-D calls avoid (N, 1, 1) packaging overhead without maintaining a separate rejection algorithm.

Run the benchmark locally with:

uv run --extra bench python benchmarks/benchmark_1d.py

The benchmark reports one row per (length, dtype, operation). np, bn, and imc are median elapsed microseconds for NumPy, Bottleneck, and imcombiners. np/imc and bn/imc are speedup factors for imc: values above 1 mean imc is faster than that comparison path, and values below 1 mean imc is slower. float64 is the more typical general NumPy workload; float32 remains important for image stacks and CCD pipelines.
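As a quick check on how to read the speedup columns, the sketch below recomputes them from the rounded timings of two rows in the table. The helper names speedup and format_row are hypothetical and are not part of benchmarks/benchmark_1d.py.

```python
def speedup(baseline_us: float, imc_us: float) -> float:
    """Return how many times faster imc is than the baseline (baseline / imc)."""
    return baseline_us / imc_us


def format_row(np_us: float, bn_us: float, imc_us: float) -> str:
    """Format timings with {:.2f} and speedups with {:.1f}x, as the table does."""
    return (
        f"{np_us:.2f} {bn_us:.2f} {imc_us:.2f} "
        f"{speedup(np_us, imc_us):.1f}x {speedup(bn_us, imc_us):.1f}x"
    )


# Rounded values copied from the 10^4 float64 var row.
print(format_row(26.82, 24.73, 3.52))   # 26.82 24.73 3.52 7.6x 7.0x

# A ratio below 1 means imc is slower: 10^4 float64 median against Bottleneck.
print(f"{speedup(14.51, 15.94):.1f}x")  # 0.9x
```

Ratios recomputed from the rounded timing columns can differ in the last digit from the published ones (for example 4.04 / 0.14 gives 28.9x against the reported 29.1x for 10^2 float64 mean), presumably because the benchmark divides the unrounded medians.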

Every generated benchmark vector contains NaNs. The compared functions are the NaN-aware variants: np.nan*, bn.nan*, and the _1d imcombiners kernels. Timing columns use {:.2f} microseconds. Speedup columns use {:.1f}x.
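The measurement described above can be approximated with NumPy alone. This is a minimal sketch, not the code in benchmarks/benchmark_1d.py: the helpers make_vector and median_elapsed_us, the 5% NaN fraction, and the repeat count are all assumptions made for illustration, and only the NumPy nan* path is timed.

```python
import time

import numpy as np


def make_vector(length, dtype, nan_fraction=0.05, seed=0):
    """Random vector with some entries set to NaN (the fraction is an assumption)."""
    rng = np.random.default_rng(seed)
    values = rng.standard_normal(length).astype(dtype)
    n_nan = max(1, int(length * nan_fraction))
    values[rng.choice(length, size=n_nan, replace=False)] = np.nan
    return values


def median_elapsed_us(func, data, repeats=51):
    """Median wall-clock time of func(data) in microseconds over several runs."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        func(data)
        samples.append((time.perf_counter() - start) * 1e6)
    return float(np.median(samples))


for dtype in (np.float64, np.float32):
    vec = make_vector(10_000, dtype)
    for name, func in (("mean", np.nanmean), ("median", np.nanmedian), ("var", np.nanvar)):
        print(f"10^4 {np.dtype(dtype).name} {name} {median_elapsed_us(func, vec):.2f}")
```

Using the median rather than the mean of the samples keeps a single slow run (a cache miss or a scheduler pause) from skewing the reported time.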

length  dtype    op       np (us)   bn (us)  imc (us)  np/imc  bn/imc
10^2    float64  mean        4.04      0.15      0.14   29.1x    1.1x
10^2    float64  median      6.27      0.20      0.31   20.1x    0.7x
10^2    float64  sum         1.69      0.15      0.14   12.0x    1.0x
10^2    float64  min         1.51      0.09      0.13   11.5x    0.7x
10^2    float64  max         1.49      0.09      0.13   11.7x    0.7x
10^2    float64  var         8.98      0.29      0.17   52.0x    1.7x
10^4    float64  mean       11.24     11.14      2.88    3.9x    3.9x
10^4    float64  median     30.60     14.51     15.94    1.9x    0.9x
10^4    float64  sum         6.81     12.53      2.99    2.3x    4.2x
10^4    float64  min         2.11      5.05      1.33    1.6x    3.8x
10^4    float64  max         2.21      5.17      1.39    1.6x    3.7x
10^4    float64  var        26.82     24.73      3.52    7.6x    7.0x
10^7    float64  mean     6316.96  11306.96   2848.58    2.2x    4.0x
10^7    float64  median  56071.50  52099.04  17486.00    3.2x    3.0x
10^7    float64  sum      4798.46  11374.21   2852.79    1.7x    4.0x
10^7    float64  min       840.62   4499.88    368.04    2.3x   12.2x
10^7    float64  max       856.88   4594.58    357.67    2.4x   12.8x
10^7    float64  var     16004.79  24882.92   3500.87    4.6x    7.1x
10^2    float32  mean        5.08      0.15      0.12   43.5x    1.2x
10^2    float32  median      7.20      0.21      0.29   24.6x    0.7x
10^2    float32  sum         1.70      0.15      0.12   14.3x    1.2x
10^2    float32  min         1.51      0.08      0.10   14.8x    0.8x
10^2    float32  max         1.47      0.08      0.11   14.0x    0.7x
10^2    float32  var         9.93      0.28      0.14   68.9x    2.0x
10^4    float32  mean       11.76     11.25      2.91    4.0x    3.9x
10^4    float32  median     31.05     14.63     15.84    2.0x    0.9x
10^4    float32  sum         5.52     11.36      2.86    1.9x    4.0x
10^4    float32  min         1.73      4.57      1.19    1.5x    3.8x
10^4    float32  max         1.68      4.47      1.19    1.4x    3.8x
10^4    float32  var        24.93     24.82      2.89    8.6x    8.6x
10^7    float32  mean     5466.42  12166.38   2999.17    1.8x    4.1x
10^7    float32  median  68780.25  67508.33  14898.29    4.6x    4.5x
10^7    float32  sum      3471.17  11381.50   2854.54    1.2x    4.0x
10^7    float32  min       410.62   4545.87    220.29    1.9x   20.6x
10^7    float32  max       449.17   4585.04    194.96    2.3x   23.5x
10^7    float32  var     13546.38  24994.04   2873.79    4.7x    8.7x

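In this benchmark imc is faster than NumPy in every row (1.2x to 68.9x) and faster than or on par with Bottleneck in most rows. The exceptions are median at 10^2 and 10^4 and min/max at 10^2, where Bottleneck is faster (bn/imc of 0.7x to 0.9x). A user who has already loaded imc therefore has less reason to import Bottleneck just for faster NaN-aware reductions.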

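The fast variance kernel (4.6x to 68.9x faster than NumPy and 1.7x to 8.7x faster than Bottleneck in this table) helps explain why imc is well suited to sigma-clip style rejection, which recomputes the spread of the surviving samples on every iteration.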
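To illustrate why the variance sits on the hot path, here is a minimal sigma-clipping sketch written with plain NumPy. It is not the imcombiners rejection algorithm: the function sigma_clip_1d, its parameters, and the mean/standard-deviation clipping rule are assumptions chosen for illustration.

```python
import numpy as np


def sigma_clip_1d(values, sigma=3.0, max_iters=5):
    """Return a boolean mask of samples kept after iterative sigma clipping."""
    values = np.asarray(values, dtype=np.float64)
    keep = ~np.isnan(values)  # NaNs are rejected from the start
    for _ in range(max_iters):
        if not keep.any():
            break
        # The mean and spread are recomputed from the survivors on every pass.
        center = values[keep].mean()
        spread = values[keep].std()
        deviation = np.abs(np.where(keep, values, center) - center)
        new_keep = keep & (deviation <= sigma * spread)
        if new_keep.sum() == keep.sum():
            break  # converged: nothing more was rejected
        keep = new_keep
    return keep


data = [10.0, 10.1, 9.9, 10.2, 9.8, 10.0, 50.0, np.nan]
mask = sigma_clip_1d(data, sigma=2.0)
print(mask.tolist())  # [True, True, True, True, True, True, False, False]
```

Each pass costs one mean and one variance over the surviving samples, so the per-call cost of the variance reduction largely determines the cost of the whole rejection loop.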