1-D Array Benchmarks
imcombiners.kernels exposes _1d functions for generic value vectors, as described in the 1-D tutorial. They are meant to be fast for typical 1-D workloads in astronomy and may replace NumPy or Bottleneck on operation-specific hot paths.
The _1d rejection APIs call direct Rust slice entry points. Those entry points and the image-stack kernels share the same per-output-vector decision logic, so 1-D calls avoid (N, 1, 1) packaging overhead without maintaining a separate rejection algorithm.
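To make the (N, 1, 1) packaging concrete, here is a minimal NumPy-only sketch. It uses np.nanmedian as a stand-in reduction and does not call imcombiners; it only shows that treating a vector as a stack of N one-pixel frames gives the same answer as reducing the vector directly, at the cost of an extra reshape and unpack step.

```python
import numpy as np

# A 1-D value vector with a few NaNs, standing in for a generic workload.
values = np.array([1.0, 2.0, np.nan, 4.0, 100.0, np.nan, 3.0])

# Direct 1-D reduction: no reshaping needed.
direct = np.nanmedian(values)

# Image-stack style: package the vector as N frames of a 1x1 image,
# reduce along the stack axis, then unpack the single output pixel.
stack = values.reshape(-1, 1, 1)              # shape (N, 1, 1)
packaged = np.nanmedian(stack, axis=0)[0, 0]

print(direct, packaged)
```

Both paths reduce the same N samples for a single output value, which is the situation the direct slice entry points are described as handling without the packaging step.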
Run the benchmark locally with:
    uv run --extra bench python benchmarks/benchmark_1d.py

The benchmark reports one row per (length, dtype, operation). np, bn, and imc are median elapsed microseconds for NumPy, Bottleneck, and imcombiners. np/imc and bn/imc are speedup factors for imc: values above 1 mean imc is faster than that comparison path, and values below 1 mean imc is slower. float64 is the more typical general NumPy workload; float32 remains important for image stacks and CCD pipelines.
Every generated benchmark vector contains NaNs. The compared functions are the NaN-aware variants: np.nan*, bn.nan*, and the _1d imcombiners kernels. Timing columns use {:.2f} microseconds. Speedup columns use {:.1f}x.
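As a rough illustration of how such a row could be produced, the sketch below times two NaN-aware callables on a NaN-containing vector and formats the result with the {:.2f} and {:.1f}x conventions described above. It is not the code in benchmarks/benchmark_1d.py: the helper names (make_vector, median_us, format_row, masked_mean), the 1% NaN fraction, and the repeat count are all assumptions, and masked_mean is only a stand-in for a candidate kernel.

```python
import statistics
import timeit

import numpy as np


def make_vector(length, dtype, nan_fraction=0.01, seed=0):
    """Random vector with a fixed fraction of NaNs (the fraction is an assumption)."""
    rng = np.random.default_rng(seed)
    data = rng.normal(size=length).astype(dtype)
    n_nan = max(1, int(length * nan_fraction))
    nan_idx = rng.choice(length, size=n_nan, replace=False)
    data[nan_idx] = np.nan
    return data


def median_us(func, data, repeats=21):
    """Median elapsed time of func(data) in microseconds over `repeats` single calls."""
    timer = timeit.Timer(lambda: func(data))
    samples = timer.repeat(repeat=repeats, number=1)  # seconds per call
    return statistics.median(samples) * 1e6


def format_row(length_label, dtype, op, baseline_us, candidate_us):
    """One table row: timings as {:.2f}, speedup (baseline / candidate) as {:.1f}x."""
    speedup = baseline_us / candidate_us
    return (f"| {length_label} | {np.dtype(dtype).name} | {op} "
            f"| {baseline_us:.2f} | {candidate_us:.2f} | {speedup:.1f}x |")


def masked_mean(v):
    """Hand-written NaN-aware mean, standing in for a candidate kernel."""
    return v[~np.isnan(v)].mean()


vec = make_vector(10_000, np.float64)
t_baseline = median_us(np.nanmean, vec)
t_candidate = median_us(masked_mean, vec)
print(format_row("10^4", np.float64, "mean", t_baseline, t_candidate))
```

Using the median of repeated single calls keeps occasional slow runs from skewing a row. The speedup is always baseline divided by candidate, so a value above 1 means the candidate is faster.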
| length | dtype | op | np (us) | bn (us) | imc (us) | np/imc | bn/imc |
|---|---|---|---|---|---|---|---|
| 10^2 | float64 | mean | 4.04 | 0.15 | 0.14 | 29.1x | 1.1x |
| 10^2 | float64 | median | 6.27 | 0.20 | 0.31 | 20.1x | 0.7x |
| 10^2 | float64 | sum | 1.69 | 0.15 | 0.14 | 12.0x | 1.0x |
| 10^2 | float64 | min | 1.51 | 0.09 | 0.13 | 11.5x | 0.7x |
| 10^2 | float64 | max | 1.49 | 0.09 | 0.13 | 11.7x | 0.7x |
| 10^2 | float64 | var | 8.98 | 0.29 | 0.17 | 52.0x | 1.7x |
| 10^4 | float64 | mean | 11.24 | 11.14 | 2.88 | 3.9x | 3.9x |
| 10^4 | float64 | median | 30.60 | 14.51 | 15.94 | 1.9x | 0.9x |
| 10^4 | float64 | sum | 6.81 | 12.53 | 2.99 | 2.3x | 4.2x |
| 10^4 | float64 | min | 2.11 | 5.05 | 1.33 | 1.6x | 3.8x |
| 10^4 | float64 | max | 2.21 | 5.17 | 1.39 | 1.6x | 3.7x |
| 10^4 | float64 | var | 26.82 | 24.73 | 3.52 | 7.6x | 7.0x |
| 10^7 | float64 | mean | 6316.96 | 11306.96 | 2848.58 | 2.2x | 4.0x |
| 10^7 | float64 | median | 56071.50 | 52099.04 | 17486.00 | 3.2x | 3.0x |
| 10^7 | float64 | sum | 4798.46 | 11374.21 | 2852.79 | 1.7x | 4.0x |
| 10^7 | float64 | min | 840.62 | 4499.88 | 368.04 | 2.3x | 12.2x |
| 10^7 | float64 | max | 856.88 | 4594.58 | 357.67 | 2.4x | 12.8x |
| 10^7 | float64 | var | 16004.79 | 24882.92 | 3500.87 | 4.6x | 7.1x |
| 10^2 | float32 | mean | 5.08 | 0.15 | 0.12 | 43.5x | 1.2x |
| 10^2 | float32 | median | 7.20 | 0.21 | 0.29 | 24.6x | 0.7x |
| 10^2 | float32 | sum | 1.70 | 0.15 | 0.12 | 14.3x | 1.2x |
| 10^2 | float32 | min | 1.51 | 0.08 | 0.10 | 14.8x | 0.8x |
| 10^2 | float32 | max | 1.47 | 0.08 | 0.11 | 14.0x | 0.7x |
| 10^2 | float32 | var | 9.93 | 0.28 | 0.14 | 68.9x | 2.0x |
| 10^4 | float32 | mean | 11.76 | 11.25 | 2.91 | 4.0x | 3.9x |
| 10^4 | float32 | median | 31.05 | 14.63 | 15.84 | 2.0x | 0.9x |
| 10^4 | float32 | sum | 5.52 | 11.36 | 2.86 | 1.9x | 4.0x |
| 10^4 | float32 | min | 1.73 | 4.57 | 1.19 | 1.5x | 3.8x |
| 10^4 | float32 | max | 1.68 | 4.47 | 1.19 | 1.4x | 3.8x |
| 10^4 | float32 | var | 24.93 | 24.82 | 2.89 | 8.6x | 8.6x |
| 10^7 | float32 | mean | 5466.42 | 12166.38 | 2999.17 | 1.8x | 4.1x |
| 10^7 | float32 | median | 68780.25 | 67508.33 | 14898.29 | 4.6x | 4.5x |
| 10^7 | float32 | sum | 3471.17 | 11381.50 | 2854.54 | 1.2x | 4.0x |
| 10^7 | float32 | min | 410.62 | 4545.87 | 220.29 | 1.9x | 20.6x |
| 10^7 | float32 | max | 449.17 | 4585.04 | 194.96 | 2.3x | 23.5x |
| 10^7 | float32 | var | 13546.38 | 24994.04 | 2873.79 | 4.7x | 8.7x |
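imc is faster than NumPy in every row and matches or beats Bottleneck everywhere except median at 10^2 and 10^4 and min and max at 10^2. So if imc is already loaded, there is less reason to import Bottleneck for faster NaN-aware reductions.
The fast variance calculation (1.7x to 8.7x faster than Bottleneck and 4.6x to 68.9x faster than NumPy in the table above) helps explain why imc is well suited to sigma-clip style rejection.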
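To show why variance speed matters for this kind of rejection, here is a generic sigma-clip sketch written with NumPy only. It is not the imcombiners rejection algorithm; the function name sigma_clip_1d and its parameters are hypothetical. The point is that every pass recomputes a NaN-aware mean and variance, so those two reductions dominate the cost.

```python
import numpy as np


def sigma_clip_1d(values, sigma=3.0, max_iters=5):
    """Iteratively mask values farther than sigma * std from the mean.

    Generic illustration only; not the imcombiners rejection algorithm.
    """
    work = np.asarray(values, dtype=np.float64).copy()
    for _ in range(max_iters):
        center = np.nanmean(work)
        spread = np.sqrt(np.nanvar(work))   # variance is recomputed on every pass
        if not np.isfinite(spread) or spread == 0.0:
            break
        reject = np.abs(work - center) > sigma * spread
        if not reject.any():
            break                           # converged: nothing left to reject
        work[reject] = np.nan               # rejected samples become NaN
    return work


data = np.array([10.0, 10.2, 9.9, 10.1, np.nan, 9.8, 10.0, 10.3, 9.7, 10.1, 50.0])
clipped = sigma_clip_1d(data, sigma=2.5)
print(clipped)
```

In this example the 50.0 sample is masked on the first pass and the second pass finds nothing further to reject, so the loop costs two mean and two variance evaluations for a single output vector.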