Source Edit

Binary Indexed Sum Tree (BIST); Fenwick proposed "BIT" but that A) collides w/many uses B) takes partial (S)ums as implied, but explicit is better (though products can work) and C) does not rhyme with "dist" (for distribution - what it is mostly about). While Inet has tutorials, to my knowledge no one (yet) collects all these algos in one place. Fenwick1994 itself messed up invCDF, correcting w/a tech report a year later. This code only allocates needed space & uses 0-based array indexing. See https://en.wikipedia.org/wiki/Fenwick_tree

The idea of a standard binary heap with kids(k)@[2k],[2k+1] for dynamic distributions goes back to Wong&Easton 1980 (or earlier?). Fenwick's clever index encoding/overlaid trees idea allows using 1/4 to 1/2 that space (only max index+1 array elements vs 2*lgCeil(n)), a constant factor improvement. Good explanations truly need figures, as in the original Fenwick paper | Wikipedia.

The Bist[T] type in this module is generic over the type of counters used for partial sums|counts. For few total items, you can use a Bist[uint8] while for many you want to use Bist[uint32]. This can be space-optimized up to 2X further with adix/sequint specialized to store an array of B-bit counters. Ranked B-trees are faster for >24..28-bit index spaces as L3 CPU caching fails, but needing >7..8 decimal dynamic ranges is also rare.

Imports

bitop

Types

Bist[T] = object tot*: T data*: seq[T]: A razor thin wrapper around seq[T] Source Edit

Procs

proc `$`[T](t: Bist[T]): string: Source Edit
proc `[]`[T](t: Bist[T]; i: int): T: Source Edit
proc `[]`[T](t: var Bist[T]; i: int): var T: Source Edit
proc `[]=`[T](t: var Bist[T]; i: int; x: T): Source Edit
proc cdf[T](t: Bist[T]; i: int): T: INCLUSIVE sum(pmf[0..i]), (rank,EDF,prefix sum,scan,..); Tm~1 bits in i Source Edit
proc clear[T](t: var Bist[T]): Source Edit
proc count[T](t: Bist[T]): T: Source Edit
proc dec[T](t: var Bist[T]; i: int; d: T): Adjust for count decrement by d; Tm ~ 1/2..3/4 lg n Source Edit
proc fromCnts[T](t: var Bist[T]): In-place bulk convert/reformat t[] from counts to BIST; Max time ~1*n. Source Edit
proc inc[T](t: var Bist[T]; i: int; d: T): Adjust for count increment by d; Tm ~ 1/2..3/4 lg n Source Edit
proc init[T](t: var Bist[T]; len: int): Source Edit
proc initBist[T](len: int): Bist[T]: Source Edit
proc invCDF[T](t: Bist[T]; s: T): (int, T): For 0 < s <= tot return (i0,s0) so sum(..<i0)=s0 < s and sum(..i0)>=s Source Edit
proc invCDF[T](t: Bist[T]; s: T; s0, s1: var T): int: For 0 < s <= tot, find i0,s0,s1 so s0 < s <= s1 and s0+pmf(i0)==s1. Source Edit
proc invCDF[T](t: Bist[T]; s: T; s0: var T): int: For 0 < s <= tot, bracket ECDF jump >= s. I.e. find i0, s0 so s0 = sum(..< i0) < s yet sum(..i0) >= s in lgCeil n array probes. Source Edit
proc len[T](t: Bist[T]): int: Source Edit
proc max[T](t: Bist[T]): int: Simple wrapper: invCDF(t,t.count). Source Edit
proc min[T](t: Bist[T]): int: Simple wrapper: invCDF(t, 1) Source Edit
proc nCDF[T](t: Bist[T]): seq[float32]: Return classic CDF from read-only BIST Source Edit
proc nPDF[T](t: Bist[T]): seq[float32]: Return classic PMF from read-only BIST Source Edit
proc pmf[T](t: Bist[T]; i: int): T: Probability Mass Function @i; Avg Tm ~ 2 probes; Max Tm ~ lg n Source Edit
proc quantile[T](t: Bist[T]; q: float): float: Parzen-interpolated quantile when no caller index mapping is needed Source Edit
proc quantile[T](t: Bist[T]; q: float; iL, iH: var int): float: Parzen-interpolated quantile; E.g., q=0.9 => 90th percentile. answer = result*iL + (1-result)*iH, but is left to caller to do { in case it is mapping larger numeric ranges to/from iL,iH }. Tm ~ 2*lg(addrSpace). Unlike other (broken!) quantile-interpolation methods, Parzen's connects midpoints of vertical CDF jumps, not horizontal. This makes more sense, corresponding to Wilcoxon 1945 & later tie mid-ranking recommendations. Source Edit
func space[T](t: Bist[T]): int: Source Edit
proc toCnts[T](t: var Bist[T]): In-place bulk convert/reformat t[] from BIST to counts; Max time ~1n *Unlike the others, this routine only works for power of 2-sized arrays. Source Edit
proc up[T](t: var Bist[T]): Simple no-op for BISTs Source Edit

adix/bist

Imports

Types

Procs