The min-count sketch (NOT count-min) treats hash(x) as a U(0,1) draw and uses P(sampleMax < x) = x^n for a sample of n distinct values. The low-art version inverts a confidence interval for the maximum hash to estimate n. Tracking the k most extreme distinct hash values gives better accuracy and is usually called a KMV sketch. (Intuition: the k-th value from the edge gives the average gap between k-1 uniques, and averaging cuts noise.) See Bar-Yossef 2002 "Counting Distinct..", Giroire 2005 "Order statistics & estimating cardinalities" & Ting 2014 "Streamed approximate counting..".
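The relation P(sampleMax < x) = x^n can be worked through as follows. This is a sketch under textbook assumptions (hashes i.i.d. uniform on (0,1), n distinct keys), not necessarily the exact estimator this module computes. The symbols \alpha, u_{(k)} and \hat n are introduced here for illustration, and the KMV line assumes the tail holds the k largest distinct hashes.

```latex
% Max of n i.i.d. U(0,1) hashes, observed at x = h_max:
P(\max < x) = x^n
% Invert for a (1-\alpha) confidence interval on n (note \ln x < 0):
\frac{\ln(1-\alpha/2)}{\ln x} \;\le\; n \;\le\; \frac{\ln(\alpha/2)}{\ln x}
% KMV: u_{(k)} is the k-th largest distinct hash, so 1-u_{(k)} spans k gaps:
\hat n = \frac{k-1}{1-u_{(k)}}, \qquad
\frac{\mathrm{sd}(\hat n)}{n} = \sqrt{\frac{n-k+1}{n\,(k-2)}} \;\le\; \frac{1}{\sqrt{k-2}}
```

With k=1024 the bound is 1/\sqrt{1022}, roughly 3.1% relative error. If the sketch instead keeps the k smallest hashes, the same formulas apply with 1-u_{(k)} replaced by the k-th smallest value.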
Procs
proc initUniqCe[F: SomeFloat](k = 1024): UniqCe[F]
- Return an initialized UniqCe with tail size k. k=1024 costs 4K, i.e. 1 VM page.
proc nUniqueErr[F: SomeFloat](uc: UniqCe[F]): float32
- Estimated error of the estimate of unique elements seen so far.