package quantile

Import Path
	github.com/beorn7/perks/quantile (on go.dev)

Dependency Relation
	imports 2 packages, and imported by one package

Involved Source Files
	Package quantile computes approximate quantiles over an unbounded data stream within low memory and CPU bounds.

	A small amount of accuracy is traded to achieve the above properties.

	Multiple streams can be merged before calling Query to generate a single set of results. This is meaningful when the streams represent the same type of data. See Merge and Samples.

	For more detailed information about the algorithm used, see:

	Effective Computation of Biased Quantiles over Data Streams
	http://www.cs.rutgers.edu/~muthu/bquant.pdf
Code Examples
	package main

	import (
		"fmt"
		"github.com/beorn7/perks/quantile"
	)

	func main() {
		// Scenario:
		// We have multiple database shards. On each shard, there is a process
		// collecting query response times from the database logs and inserting
		// them into a Stream (created via NewTargeted(0.90)), much like the
		// Simple example. These processes expose a network interface for us to
		// ask them to serialize and send us the results of their
		// Stream.Samples so we may Merge and Query them.
		//
		// NOTES:
		// * These sample sets are small, allowing us to get them
		// across the network much faster than sending the entire list of data
		// points.
		//
		// * For this to work correctly, we must supply the same quantiles
		// a priori that the process collecting the samples supplied to
		// NewTargeted, even if we do not plan to query them all here.
		ch := make(chan quantile.Samples)
		getDBQuerySamples(ch)
		q := quantile.NewTargeted(map[float64]float64{0.90: 0.001})
		for samples := range ch {
			q.Merge(samples)
		}
		fmt.Println("perc90:", q.Query(0.90))
	}

	// This is a stub for the above example. In reality this would hit the remote
	// servers via http or something like it.
	func getDBQuerySamples(ch chan quantile.Samples) {}

	package main

	import (
		"bufio"
		"fmt"
		"github.com/beorn7/perks/quantile"
		"log"
		"os"
		"strconv"
	)

	func main() {
		ch := make(chan float64)
		go sendFloats(ch)

		// Compute the 50th, 90th, and 99th percentile.
		q := quantile.NewTargeted(map[float64]float64{
			0.50: 0.005,
			0.90: 0.001,
			0.99: 0.0001,
		})
		for v := range ch {
			q.Insert(v)
		}

		fmt.Println("perc50:", q.Query(0.50))
		fmt.Println("perc90:", q.Query(0.90))
		fmt.Println("perc99:", q.Query(0.99))
		fmt.Println("count:", q.Count())
	}

	func sendFloats(ch chan<- float64) {
		f, err := os.Open("exampledata.txt")
		if err != nil {
			log.Fatal(err)
		}
		sc := bufio.NewScanner(f)
		for sc.Scan() {
			b := sc.Bytes()
			v, err := strconv.ParseFloat(string(b), 64)
			if err != nil {
				log.Fatal(err)
			}
			ch <- v
		}
		if sc.Err() != nil {
			log.Fatal(sc.Err())
		}
		close(ch)
	}

	package main

	import (
		"github.com/beorn7/perks/quantile"
		"time"
	)

	func main() {
		// Scenario: We want the 90th, 95th, and 99th percentiles for each
		// minute.

		ch := make(chan float64)
		go sendStreamValues(ch)

		tick := time.NewTicker(1 * time.Minute)
		q := quantile.NewTargeted(map[float64]float64{
			0.90: 0.001,
			0.95: 0.0005,
			0.99: 0.0001,
		})
		for {
			select {
			case t := <-tick.C:
				flushToDB(t, q.Samples())
				q.Reset()
			case v := <-ch:
				q.Insert(v)
			}
		}
	}

	func sendStreamValues(ch chan float64) {
	}

	func flushToDB(t time.Time, samples quantile.Samples) {
	}
Package-Level Type Names (total 6, in which 3 are exported)
Sample holds an observed value and meta information for compression. JSON tags have been added for convenience.
	Delta float64
	Value float64
	Width float64
Samples represents a slice of samples. It implements sort.Interface.
	(T) Len() int
	(T) Less(i, j int) bool
	(T) Swap(i, j int)
	T : sort.Interface
	func (*Stream).Samples() Samples
	func (*Stream).Merge(samples Samples)
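Because Samples implements sort.Interface, a value of that type can be handed directly to sort.Sort. The sketch below shows the pattern with local stand-in types so that it runs without the package: sample and samples are hypothetical mirrors of Sample and Samples, and ordering by Value is an assumption, since the listing above does not say what Less compares.

```go
package main

import (
	"fmt"
	"sort"
)

// sample is a local stand-in mirroring the fields listed for Sample.
type sample struct {
	Value float64
	Width float64
	Delta float64
}

// samples is a local stand-in for Samples.
type samples []sample

func (s samples) Len() int      { return len(s) }
func (s samples) Swap(i, j int) { s[i], s[j] = s[j], s[i] }

// Less orders by Value. This ordering is an assumption of the sketch.
func (s samples) Less(i, j int) bool { return s[i].Value < s[j].Value }

// sortedValues sorts a copy of in and returns its values in ascending order,
// leaving the caller's slice untouched.
func sortedValues(in samples) []float64 {
	cp := make(samples, len(in))
	copy(cp, in)
	sort.Sort(cp)
	out := make([]float64, len(cp))
	for i, s := range cp {
		out[i] = s.Value
	}
	return out
}

func main() {
	in := samples{{Value: 3, Width: 1}, {Value: 1, Width: 1}, {Value: 2, Width: 1}}
	fmt.Println(sortedValues(in)) // prints [1 2 3]
}
```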
Stream computes quantiles for a stream of float64s. It is not thread-safe by design. Take care when using across multiple goroutines.
	Count returns the total number of samples observed in the stream since initialization.
	Insert inserts v into the stream.
	Merge merges samples into the underlying stream's samples. This is handy when merging multiple streams from separate threads, database shards, etc. ATTENTION: This method is broken and does not yield correct results. The underlying algorithm is not capable of merging streams correctly.
	Query returns the computed value for quantile q. If s was created with NewTargeted, and q is not in the set of quantiles provided a priori, Query will return an unspecified result.
	Reset reinitializes and clears the list, reusing the samples buffer memory.
	Samples returns the stream samples held by s.
	func NewHighBiased(epsilon float64) *Stream
	func NewLowBiased(epsilon float64) *Stream
	func NewTargeted(targetMap map[float64]float64) *Stream
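Since Stream is not thread-safe, one common way to share it between goroutines is to guard every call with a mutex. The sketch below illustrates that pattern only. The names observer, lockedStream, counter, and insertConcurrently are hypothetical, and counter is a toy stand-in so the example runs without the package. It assumes that Insert takes a float64 and Count returns an int, as the usage in the examples suggests.

```go
package main

import (
	"fmt"
	"sync"
)

// observer is a hypothetical interface naming the two methods this sketch
// needs. The signatures are assumed from the usage shown in the examples.
type observer interface {
	Insert(v float64)
	Count() int
}

// lockedStream serializes all access to a non-thread-safe observer.
type lockedStream struct {
	mu sync.Mutex
	s  observer
}

func (l *lockedStream) Insert(v float64) {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.s.Insert(v)
}

func (l *lockedStream) Count() int {
	l.mu.Lock()
	defer l.mu.Unlock()
	return l.s.Count()
}

// counter is a toy observer standing in for a real stream. Like the real
// type, it is unsafe for concurrent use on its own.
type counter struct{ n int }

func (c *counter) Insert(float64) { c.n++ }
func (c *counter) Count() int     { return c.n }

// insertConcurrently starts the given number of goroutines, each inserting
// perWorker values through the lock, and waits for all of them to finish.
func insertConcurrently(l *lockedStream, workers, perWorker int) {
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < perWorker; i++ {
				l.Insert(float64(i))
			}
		}()
	}
	wg.Wait()
}

func main() {
	l := &lockedStream{s: &counter{}}
	insertConcurrently(l, 8, 1000)
	fmt.Println("count:", l.Count())
}
```

An alternative design is to give one goroutine sole ownership of the stream and feed it values over a channel, as the examples in this package do.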
Package-Level Functions (total 5, in which 3 are exported)
NewHighBiased returns an initialized Stream for high-biased quantiles (e.g. 0.5, 0.9, 0.99) where the needed quantiles are not known a priori, but error guarantees can still be given even for the higher ranks of the data distribution. The provided epsilon is a relative error, i.e. the true quantile of a value returned by a query is guaranteed to be within 1-(1±Epsilon)*(1-Quantile). See http://www.cs.rutgers.edu/~muthu/bquant.pdf for time, space, and error properties.
NewLowBiased returns an initialized Stream for low-biased quantiles (e.g. 0.01, 0.1, 0.5) where the needed quantiles are not known a priori, but error guarantees can still be given even for the lower ranks of the data distribution. The provided epsilon is a relative error, i.e. the true quantile of a value returned by a query is guaranteed to be within (1±Epsilon)*Quantile. See http://www.cs.rutgers.edu/~muthu/bquant.pdf for time, space, and error properties.
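The two relative-error formulas quoted for NewLowBiased and NewHighBiased can be evaluated directly to see where each constructor is most precise. The sketch below is plain arithmetic on those formulas and does not call the package. The function names and the epsilon value of 0.01 are hypothetical choices for illustration.

```go
package main

import "fmt"

// lowBiasedBounds evaluates (1±epsilon)*q, the interval quoted for
// NewLowBiased. The interval narrows as q approaches 0.
func lowBiasedBounds(q, epsilon float64) (lo, hi float64) {
	return (1 - epsilon) * q, (1 + epsilon) * q
}

// highBiasedBounds evaluates 1-(1±epsilon)*(1-q), the interval quoted for
// NewHighBiased. The interval narrows as q approaches 1.
func highBiasedBounds(q, epsilon float64) (lo, hi float64) {
	return 1 - (1+epsilon)*(1-q), 1 - (1-epsilon)*(1-q)
}

func main() {
	const epsilon = 0.01 // hypothetical relative error
	for _, q := range []float64{0.01, 0.5, 0.99} {
		ll, lh := lowBiasedBounds(q, epsilon)
		hl, hh := highBiasedBounds(q, epsilon)
		fmt.Printf("q=%.2f low-biased=[%.4f, %.4f] high-biased=[%.4f, %.4f]\n",
			q, ll, lh, hl, hh)
	}
}
```

At q = 0.5 the two intervals coincide. Below that the low-biased interval is tighter, and above it the high-biased one is.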
NewTargeted returns an initialized Stream concerned with a particular set of quantile values that are supplied a priori. Knowing these a priori reduces space and computation time. The targetMap argument maps the desired quantiles to their absolute errors, i.e. the true quantile of a value returned by a query is guaranteed to be within (Quantile±Epsilon). See http://www.cs.rutgers.edu/~muthu/bquant.pdf for time, space, and error properties.