package quantile
Import Path
github.com/beorn7/perks/quantile (on go.dev)
Dependency Relation
imports 2 packages, and imported by one package
Involved Source Files
Package quantile computes approximate quantiles over an unbounded data
stream within low memory and CPU bounds.
A small amount of accuracy is traded to achieve the above properties.
Multiple streams can be merged before calling Query to generate a single set
of results. This is meaningful when the streams represent the same type of
data. See Merge and Samples.
For more detailed information about the algorithm used, see:
Effective Computation of Biased Quantiles over Data Streams
http://www.cs.rutgers.edu/~muthu/bquant.pdf
Code Examples
package main

import (
	"fmt"

	"github.com/beorn7/perks/quantile"
)

func main() {
	// Scenario:
	// We have multiple database shards. On each shard, there is a process
	// collecting query response times from the database logs and inserting
	// them into a Stream (created via NewTargeted with a 0.90 target), much
	// like the Simple example. These processes expose a network interface
	// for us to ask them to serialize and send us the results of their
	// Stream.Samples so we may Merge and Query them.
	//
	// NOTES:
	// * These sample sets are small, allowing us to get them
	// across the network much faster than sending the entire list of data
	// points.
	//
	// * For this to work correctly, we must pass NewTargeted the same
	// quantiles that the process collecting the samples passed to its own
	// NewTargeted call, even if we do not plan to query them all here.
	ch := make(chan quantile.Samples)
	// Fetch in a separate goroutine: ch is unbuffered, so a synchronous
	// sender would block before the receive loop below starts.
	go getDBQuerySamples(ch)
	q := quantile.NewTargeted(map[float64]float64{0.90: 0.001})
	for samples := range ch {
		q.Merge(samples)
	}
	fmt.Println("perc90:", q.Query(0.90))
}

// This is a stub for the above example. In reality this would hit the remote
// servers via http or something like it, send each result on ch, and close ch
// when done so the receive loop in main terminates.
func getDBQuerySamples(ch chan<- quantile.Samples) {
	close(ch)
}
package main

import (
	"bufio"
	"fmt"
	"log"
	"os"
	"strconv"

	"github.com/beorn7/perks/quantile"
)

func main() {
	ch := make(chan float64)
	go sendFloats(ch)

	// Compute the 50th, 90th, and 99th percentile.
	q := quantile.NewTargeted(map[float64]float64{
		0.50: 0.005,
		0.90: 0.001,
		0.99: 0.0001,
	})
	for v := range ch {
		q.Insert(v)
	}

	fmt.Println("perc50:", q.Query(0.50))
	fmt.Println("perc90:", q.Query(0.90))
	fmt.Println("perc99:", q.Query(0.99))
	fmt.Println("count:", q.Count())
}

// sendFloats reads one float per line from exampledata.txt, sends each value
// on ch, and closes ch once the file is exhausted.
func sendFloats(ch chan<- float64) {
	f, err := os.Open("exampledata.txt")
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	sc := bufio.NewScanner(f)
	for sc.Scan() {
		v, err := strconv.ParseFloat(sc.Text(), 64)
		if err != nil {
			log.Fatal(err)
		}
		ch <- v
	}
	if err := sc.Err(); err != nil {
		log.Fatal(err)
	}
	close(ch)
}
package main

import (
	"time"

	"github.com/beorn7/perks/quantile"
)

func main() {
	// Scenario: We want the 90th, 95th, and 99th percentiles for each
	// minute.
	ch := make(chan float64)
	go sendStreamValues(ch)

	tick := time.NewTicker(1 * time.Minute)
	q := quantile.NewTargeted(map[float64]float64{
		0.90: 0.001,
		0.95: 0.0005,
		0.99: 0.0001,
	})
	for {
		select {
		case t := <-tick.C:
			flushToDB(t, q.Samples())
			q.Reset()
		case v := <-ch:
			q.Insert(v)
		}
	}
}

func sendStreamValues(ch chan float64) {
}

func flushToDB(t time.Time, samples quantile.Samples) {
}
Package-Level Type Names (total 6, in which 3 are exported)
Sample holds an observed value and meta information for compression. JSON
tags have been added for convenience.
Delta float64
Value float64
Width float64
func (*Stream).insert(sample Sample)
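Since Sample is described as carrying JSON tags, a set of samples can be serialized for transport with encoding/json. The following is a minimal sketch of such a round trip. It uses a local stand-in struct with the three fields listed above rather than the real quantile.Sample, and it relies on default field names because the actual tag values are not shown on this page.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Sample is a local stand-in mirroring the fields listed for quantile.Sample.
// The real type carries JSON tags whose exact values are not shown here, so
// this sketch uses the default encoding of the field names.
type Sample struct {
	Value float64
	Width float64
	Delta float64
}

// roundTrip encodes the samples to JSON and decodes them again, as a remote
// collector and a central aggregator might do on either side of a network hop.
func roundTrip(in []Sample) ([]Sample, error) {
	b, err := json.Marshal(in)
	if err != nil {
		return nil, err
	}
	var out []Sample
	if err := json.Unmarshal(b, &out); err != nil {
		return nil, err
	}
	return out, nil
}

func main() {
	out, err := roundTrip([]Sample{{Value: 12.5, Width: 1, Delta: 0}})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", out)
}
```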
Samples represents a slice of samples. It implements sort.Interface.
( T) Len() int
( T) Less(i, j int) bool
( T) Swap(i, j int)
T : sort.Interface
T : github.com/aws/aws-sdk-go/aws/corehandlers.lener
func (*Stream).Samples() Samples
func (*Stream).Merge(samples Samples)
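Because Samples implements sort.Interface, a value of that type can be handed directly to sort.Sort. The sketch below illustrates the pattern with local stand-in types so that it runs without the package. Ordering by Value is an assumption made for the illustration; this page does not state which field Less compares.

```go
package main

import (
	"fmt"
	"sort"
)

// Sample and Samples are local stand-ins for the quantile types of the same
// name, reduced to what the sort.Interface pattern needs.
type Sample struct {
	Value float64
	Width float64
	Delta float64
}

type Samples []Sample

func (a Samples) Len() int      { return len(a) }
func (a Samples) Swap(i, j int) { a[i], a[j] = a[j], a[i] }

// Less orders by Value (an assumption for this sketch).
func (a Samples) Less(i, j int) bool { return a[i].Value < a[j].Value }

// sortedValues sorts a copy of in and returns the values in ascending order,
// leaving the caller's slice untouched.
func sortedValues(in Samples) []float64 {
	s := append(Samples(nil), in...)
	sort.Sort(s)
	out := make([]float64, len(s))
	for i, x := range s {
		out[i] = x.Value
	}
	return out
}

func main() {
	fmt.Println(sortedValues(Samples{{Value: 3}, {Value: 1}, {Value: 2}}))
}
```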
Stream computes quantiles for a stream of float64s. It is not thread-safe by
design. Take care when using across multiple goroutines.
b Samples
sorted bool
stream *stream
stream.l []Sample
stream.n float64
stream.ƒ invariant
Count returns the total number of samples observed in the stream
since initialization.
Insert inserts v into the stream.
Merge merges samples into the underlying stream's samples. This is handy when
merging multiple streams from separate threads, database shards, etc.
ATTENTION: This method is broken and does not yield correct results. The
underlying algorithm is not capable of merging streams correctly.
Query returns the computed value of the qth quantile. If s was created with
NewTargeted, and q is not in the set of quantiles provided a priori, Query
will return an unspecified result.
Reset reinitializes and clears the list, reusing the samples buffer memory.
Samples returns stream samples held by s.
( T) compress()
( T) count() int
(*T) flush()
(*T) flushed() bool
(*T) insert(sample Sample)
(*T) maybeSort()
( T) merge(samples Samples)
( T) query(q float64) float64
( T) reset()
( T) samples() Samples
func NewHighBiased(epsilon float64) *Stream
func NewLowBiased(epsilon float64) *Stream
func NewTargeted(targetMap map[float64]float64) *Stream
func newStream(ƒ invariant) *Stream
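Since Stream is documented as not thread-safe, one common way to share it across goroutines is to serialize every call behind a sync.Mutex. The sketch below shows that pattern. The estimator interface, the lockedStream wrapper, and the counter type are all hypothetical names introduced for this illustration; counter stands in for a *Stream so the example runs without the package. A *quantile.Stream should satisfy the interface, given the Insert and Count methods described above.

```go
package main

import (
	"fmt"
	"sync"
)

// estimator is a hypothetical interface naming the two methods this sketch
// needs from a stream.
type estimator interface {
	Insert(v float64)
	Count() int
}

// lockedStream (hypothetical) guards an estimator with a mutex so that
// several goroutines can use it safely.
type lockedStream struct {
	mu sync.Mutex
	s  estimator
}

func (l *lockedStream) Insert(v float64) {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.s.Insert(v)
}

func (l *lockedStream) Count() int {
	l.mu.Lock()
	defer l.mu.Unlock()
	return l.s.Count()
}

// counter is a trivial, deliberately unsynchronized stand-in for a stream.
type counter struct{ n int }

func (c *counter) Insert(float64) { c.n++ }
func (c *counter) Count() int     { return c.n }

// insertConcurrently starts the given number of workers, each inserting
// perWorker values through the locked wrapper, and waits for all of them.
func insertConcurrently(l *lockedStream, workers, perWorker int) {
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < perWorker; i++ {
				l.Insert(float64(i))
			}
		}()
	}
	wg.Wait()
}

func main() {
	l := &lockedStream{s: &counter{}}
	insertConcurrently(l, 8, 1000)
	fmt.Println("count:", l.Count())
}
```

An alternative design is to give each goroutine its own stream and funnel raw values to a single owner over a channel, as the Simple example does; that avoids the lock entirely.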
Package-Level Functions (total 5, in which 3 are exported)
NewHighBiased returns an initialized Stream for high-biased quantiles
(e.g. 0.5, 0.9, 0.99) where the needed quantiles are not known a priori, but
error guarantees can still be given even for the higher ranks of the data
distribution.
The provided epsilon is a relative error, i.e. the true quantile of a value
returned by a query is guaranteed to be within 1-(1±Epsilon)*(1-Quantile).
See http://www.cs.rutgers.edu/~muthu/bquant.pdf for time, space, and error
properties.
NewLowBiased returns an initialized Stream for low-biased quantiles
(e.g. 0.01, 0.1, 0.5) where the needed quantiles are not known a priori, but
error guarantees can still be given even for the lower ranks of the data
distribution.
The provided epsilon is a relative error, i.e. the true quantile of a value
returned by a query is guaranteed to be within (1±Epsilon)*Quantile.
See http://www.cs.rutgers.edu/~muthu/bquant.pdf for time, space, and error
properties.
NewTargeted returns an initialized Stream concerned with a particular set of
quantile values that are supplied a priori. Knowing these a priori reduces
space and computation time. The targetMap argument maps each desired quantile
to its absolute error, i.e. the true quantile of a value returned by a query
is guaranteed to be within (Quantile±Epsilon).
See http://www.cs.rutgers.edu/~muthu/bquant.pdf for time, space, and error properties.
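The three error guarantees quoted above can be compared by evaluating them for concrete inputs. The sketch below is plain arithmetic on the stated formulas and does not call the package; the function names are hypothetical.

```go
package main

import "fmt"

// targetedBounds evaluates Quantile±Epsilon, the absolute error stated for
// NewTargeted.
func targetedBounds(q, eps float64) (lo, hi float64) {
	return q - eps, q + eps
}

// lowBiasedBounds evaluates (1±Epsilon)*Quantile, the relative error stated
// for NewLowBiased.
func lowBiasedBounds(q, eps float64) (lo, hi float64) {
	return (1 - eps) * q, (1 + eps) * q
}

// highBiasedBounds evaluates 1-(1±Epsilon)*(1-Quantile), the relative error
// stated for NewHighBiased.
func highBiasedBounds(q, eps float64) (lo, hi float64) {
	return 1 - (1+eps)*(1-q), 1 - (1-eps)*(1-q)
}

func main() {
	lo, hi := targetedBounds(0.90, 0.001)
	fmt.Printf("targeted    q=0.90 eps=0.001: [%.4f, %.4f]\n", lo, hi)

	lo, hi = lowBiasedBounds(0.01, 0.01)
	fmt.Printf("low-biased  q=0.01 eps=0.01:  [%.4f, %.4f]\n", lo, hi)

	lo, hi = highBiasedBounds(0.99, 0.01)
	fmt.Printf("high-biased q=0.99 eps=0.01:  [%.4f, %.4f]\n", lo, hi)
}
```

With the same epsilon, the relative-error intervals shrink as the quantile approaches the biased end of the distribution, which is why the biased constructors remain useful when the quantiles of interest are not known in advance.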
The pages are generated with Golds v0.3.2-preview. (GOOS=darwin GOARCH=amd64)