Log
-
Make serialization compatible with the java version 💬 by Caio 10 years ago
This patch adjusts how deltas are encoded so it behaves exactly the same as the reference implementation. A test is added to make sure this doesn't break in the future.
-
Extract estimateCapacity function 💬 by Caio 10 years ago
The ideal situation is that we _never_ have to grow the slice, 20 is pretty much a very high upper bound which should only happen for very low compression numbers or pathological insertions. I've used a very scientific approach to come up with the number 10 as the capacity estimator multiplier (read: I put a panic() when the slice had to grow and replayed latency numbers from random places. Started with 5 and went up until it stopped panic()ing then added one. A.K.A: not scientific at all. ymmv).
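As a rough illustration of the idea, here is a minimal sketch of a capacity estimator that multiplies the compression by 10. The signature, the `capacityMultiplier` constant and the `[]float64` slice are assumptions for the example and may not match the actual code.

```go
package main

import "fmt"

// capacityMultiplier is the empirically chosen factor mentioned in the
// commit message. The constant name is hypothetical.
const capacityMultiplier = 10

// estimateCapacity guesses how many centroids a digest with the given
// compression may hold, so the backing slice rarely has to grow.
// The signature is an assumption for illustration.
func estimateCapacity(compression float64) int {
	return int(compression) * capacityMultiplier
}

func main() {
	// Preallocate once instead of growing the slice on every insertion.
	centroids := make([]float64, 0, estimateCapacity(100))
	fmt.Println(len(centroids), cap(centroids)) // prints "0 1000"
}
```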
-
Stabilize the uniform distribution 💬 by Caio 10 years ago
I'm not really confident this is a good idea, but for now I'm really tired of the random failures.
-
Fix complaints from `go vet` 💬 by Caio 10 years ago
Sigh.
-
Exploit the slice for ceilingAndFloorItems and successorAndPredecessorItems 💬 by Caio 10 years ago
Now we're talking:
> go test -v ./... -run XXX -bench .
> PASS
> BenchmarkAdd1-4 3000000 431 ns/op
> BenchmarkAdd10-4 1000000 1860 ns/op
> BenchmarkAdd100-4 300000 8047 ns/op
> ok github.com/caio/go-tdigest 6.151s
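To show why a sorted slice makes ceiling/floor lookups cheap, here is a minimal sketch using a single binary search from the standard `sort` package. The function name `ceilingAndFloor` and the plain `[]float64` layout are assumptions for the example, not the library's actual internals.

```go
package main

import (
	"fmt"
	"sort"
)

// ceilingAndFloor returns the index of the smallest element >= x (ceiling)
// and the index of an element that is the largest value <= x (floor) in a
// sorted slice. An index of -1 means no such element exists.
func ceilingAndFloor(sorted []float64, x float64) (ceil, floor int) {
	i := sort.SearchFloat64s(sorted, x) // first index with sorted[i] >= x
	ceil, floor = i, i-1
	if i == len(sorted) {
		ceil = -1 // everything is smaller than x
	} else if sorted[i] == x {
		floor = i // exact match: ceiling and floor coincide
	}
	return ceil, floor
}

func main() {
	means := []float64{1, 3, 5}
	fmt.Println(ceilingAndFloor(means, 4)) // prints "2 1"
	fmt.Println(ceilingAndFloor(means, 3)) // prints "1 1"
	fmt.Println(ceilingAndFloor(means, 6)) // prints "-1 2"
}
```

Both neighbours come out of one O(log n) search, and the surrounding items are just adjacent indexes.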
-
Use a sorted slice instead of a tree for centroids 💬 by Caio 10 years ago
Got massive improvements already simply by glueing the sortedSlice struct without exploiting its advantages (ceil/floor and surroundings are cheaper operations, relatively speaking)
> $ go test -v ./... -run XXX -bench .
> PASS
> BenchmarkAdd1-4 3000000 511 ns/op
> BenchmarkAdd10-4 1000000 4304 ns/op
> BenchmarkAdd100-4 100000 22212 ns/op
> ok github.com/caio/go-tdigest 8.875s
To be done:
* Adapt the code to properly use the data structure
* Estimate properly the initial capacity
* Maybe get rid of the `interface{}` abstraction if I care enough (so instead of TDigest.summary.tree use TDigest.tree directly, for example). This is probably not worth the effort, performance-wise.
-
Adjust test naming/phrasing to the new api by Caio 10 years ago
-
Rename `Update()` to `Add()` 💬 by Caio 10 years ago
Trying to keep the interface closer to the reference implementation. Arguably it's a better name anyway since we're adding a new datapoint to the summary.
-
Use the result from Delete() to validate updateCentroid calls 💬 by Caio 10 years ago
Tiny optimization: skip one Find() call per update.
-
Brainfart: centroids can have value of zero by Caio 10 years ago
-
Make sure numCentroids value is sane 💬 by Damian Gryski 10 years ago
Fixes a crash if numCentroids was negative, and a potential out-of-memory condition if we try to allocate too huge a digest. And 4 million centroids ought to be enough for anybody... Found by fuzzing.
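A minimal sketch of this kind of guard, applied to a count read from untrusted serialized input before anything is allocated. The names `maxCentroids`, `errBadCentroidCount` and `checkNumCentroids`, and the exact limit of `1 << 22`, are assumptions for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// maxCentroids is a hypothetical cap of roughly 4 million centroids.
const maxCentroids = 1 << 22

// errBadCentroidCount is a hypothetical error value for this sketch.
var errBadCentroidCount = errors.New("tdigest: implausible centroid count")

// checkNumCentroids rejects negative or absurdly large counts so that a
// corrupt or malicious payload cannot trigger a panic or a huge allocation.
func checkNumCentroids(n int32) error {
	if n < 0 || n > maxCentroids {
		return errBadCentroidCount
	}
	return nil
}

func main() {
	for _, n := range []int32{100, -1, 1 << 30} {
		fmt.Println(n, checkNumCentroids(n))
	}
}
```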
-
Add a few more internals tests 💬 by Caio 10 years ago
Golfing test coverage is not really fun.
-
Add godoc.org flair to the README by Caio 10 years ago
-
Make the usage example copy and paste-able by Caio 10 years ago
-
Stop copying/creating centroids all the time 💬 by Caio 10 years ago
This patch makes all internal calls handle *centroid instead of the struct directly. With some crappy profiling in osx, a lot of time was spent allocating and moving memory around so this seemed like a very easy improvement to make.
> $ go test -v ./... -bench . -run XXX
> PASS
> BenchmarkUpdate1-4 1000000 1364 ns/op
> BenchmarkUpdate10-4 200000 11540 ns/op
> BenchmarkUpdate100-4 30000 58091 ns/op
> ok github.com/caio/go-tdigest 6.026s
This is roughly an unscientific 300% improvement in the compression=100 case (which is what should generally be used, according to the original paper) - Ref: c6047295a555a07c80a203c9b64e20a423cb6fed.
-
Prevent bad centroid insertion by Caio 10 years ago
-
Lint: Remove `= 0` from declarations 💬 by Caio 10 years ago
Originally removed on c29cee2ee58306cce780d53be2e0b7ac2666403f, and lost during the merge resolution of other pull requests.
-
Remove unneeded TestGoRoutineLeak by Caio 10 years ago
-
Use a compression of 100 in TestUniformDistribution 💬 by Caio 10 years ago
The thresholds used with assertDifferenceSmallerThan were tuned for a compression of 100, not 10.
-
Remove unused compareCentroids function 💬 by Caio 10 years ago
Coverage++ :x
-
Move everything over to a closure-based iteration model 💬 by Damian Gryski 10 years ago
Cherry-pick dgryski/no-goroutines and resolve the conflicts
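For readers unfamiliar with the pattern, here is a minimal sketch of closure-based iteration: the container walks its own items and calls a callback, and the callback returns false to stop early. The `centroid` and `summary` types and the `iterate` method are simplified stand-ins assumed for the example, not the library's actual definitions.

```go
package main

import "fmt"

// centroid and summary are simplified stand-ins, not the library's types.
type centroid struct {
	mean  float64
	count uint32
}

type summary struct {
	items []centroid
}

// iterate calls f for each centroid in order and stops as soon as f
// returns false, so no goroutine or channel is involved.
func (s *summary) iterate(f func(c centroid) bool) {
	for _, c := range s.items {
		if !f(c) {
			return
		}
	}
}

func main() {
	s := &summary{items: []centroid{{1.0, 2}, {2.5, 1}, {4.0, 3}}}
	var total uint32
	s.iterate(func(c centroid) bool {
		total += c.count
		return c.mean < 2.5 // stop after visiting the 2.5 centroid
	})
	fmt.Println(total) // prints "3"
}
```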
-
use varint routines from encoding/binary 💬 by Damian Gryski 10 years ago
Cherry pick of `dgryski/varint` fixing the merge conflicts
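A minimal sketch of what the standard varint routines look like in use: `binary.PutUvarint` and `binary.Uvarint` round-trip an unsigned integer. The helpers `encodeCount` and `decodeCount` are hypothetical names for the example and do not reflect the actual serialization layout.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// encodeCount writes count as an unsigned varint and returns only the
// bytes that were actually used. Hypothetical helper for illustration.
func encodeCount(count uint64) []byte {
	buf := make([]byte, binary.MaxVarintLen64)
	n := binary.PutUvarint(buf, count)
	return buf[:n]
}

// decodeCount reads an unsigned varint; ok is false when the input is
// empty, truncated, or overflows 64 bits.
func decodeCount(b []byte) (value uint64, ok bool) {
	v, n := binary.Uvarint(b)
	return v, n > 0
}

func main() {
	enc := encodeCount(300)
	v, ok := decodeCount(enc)
	fmt.Println(len(enc), v, ok) // prints "2 300 true"
}
```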
-
Merge pull request #2 from dgryski/lint-fixes 💬 by Caio 10 years ago
lint fixes
-
golint fixes by Damian Gryski 10 years ago
-
Tidy up the serialization "constants" by Caio 10 years ago
-
Unexpose the Summary struct 💬 by Caio 10 years ago
gorename <3
-
Unexpose the Centroid struct (rename it to centroid) by Caio 10 years ago
-
Add some docs 💬 by Caio 10 years ago
Because why not.
-
Get rid of `InvalidCentroid` by Caio 10 years ago
-
Add very simple `Update()` benchmarks 💬 by Caio 10 years ago
> $ go test -run XXX -bench .
> PASS
> BenchmarkUpdate1-4 200000 10278 ns/op
> BenchmarkUpdate10-4 50000 35964 ns/op
> BenchmarkUpdate100-4 10000 157862 ns/op
> ok github.com/caio/go-tdigest 5.791s